Re: only one ProducerSendThread thread when running with multiple brokers (kafka 0.8)
The network is 10gbit so it seems unlikely. The 5 brokers were running
without much load or problems. The bottleneck is that no matter how many
threads I use for sending, the sync block in the send method will never go
faster and it's always limited to a single thread.

I also compress with snappy outside the producer and use no compression in
the producer itself. A single producer gives me a max of 6-10k tps; with 10
producers I can get a max of 60k tps. This is on my servers and with my
payload.
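
As a rough illustration of compressing "outside" the producer, the sketch below compresses payloads on a separate worker pool before they would be handed to a producer. It is an assumption-laden example, not code from this thread: the class and method names (PreCompressSketch, compressAll) are hypothetical, and GZIP from the JDK stands in for snappy only because snappy is not part of the standard library. The producer would then be configured with no compression and sent the already-compressed bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class PreCompressSketch {

    /** Compresses one payload. GZIP is a stand-in for snappy (assumption). */
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        }
        return out.toByteArray();
    }

    /** Reverses compress(); what a consumer would do on its side. */
    static byte[] decompress(byte[] packed) throws IOException {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    /** Compresses every payload on the worker pool and returns results in input order. */
    static List<byte[]> compressAll(List<byte[]> payloads, ExecutorService workers)
            throws InterruptedException, ExecutionException {
        List<Future<byte[]>> pending = new ArrayList<>();
        for (byte[] p : payloads) {
            Callable<byte[]> task = () -> compress(p);
            pending.add(workers.submit(task));
        }
        List<byte[]> result = new ArrayList<>();
        for (Future<byte[]> f : pending) {
            result.add(f.get()); // blocks until that payload is compressed
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        List<byte[]> payloads = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            payloads.add(("message body " + i).getBytes(StandardCharsets.UTF_8));
        }
        ExecutorService workers = Executors.newFixedThreadPool(4);
        try {
            List<byte[]> compressed = compressAll(payloads, workers);
            // A real application would pass each element to its producer here.
            System.out.println("compressed " + compressed.size() + " payloads");
        } finally {
            workers.shutdown();
        }
    }
}
```

Doing the compression work on its own threads keeps it out of the producer's send path, so the single send thread only moves bytes.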

My end conclusion was that it's impossible to scale a single producer
instance, and more threads make no difference on the sending side.
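
The workaround implied above (several producer instances instead of one) can be sketched as a small round-robin pool. This is only an illustration under stated assumptions: the Sender interface, ProducerPool and CountingSender are hypothetical names, not part of the Kafka API, and CountingSender merely imitates a producer whose send method is synchronized. In a real application each Sender would wrap its own producer instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class ProducerPoolSketch {

    /** Hypothetical stand-in for a real producer; not a Kafka type. */
    interface Sender {
        void send(String topic, byte[] payload);
    }

    /** Spreads sends across several independent senders, round-robin. */
    static final class ProducerPool implements Sender {
        private final List<Sender> senders;
        private final AtomicInteger next = new AtomicInteger();

        ProducerPool(List<Sender> senders) {
            if (senders.isEmpty()) {
                throw new IllegalArgumentException("need at least one sender");
            }
            this.senders = new ArrayList<>(senders);
        }

        @Override
        public void send(String topic, byte[] payload) {
            // floorMod keeps the index non-negative once the counter overflows.
            int i = Math.floorMod(next.getAndIncrement(), senders.size());
            senders.get(i).send(topic, payload);
        }
    }

    /** Fake sender with a synchronized send, imitating the lock discussed in the thread. */
    static final class CountingSender implements Sender {
        final AtomicLong sent = new AtomicLong();

        @Override
        public synchronized void send(String topic, byte[] payload) {
            sent.incrementAndGet();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<CountingSender> fakes = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            fakes.add(new CountingSender());
        }
        ProducerPool pool = new ProducerPool(new ArrayList<Sender>(fakes));

        // Four application threads share the pool; each call locks only one sender.
        ExecutorService workers = Executors.newFixedThreadPool(4);
        for (int t = 0; t < 4; t++) {
            workers.submit(() -> {
                for (int n = 0; n < 1000; n++) {
                    pool.send("events", new byte[] {1});
                }
            });
        }
        workers.shutdown();
        workers.awaitTermination(10, TimeUnit.SECONDS);

        long total = 0;
        for (CountingSender f : fakes) {
            total += f.sent.get();
        }
        System.out.println("total sent: " + total);
    }
}
```

Because each sender has its own lock, concurrent callers contend only when they land on the same instance, which is the effect of running 10 producers side by side. Note that round-robin across instances gives up any ordering a single producer would have provided.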
On 1 Jan 2014 17:31, "Chris Hogue" <[EMAIL PROTECTED]> wrote:

> Have you found what the actual bottleneck is? Is it the network send? Of
> course this would be highly influenced by the brokers' performance. After
> removing all compression work from the brokers we were able to get enough
> throughput from them that it's not really a concern.
>
> Another rough side-effect of the single synchronous send thread is that a
> single degrading or otherwise slow broker can back up the producing for the
> whole app. I haven't heard a great solution to this but would love to if
> someone's come up with it.
>
> -Chris
>
>
>
> On Wed, Jan 1, 2014 at 9:10 AM, Gerrit Jansen van Vuuren <
> [EMAIL PROTECTED]> wrote:
>
> > I've seen this bottleneck regardless of using compression or not; both
> > situations give me poor performance on sending to Kafka via the Scala
> > producer API.
> > On 1 Jan 2014 16:42, "Chris Hogue" <[EMAIL PROTECTED]> wrote:
> >
> > > Hi.
> > >
> > > When writing that blog we were using Kafka 0.7 as well. Understanding
> > > that it probably wasn't the primary design goal, the separate send
> > > threads per broker that offered a separation of compression were a
> > > convenient side-effect of that design.
> > >
> > > We've since built new systems on 0.8 that have concentrated high
> > > throughput on a small number of producers and had this discovery early
> > > on as well.
> > >
> > > Instead we've taken responsibility for the compression before the
> > > producer and done that on separate threads as appropriate. While
> > > helpful for compression on the producer application the main reason
> > > for this is to prevent the broker from uncompressing and
> > > re-compressing each message as it assigns offsets. There's a
> > > significant throughput advantage in doing this.
> > >
> > > Truthfully since switching to snappy the compression throughput on the
> > > producer is much less of a concern in the overall context of the
> > > application.
> > >
> > > There was some discussion of these issues in the 'Client Improvement
> > > Discussion' thread a while ago where Jay provided some insight and
> > > discussion on future directions.
> > >
> > > -Chris
> > >
> > >
> > >
> > >
> > > On Wed, Jan 1, 2014 at 5:42 AM, yosi botzer <[EMAIL PROTECTED]> wrote:
> > >
> > > > This is very interesting, this is what I see as well. I wish someone
> > > > could explain why it is not as explained here:
> > > > http://engineering.gnip.com/kafka-async-producer/
> > > >
> > > >
> > > > On Wed, Jan 1, 2014 at 2:39 PM, Gerrit Jansen van Vuuren <
> > > > [EMAIL PROTECTED]> wrote:
> > > >
> > > > > I don't know the code enough to comment on that (maybe someone else
> > > > > on the user list can do that), but from what I've seen doing some
> > > > > heavy profiling I only see one thread per producer instance. It
> > > > > doesn't matter how many brokers or topics you have, the number of
> > > > > threads is always 1 per producer. If you create 2 producers you get
> > > > > 2 threads, and so on.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Jan 1, 2014 at 1:27 PM, yosi botzer <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > But shouldn't I see a separate thread per broker (I am using the
> > > > > > async mode)?  Why do I get a better performance sending a message
> > > > > > that has fewer

 