Kafka, mail # user - only one ProducerSendThread thread when running with multiple brokers (kafka 0.8)


Re: only one ProducerSendThread thread when running with multiple brokers (kafka 0.8)
Gerrit Jansen van Vuuren 2014-01-01, 17:24
The network is 10gbit so it seems unlikely. The 5 brokers were running
without much load or problems. The bottleneck is that no matter how many
threads I use for sending, the sync block in the send method never goes any
faster, and it's always limited to a single thread.

I also compress with snappy outside the producer and use no compression in
the producer itself. A single producer gives me at most 6-10k tps; with 10
producers I can get at most 60k tps. This is on my servers and with my
payload.
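
A minimal sketch of compressing payloads on separate threads before they reach the producer, as described in this thread. It is an illustration under assumptions, not code from anyone's application: the class name `PreCompressor` is hypothetical, and GZIP from the JDK stands in for snappy, which would need a third-party library. The producer itself would be configured with no compression and handed the already-compressed bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper: compresses payloads on a worker pool, outside the producer. */
final class PreCompressor implements AutoCloseable {
    private final ExecutorService workers;

    PreCompressor(int threads) {
        this.workers = Executors.newFixedThreadPool(threads);
    }

    /** Compresses one payload. GZIP is an assumption standing in for snappy. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed at this point, so the GZIP trailer has been written.
        return buffer.toByteArray();
    }

    /** Schedules compression on a worker thread; the caller sends the result later. */
    Future<byte[]> compress(byte[] raw) {
        return workers.submit((Callable<byte[]>) () -> gzip(raw));
    }

    @Override
    public void close() {
        workers.shutdown();
    }
}
```

The bytes returned by `compress(...).get()` are what would be passed to the producer's send call, so the single send thread does no compression work.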

My final conclusion was that it's impossible to scale a single producer
instance, and more threads make no difference on the sending side.
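
The workaround implied above (several producer instances instead of more threads on one instance) can be sketched as a small round-robin pool. This is only an illustration under assumptions: `ProducerPool` and `Sender` are hypothetical names, and `Sender` stands in for whatever wraps the real Kafka producer's send call, so no Kafka API is shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Hypothetical pool that spreads sends over several independent producer instances. */
final class ProducerPool<M> {
    /** Stand-in for a producer; an implementation would delegate to the real client. */
    interface Sender<M> {
        void send(M message);
    }

    private final List<Sender<M>> senders = new ArrayList<>();
    private final AtomicLong counter = new AtomicLong();

    ProducerPool(int size, Supplier<Sender<M>> factory) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        // Each instance brings its own send thread and its own lock.
        for (int i = 0; i < size; i++) {
            senders.add(factory.get());
        }
    }

    /** Picks the next instance in round-robin order; safe to call from many threads. */
    void send(M message) {
        int index = (int) Math.floorMod(counter.getAndIncrement(), (long) senders.size());
        senders.get(index).send(message);
    }

    int size() {
        return senders.size();
    }
}
```

With a pool of 10, calling threads contend on 10 separate senders rather than one, which matches the roughly linear 6k to 60k tps scaling reported above. Ordering across messages is not preserved between instances.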
On 1 Jan 2014 17:31, "Chris Hogue" <[EMAIL PROTECTED]> wrote:

> Have you found what the actual bottleneck is? Is it the network send? Of
> course this would be highly influenced by the brokers' performance. After
> removing all compression work from the brokers we were able to get enough
> throughput from them that it's not really a concern.
>
> Another rough side-effect of the single synchronous send thread is that a
> single degrading or otherwise slow broker can back up the producing for the
> whole app. I haven't heard a great solution to this but would love to if
> someone's come up with it.
>
> -Chris
>
>
>
> On Wed, Jan 1, 2014 at 9:10 AM, Gerrit Jansen van Vuuren <
> [EMAIL PROTECTED]> wrote:
>
> > I've seen this bottleneck regardless of whether compression is used. Both
> > situations give me poor performance when sending to Kafka via the Scala
> > producer API.
> > On 1 Jan 2014 16:42, "Chris Hogue" <[EMAIL PROTECTED]> wrote:
> >
> > > Hi.
> > >
> > > > When writing that blog we were using Kafka 0.7 as well. Understanding
> > > > that it probably wasn't the primary design goal, the separate send
> > > > threads per broker that offered a separation of compression were a
> > > > convenient side-effect of that design.
> > >
> > > > We've since built new systems on 0.8 that have concentrated high
> > > > throughput on a small number of producers and had this discovery early
> > > > on as well.
> > >
> > > > Instead we've taken responsibility for the compression before the
> > > > producer and done that on separate threads as appropriate. While
> > > > helpful for compression on the producer application, the main reason
> > > > for this is to prevent the broker from uncompressing and re-compressing
> > > > each message as it assigns offsets. There's a significant throughput
> > > > advantage in doing this.
> > >
> > > Truthfully since switching to snappy the compression throughput on the
> > > producer is much less of a concern in the overall context of the
> > > application.
> > >
> > > There was some discussion of these issues in the 'Client Improvement
> > > Discussion' thread a while ago where Jay provided some insight and
> > > discussion on future directions.
> > >
> > > -Chris
> > >
> > >
> > >
> > >
> > > > On Wed, Jan 1, 2014 at 5:42 AM, yosi botzer <[EMAIL PROTECTED]> wrote:
> > >
> > > > > This is very interesting, this is what I see as well. I wish someone
> > > > > could explain why it is not as explained here:
> > > > http://engineering.gnip.com/kafka-async-producer/
> > > >
> > > >
> > > > On Wed, Jan 1, 2014 at 2:39 PM, Gerrit Jansen van Vuuren <
> > > > [EMAIL PROTECTED]> wrote:
> > > >
> > > > > > I don't know the code well enough to comment on that (maybe someone
> > > > > > else on the user list can), but from what I've seen doing some
> > > > > > heavy profiling, there is only one thread per producer instance. It
> > > > > > doesn't matter how many brokers or topics you have; the number of
> > > > > > threads is always 1 per producer. If you create 2 producers you get
> > > > > > 2 threads, and so on.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > On Wed, Jan 1, 2014 at 1:27 PM, yosi botzer <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > > But shouldn't I see a separate thread per broker (I am using the
> > > > > > > async mode)? Why do I get a better performance sending a message
> > > > > > > that has fewer