Kafka >> mail # dev >> Re: kafka 0.8 producer throughput


Jun Rao 2013-01-09, 05:41
Re: kafka 0.8 producer throughput
Some folks came up with a cool hack in 0.8 that makes acks=0 send no
response. This changes the performance for small message sends to be
equivalent to 0.7. This is proposed for inclusion in 0.8. It would
obviously be less useful for the Java/Scala client in 0.9 if we are able to
properly pipeline requests, but it would still be a valid option for
non-Java clients that don't want to deal with the complexity of request
pipelining. JIRA is here for discussion:
  https://issues.apache.org/jira/browse/KAFKA-736

-Jay
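
A rough way to see why waiting for each ack limits a single producer, and why either dropping the response (acks=0) or pipelining requests helps, is a toy throughput model. This is a minimal sketch, not Kafka code: the function names, the 1 ms round trip, the 100 MB/s link cap, and the in-flight window are all assumptions chosen for illustration.

```python
def blocking_throughput(request_bytes, round_trip_ms):
    """Bytes/s when the producer waits for each ack before sending the next request."""
    return request_bytes * 1000 / round_trip_ms


def pipelined_throughput(request_bytes, round_trip_ms, max_in_flight, link_bytes_per_s):
    """Bytes/s with up to max_in_flight unacknowledged requests, capped by the link.

    A producer that never waits for a response (the acks=0 case discussed here)
    behaves like the limit of a very large max_in_flight: only the link caps it.
    """
    uncapped = max_in_flight * request_bytes * 1000 / round_trip_ms
    return min(uncapped, link_bytes_per_s)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    request_bytes = 20_000    # e.g. 200 messages of 100 bytes each
    round_trip_ms = 1         # assumed request/response round trip
    link = 100_000_000        # assumed usable link capacity in bytes/s

    print("blocking:  ", blocking_throughput(request_bytes, round_trip_ms))
    print("pipelined: ", pipelined_throughput(request_bytes, round_trip_ms, 10, link))
```

In this model the blocking producer is bound by round-trip time no matter how fast the broker is, which matches the point that broker throughput is not the limiting factor.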
On Wed, Jan 9, 2013 at 8:31 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:

> We haven't done a ton of performance work on 0.8 yet.
>
> Regardless, requiring the ack will certainly reduce per-producer
> throughput, but it is too early to say by how much. Obviously this won't
> impact broker throughput (so if you have many producers you may not notice).
>
> The plan to fix this is just to make the produce request non-blocking.
> This will allow the same kind of throughput we had before but still allow
> us to give you back an error response if you want it. The hope would be to
> make this change in 0.9.
>
> -Jay
>
>
> On Wed, Jan 9, 2013 at 8:24 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
>
>> In 0.8, ack is always required. The ack returns an errorcode that
>> indicates the reason if a produce request fails (e.g., the request is
>> sent to a broker that's not a leader). It also returns the offset of the
>> produced messages. However, the producer can choose when to receive the
>> acks (e.g., when data reaches 1 replica or all replicas). If the ack
>> indicates an error, the client can choose to retry. The retry logic is
>> built into our high level producer.
>>
>> Thanks,
>>
>> Jun
>>
>> On Wed, Jan 9, 2013 at 6:20 AM, S Ahmed <[EMAIL PROTECTED]> wrote:
>>
>> > What's the ack for? If it fails, will it try another broker? Can this
>> > be disabled, or would that be a major design change?
>> >
>> >
>> > On Wed, Jan 9, 2013 at 12:40 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
>> >
>> > > The 50MB/s number is for 0.7. We haven't carefully measured the
>> > > performance in 0.8 yet. We do expect the throughput that a single
>> > > producer can drive in 0.8 to be less. This is because the 0.8 producer
>> > > needs to wait for an RPC response from the broker, while in 0.7 there
>> > > is no ack for the producer. Nevertheless, 2MB/s seems low. Could you
>> > > try increasing the flush interval to something bigger, like 20000?
>> > >
>> > > Thanks,
>> > >
>> > > Jun
>> > >
>> > > On Tue, Jan 8, 2013 at 8:32 PM, Jun Guo -X (jungu - CIIC at Cisco) <
>> > > [EMAIL PROTECTED]> wrote:
>> > >
>> > > > According to the official Kafka documentation, producer throughput
>> > > > is about 50MB/s. But in my test, producer throughput is only about
>> > > > 2MB/s. The test environment is the same as the one the documentation
>> > > > describes: one producer, one broker, and one ZooKeeper, each on its
>> > > > own machine. Message size is 100 bytes, batch size is 200, and flush
>> > > > interval is 600 messages. The test environment is the same and the
>> > > > configuration is the same, so why is there such a big gap between my
>> > > > test result and what the documentation says?
>> > > >
>> > >
>> >
>>
>
>
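
The figures in the original report allow a quick back-of-envelope check. The sketch below assumes that each produce request carries one full batch of 200 messages of 100 bytes and that "MB" means 10^6 bytes. Neither assumption is confirmed in the thread, and the helper names are hypothetical. Under those assumptions it derives how much time each request would take at the observed and documented throughputs.

```python
MB = 1_000_000  # assumption: decimal megabytes


def messages_per_second(throughput_bytes_per_s, message_bytes):
    """Message rate implied by a byte throughput and a fixed message size."""
    return throughput_bytes_per_s / message_bytes


def implied_request_time_ms(throughput_bytes_per_s, message_bytes, messages_per_request):
    """Average time per request if requests are sent strictly one after another."""
    request_bytes = message_bytes * messages_per_request
    requests_per_s = throughput_bytes_per_s / request_bytes
    return 1000 / requests_per_s


if __name__ == "__main__":
    # Numbers taken from the thread: 100-byte messages, batch size 200.
    for label, throughput in (("observed 2MB/s", 2 * MB), ("documented 50MB/s", 50 * MB)):
        rate = messages_per_second(throughput, 100)
        per_request = implied_request_time_ms(throughput, 100, 200)
        print(f"{label}: {rate:.0f} msgs/s, {per_request:.1f} ms per request")
```

If the assumptions hold, the observed rate corresponds to roughly 10 ms per request, which is consistent with a producer that waits on something per request (an ack, or a disk flush on the broker) rather than with a saturated network.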