Kafka, mail # user - Re: Producer not distributing across all partitions


Re: Producer not distributing across all partitions
Guozhang Wang 2013-09-13, 22:05
Hello Joe,

The reasons we make producers send to a fixed partition within each
metadata-refresh interval are described in the following issues:

https://issues.apache.org/jira/browse/KAFKA-1017

https://issues.apache.org/jira/browse/KAFKA-959

In short, the randomness is still preserved across intervals, but within one
metadata-refresh interval the assignment is fixed.
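
To make the behavior concrete, here is a minimal simulation sketch. It is not
Kafka's producer code: the class and function names are hypothetical, and it
simply assumes that a producer sending null-key messages picks one partition
at random and keeps it until the refresh interval has elapsed.

```python
import random
from collections import Counter


class StickyRandomPartitioner:
    """Illustrative model only (hypothetical name); not Kafka's implementation."""

    def __init__(self, num_partitions, refresh_interval_ms, rng=None):
        self.num_partitions = num_partitions
        self.refresh_interval_ms = refresh_interval_ms
        self.rng = rng or random.Random()
        self._current = None        # partition chosen for the current interval
        self._chosen_at_ms = None   # time at which that partition was chosen

    def partition_for_null_key(self, now_ms):
        # Pick a new random partition only when no choice exists yet
        # or the refresh interval has elapsed since the last choice.
        expired = (
            self._chosen_at_ms is None
            or now_ms - self._chosen_at_ms >= self.refresh_interval_ms
        )
        if expired:
            self._current = self.rng.randrange(self.num_partitions)
            self._chosen_at_ms = now_ms
        return self._current


def simulate(duration_ms, send_every_ms, refresh_interval_ms,
             num_partitions=4, seed=0):
    """Count messages per partition for one simulated producer."""
    partitioner = StickyRandomPartitioner(
        num_partitions, refresh_interval_ms, random.Random(seed)
    )
    counts = Counter()
    for now_ms in range(0, duration_ms, send_every_ms):
        counts[partitioner.partition_for_null_key(now_ms)] += 1
    return counts


if __name__ == "__main__":
    five_min = 5 * 60 * 1000
    ten_min = 10 * 60 * 1000
    # Interval longer than the run: every message lands on one partition.
    print(simulate(five_min, 1000, ten_min))
    # Interval of one second: a new random pick for every message sent.
    print(simulate(five_min, 1000, 1000))
```

Under these assumptions, a run shorter than the refresh interval puts all
messages on a single partition, while a shorter interval spreads them out
sooner.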

I agree that the document should be updated accordingly.

Guozhang
On Fri, Sep 13, 2013 at 1:48 PM, Joe Stein <[EMAIL PROTECTED]> wrote:

> Isn't this a bug?
>
> I don't see why we would want users to have to code and generate random
> partition keys to randomly distribute the data to partitions; that is
> Kafka's job, isn't it?
>
> Or, if a null key is supplied, tell the user this is not supported (throw an
> exception) in KeyedMessage, as we do for topic, rather than treating null as
> a key to hash?
>
> My preference is to put those three lines back in, let the key be null, and
> give folks randomness, unless it's not a bug and there is a good reason for
> it?
>
> Is there something about
> https://issues.apache.org/jira/browse/KAFKA-691 that requires the lines
> to be taken out? I haven't had a chance to look through
> it yet.
>
> My thought is that a new person coming in would expect to see the
> partitions filling up in a round-robin fashion, as our docs say. So either
> we force them in the API to know they have to do this themselves, or we give
> them the ability for this to happen when passing nothing in.
>
> /*******************************************
>  Joe Stein
>  Founder, Principal Consultant
>  Big Data Open Source Security LLC
>  http://www.stealth.ly
>  Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> ********************************************/
>
>
> On Fri, Sep 13, 2013 at 4:17 PM, Drew Goya <[EMAIL PROTECTED]> wrote:
>
> > I ran into this problem as well Prashant.  The default partition key was
> > recently changed:
> >
> > https://github.com/apache/kafka/commit/b71e6dc352770f22daec0c9a3682138666f032be
> >
> > It no longer assigns a random partition to data with a null partition key.
> > I had to change my code to generate random partition keys to get the
> > randomly distributed behavior the producer used to have.
> >
> >
> > On Fri, Sep 13, 2013 at 11:42 AM, prashant amar <[EMAIL PROTECTED]>
> > wrote:
> >
> > > Thanks Neha
> > >
> > > I will try applying this property and circle back.
> > >
> > > Also, I have been attempting to execute kafka-producer-perf-test.sh and I
> > > receive the following error:
> > >
> > >        Error: Could not find or load main class kafka.perf.ProducerPerformance
> > >
> > > I am running against 0.8.0-beta1
> > >
> > > Seems like perf is a separate project in the workspace.
> > >
> > > Does sbt package-assembly bundle the perf jar as well?
> > >
> > > Neither producer-perf-test nor consumer-test works with this build.
> > >
> > >
> > >
> > > On Fri, Sep 13, 2013 at 9:56 AM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
> > >
> > > > As Jun suggested, one reason could be that the
> > > > topic.metadata.refresh.interval.ms is too high. Did you observe if the
> > > > distribution improves after topic.metadata.refresh.interval.ms has
> > > > passed?
> > > >
> > > > Thanks
> > > > Neha
> > > >
> > > >
> > > > On Fri, Sep 13, 2013 at 4:47 AM, prashant amar <[EMAIL PROTECTED]>
> > > > wrote:
> > > >
> > > > > I am using Kafka version 0.8 ...
> > > > >
> > > > >
> > > > > On Thu, Sep 12, 2013 at 8:44 PM, Jun Rao <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > Which revision of 0.8 are you using? In a recent change, a producer
> > > > > > will stick to a partition for topic.metadata.refresh.interval.ms
> > > > > > (defaults to 10 mins) before picking another partition at random.
> > > > > > Thanks,
> > > > > > Jun
> > > > > >
> > > > > >
> > > > > > On Thu, Sep 12, 2013 at 1:56 PM, prashant amar <[EMAIL PROTECTED]>
> > > > > > wrote:
> > > > > >
> > > > > > > I created a topic with 4 partitions and for some reason the