Kafka, mail # user - Kafka producer behavior


Re: Kafka producer behavior
Hanish Bansal 2013-12-19, 09:54
Thanks Gerrit !!

I am able to use that now.
On Wed, Dec 18, 2013 at 10:44 AM, Gerrit Jansen van Vuuren <
[EMAIL PROTECTED]> wrote:

> Hi,
>
> this is a gotcha about kafka producer partitioning: you must send the
> messages with a non-null key.
> If the key is null, kafka will not call the partitioner.
>
> Because the key does not matter with this partitioner, you can pass in a
> constant string like "1".
>
> Oh one more thing, on performance:
>
> The producer's send method has a synchronized block on the producer
> instance, which means performance goes down the drain.
> I could only get 13K tps out of the producer on a 12-core machine with
> 72 GB of RAM. A way to solve this is to instantiate an array/list of N
> producers and then round robin over the producers in your send code.
> I got to 80K tps (for my use case) using 6 producer instances from a single
> box sending to 3 kafka servers.
>
> e.g.
>
>
> send ( msg ) {
>   // pseudocode: pick the next producer in round-robin order
>   producers[ producerIndex.getAndIncrement() % producerCount ].send(msg)
> }
>
> Regards,
>  Gerrit
>
>
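A minimal Java sketch of the round-robin-over-producers idea Gerrit describes above. It is not Kafka code: `RoundRobinSender` is a hypothetical name, and each producer is modelled as a plain `java.util.function.Consumer` so the example compiles without the Kafka jar. `Math.floorMod` is used instead of `%` so the index stays non-negative once the counter wraps past Integer.MAX_VALUE.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical wrapper: spreads send() calls over N independent senders so
// that calling threads do not all contend on one producer's lock.
final class RoundRobinSender<M> {
    // Stand-ins for real producer instances (assumption for this sketch).
    private final List<Consumer<M>> producers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSender(List<Consumer<M>> producers) {
        if (producers.isEmpty()) {
            throw new IllegalArgumentException("need at least one producer");
        }
        this.producers = new ArrayList<>(producers);
    }

    void send(M msg) {
        // floorMod keeps the index in [0, size) even after the counter overflows.
        int i = Math.floorMod(next.getAndIncrement(), producers.size());
        producers.get(i).accept(msg);
    }
}
```

In real code each `Consumer` would presumably delegate to one producer instance's send call; the number of instances to use (6 in the message above) depends on the workload.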
> On Wed, Dec 18, 2013 at 11:24 AM, Hanish Bansal <
> [EMAIL PROTECTED]> wrote:
>
> > Thanks for the response, Gerrit and Guozhang !!
> >
> > Hi Gerrit,
> >
> > I am trying to use the same round robin partitioner you shared but, hard
> > luck, round robin partitioning is still not working.
> >
> > I have successfully registered RoundRobinPartitioner in the kafka producer.
> >
> > The code of the RoundRobinPartitioner class is:
> >
> >     public RoundRobinPartitioner(VerifiableProperties props) {
> >         log.info("Using Round Robin Partitioner class...");
> >     }
> >
> >     @Override
> >     public int partition(String key, int partitions) {
> >         log.info("Inside partition method");
> >         int i = counter.getAndIncrement();
> >         if (i == Integer.MAX_VALUE) {
> >             counter.set(0);
> >             return 0;
> >         } else {
> >             return i % partitions;
> >         }
> >     }
> >
> > When I produce the data, the first log message "Using Round Robin
> > Partitioner class..." is printed but the second message "Inside partition
> > method" is not printed.
> >
> > From that we can conclude that RoundRobinPartitioner has been successfully
> > registered but the round robin logic is not getting called.
> >
> > Any help to resolve what I am missing ?
> >
> > Thanks in advance !!
> >
> >
> >
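The partition method quoted above resets its counter by hand when it reaches Integer.MAX_VALUE. A minimal alternative sketch, assuming only that the goal is an even rotation over partition indexes: let the counter overflow and map it with `Math.floorMod`. `RoundRobinIndex` is a hypothetical standalone class; it deliberately does not implement Kafka's partitioner interface, so it compiles without the Kafka jar.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Standalone sketch of round-robin partition selection (not a Kafka class).
final class RoundRobinIndex {
    private final AtomicInteger counter = new AtomicInteger();

    // The key is ignored here. Per the thread, it only needs to be non-null
    // for Kafka 0.8.0 to call a custom partitioner at all.
    int partition(Object key, int numPartitions) {
        // floorMod stays in [0, numPartitions) even when the counter is negative.
        return Math.floorMod(counter.getAndIncrement(), numPartitions);
    }
}
```

With two partitions, successive calls rotate 0, 1, 0, 1 regardless of the key passed in.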
> > On Tue, Dec 17, 2013 at 5:59 PM, Guozhang Wang <[EMAIL PROTECTED]> wrote:
> >
> > > Hello,
> > >
> > > This is a known issue, tracked in this JIRA:
> > >
> > > https://issues.apache.org/jira/browse/KAFKA-1067
> > >
> > > Guozhang
> > >
> > >
> > > On Tue, Dec 17, 2013 at 8:48 AM, Gerrit Jansen van Vuuren <
> > > [EMAIL PROTECTED]> wrote:
> > >
> > > > hi,
> > > >
> > > > I've had the same issue with the kafka producer.
> > > >
> > > > you need to use a different partitioner than the default one
> > > > provided for kafka.
> > > > I've created a round robin partitioner that works well for equally
> > > > distributing data across partitions.
> > > >
> > > >
> > > > https://github.com/gerritjvv/pseidon/blob/master/pseidon-kafka/java/pseidon/kafka/util/RoundRobinPartitioner.java
> > > >
> > > > On Tue, Dec 17, 2013 at 5:32 PM, Hanish Bansal <
> > > > [EMAIL PROTECTED]> wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > We have a kafka cluster of 2 nodes (using the 0.8.0 final release).
> > > > > Replication Factor: 2
> > > > > Number of partitions: 2
> > > > >
> > > > > I have created a topic "test-topic1" in kafka.
> > > > >
> > > > > When I list the status of that topic using bin/kafka-list-topic.sh,
> > > > > the status is:
> > > > >
> > > > > topic: test-topic1    partition: 0    leader: 0       replicas: 0,1    isr: 0,1
> > > > > topic: test-topic1    partition: 1    leader: 1       replicas: 1,0    isr: 1,0
> > > > >
> > > > > As both partition are on two separate nodes so when we produce the

*Thanks & Regards*
*Hanish Bansal*