Re: Kafka producer behavior
Thanks, Gerrit!

I am able to use that now.
On Wed, Dec 18, 2013 at 10:44 AM, Gerrit Jansen van Vuuren <
[EMAIL PROTECTED]> wrote:

> Hi,
>
> This is a gotcha with Kafka producer partitioning: you must send the
> messages with a non-null key.
> If the key is null, Kafka will not call the partitioner.
>
> Because the key does not matter with this partitioner, you can pass in a
> constant string such as "1".
>
> Oh, one more thing, on performance:
>
> The producer's send method has a synchronized block on the producer
> instance, which means performance goes down the drain.
> On a 12-core machine with 72 GB of RAM, I could only get 13K tps out of
> a single producer. One way around this is to instantiate an array/list of N
> producers and round-robin over them in your send code.
> I got to 80K tps (for my use case) using 6 producer instances from a single
> box sending to 3 Kafka servers.
>
> e.g.
>
>
> send(msg) {
>   // producerIndex is an atomic counter shared by all sending threads
>   producers[producerIndex.getAndIncrement() % producerCount].send(msg)
> }
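
A minimal, self-contained sketch of the round-robin idea in the pseudocode above. It does not use the Kafka client: each `Consumer<M>` is a stand-in for one producer instance, and the class name `RoundRobinSender` is hypothetical. The only change from the pseudocode is `Math.floorMod`, which I am assuming is wanted so the index stays non-negative once the counter overflows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical helper: spreads send() calls over N independent senders so
// that no single sender's lock is shared by every calling thread.
final class RoundRobinSender<M> {
    private final List<Consumer<M>> senders;              // stand-ins for producer instances
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSender(List<Consumer<M>> senders) {
        if (senders.isEmpty()) {
            throw new IllegalArgumentException("need at least one sender");
        }
        this.senders = new ArrayList<>(senders);
    }

    void send(M msg) {
        // floorMod keeps the index in [0, size) even after the int counter wraps.
        int i = Math.floorMod(next.getAndIncrement(), senders.size());
        senders.get(i).accept(msg);
    }

    public static void main(String[] args) {
        int[] hits = new int[3];
        List<Consumer<String>> senders = new ArrayList<>();
        for (int n = 0; n < hits.length; n++) {
            final int slot = n;
            senders.add(msg -> hits[slot]++);              // count messages per sender
        }
        RoundRobinSender<String> pool = new RoundRobinSender<>(senders);
        for (int n = 0; n < 9; n++) {
            pool.send("msg-" + n);
        }
        System.out.println(Arrays.toString(hits));         // prints [3, 3, 3]
    }
}
```

With real producers, each list entry would wrap one producer's send call. The right number of instances depends on the workload; the figure of 6 above is what worked for that one use case.
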
>
> Regards,
>  Gerrit
>
>
> On Wed, Dec 18, 2013 at 11:24 AM, Hanish Bansal <
> [EMAIL PROTECTED]> wrote:
>
> > Thanks for the response, Gerrit and Guozhang!
> >
> > Hi Gerrit,
> >
> > I am trying to use the round robin partitioner you shared, but no
> > luck: round robin partitioning is still not working.
> >
> > I have successfully registered RoundRobinPartitioner in the Kafka producer.
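
For readers following along, registration with the 0.8 producer is normally done through the `partitioner.class` property. The sketch below only builds the `Properties` object, so it runs without the Kafka jar. The broker host names are made up, and the partitioner's package name is an assumption inferred from the GitHub path quoted later in this thread.

```java
import java.util.Properties;

// Hypothetical helper that assembles Kafka 0.8 producer settings.
final class ProducerProps {
    static Properties build() {
        Properties props = new Properties();
        // Hypothetical broker addresses; replace with your own.
        props.put("metadata.broker.list", "broker1:9092,broker2:9092");
        // String encoder that ships with Kafka 0.8.
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // Assumed package name, inferred from the repository path.
        props.put("partitioner.class", "pseidon.kafka.util.RoundRobinPartitioner");
        return props;
    }

    public static void main(String[] args) {
        build().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

These properties would then be handed to `new ProducerConfig(props)`. Note that seeing the partitioner's constructor run only confirms registration; as explained above in this thread, `partition` is not called for messages sent with a null key.
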
> >
> > The code of the RoundRobinPartitioner class is:
> >
> >     public RoundRobinPartitioner(VerifiableProperties props) {
> >         log.info("Using Round Robin Partitioner class...");
> >     }
> >
> >     @Override
> >     public int partition(String key, int partitions) {
> >         log.info("Inside partition method");
> >         int i = counter.getAndIncrement();
> >         if (i == Integer.MAX_VALUE) {
> >             counter.set(0);
> >             return 0;
> >         } else {
> >             return i % partitions;
> >         }
> >     }
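
A side note on the counter reset in the `partition` method above: the `getAndIncrement()` and the later `counter.set(0)` are two separate steps, so under concurrent use another thread could, as far as I can tell, read a negative counter value in between, and `i % partitions` would then be negative. One possible alternative, sketched here outside of Kafka with a hypothetical class name `RoundRobinIndex`, is to let the counter wrap and use `Math.floorMod`.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: thread-safe round-robin index that tolerates int overflow.
final class RoundRobinIndex {
    private final AtomicInteger counter;

    RoundRobinIndex(int start) {
        this.counter = new AtomicInteger(start);
    }

    // Returns a value in [0, partitions), including after the counter wraps
    // from Integer.MAX_VALUE to Integer.MIN_VALUE.
    int next(int partitions) {
        return Math.floorMod(counter.getAndIncrement(), partitions);
    }

    public static void main(String[] args) {
        // Start just below the overflow point to exercise the wrap-around.
        RoundRobinIndex idx = new RoundRobinIndex(Integer.MAX_VALUE - 1);
        for (int n = 0; n < 4; n++) {
            System.out.println(idx.next(2));
        }
    }
}
```

The trade-off is that when the partition count is not a power of two, the sequence can repeat or skip one position at the moment of wrap-around, which seems harmless for load spreading.
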
> >
> > When I produce data, the first log message, "Using Round Robin Partitioner
> > class...", is printed, but the second message, "Inside partition method", is
> > not printed.
> >
> > From that we can conclude that RoundRobinPartitioner has been successfully
> > registered, but the round robin logic is not being called.
> >
> > Any help figuring out what I am missing?
> >
> > Thanks in advance!
> >
> >
> >
> > On Tue, Dec 17, 2013 at 5:59 PM, Guozhang Wang <[EMAIL PROTECTED]> wrote:
> >
> > > Hello,
> > >
> > > This is a known issue, tracked in this JIRA:
> > >
> > > https://issues.apache.org/jira/browse/KAFKA-1067
> > >
> > > Guozhang
> > >
> > >
> > > On Tue, Dec 17, 2013 at 8:48 AM, Gerrit Jansen van Vuuren <
> > > [EMAIL PROTECTED]> wrote:
> > >
> > > > Hi,
> > > >
> > > > I've had the same issue with the Kafka producer.
> > > >
> > > > You need to use a different partitioner than the default one provided
> > > > with Kafka.
> > > > I've created a round robin partitioner that works well for
> > > > distributing data equally across partitions.
> > > >
> > > > https://github.com/gerritjvv/pseidon/blob/master/pseidon-kafka/java/pseidon/kafka/util/RoundRobinPartitioner.java
> > > >
> > > > On Tue, Dec 17, 2013 at 5:32 PM, Hanish Bansal <
> > > > [EMAIL PROTECTED]> wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > We have a Kafka cluster of 2 nodes (using the 0.8.0 final release).
> > > > > Replication factor: 2
> > > > > Number of partitions: 2
> > > > >
> > > > > I have created a topic "test-topic1" in Kafka.
> > > > >
> > > > > When I list the status of that topic using bin/kafka-list-topic.sh,
> > > > > the status is:
> > > > >
> > > > > topic: test-topic1    partition: 0    leader: 0    replicas: 0,1    isr: 0,1
> > > > > topic: test-topic1    partition: 1    leader: 1    replicas: 1,0    isr: 1,0
> > > > >
> > > > > As both partitions are on two separate nodes, when we produce the

*Thanks & Regards*
*Hanish Bansal*

 