Kafka, mail # user - Question about Partitions not receive equal number messages


Question about Partitions not receive equal number messages
Helin Xiang 2013-04-17, 10:00
Hi,
We are using Kafka 0.7.2.

The situation is a little complicated:

1. We use the Java API with multiple threads (about 16) to send logs to
Kafka. Each thread has its own kafka.javaapi.producer.Producer object.
2. There is one topic, configured with 4 partitions. We use random
partitioning when sending.
3. We generate messages for this topic at about 100 per second, so each
thread handles only a few logs per second.

But we find that the 4 partitions receive unbalanced data. Partition 0 gets
about 10 times more logs than partitions 1, 2, and 3, while partitions 1, 2,
and 3 each receive a nearly equal number of messages.
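
For comparison, here is a minimal standalone sketch of what we would expect
from purely random partitioning. It makes no Kafka calls at all: the class
name RandomPartitionSim and the simulate method are hypothetical, and it
assumes each thread picks a partition uniformly with its own
java.util.Random. It only shows that a uniform random choice across 16
threads should stay balanced, so it does not reproduce what the real
Producer does internally.

```java
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical simulation: no Kafka classes are used here.
public class RandomPartitionSim {

    // Runs `threads` workers; each picks a uniformly random partition for
    // every simulated message and increments that partition's counter.
    public static long[] simulate(final int threads, final int partitions,
                                  final int messagesPerThread) {
        final AtomicLongArray counts = new AtomicLongArray(partitions);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                // One Random per thread, mirroring one producer per thread.
                Random random = new Random();
                for (int i = 0; i < messagesPerThread; i++) {
                    counts.incrementAndGet(random.nextInt(partitions));
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("simulation timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("simulation interrupted", e);
        }
        long[] result = new long[partitions];
        for (int p = 0; p < partitions; p++) {
            result[p] = counts.get(p);
        }
        return result;
    }

    public static void main(String[] args) {
        // 16 threads and 4 partitions, as in the setup described above.
        long[] counts = simulate(16, 4, 10000);
        for (int p = 0; p < counts.length; p++) {
            System.out.println("partition " + p + ": " + counts[p]);
        }
    }
}
```

In this simulation every partition ends up with roughly a quarter of the
messages, so the 10x skew toward partition 0 presumably comes from somewhere
other than the random choice itself.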

After that, we reduced the number of threads to 1, and the imbalance
disappeared.

We are not sure what happens inside the Java Producer API.
Could anyone explain it?
Or is it necessary to create a new kafka.javaapi.producer.Producer object
in each thread? I have heard that the kafka.javaapi.producer.Producer class
is thread-safe, but I don't know whether a single producer object can handle
high throughput.
Thanks.
--
*Best Regards

Xiang Helin*