Kafka, mail # user - Question about Partitions not receive equal number messages


Helin Xiang 2013-04-17, 10:00
Re: Question about Partitions not receive equal number messages
Neha Narkhede 2013-04-17, 13:30
I suspect the threads are not each assigned an equal number of messages to
send. I don't think it matters whether you use one producer or more, as long
as you distribute the work among those threads equally.
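
One way to sanity-check this outside Kafka is a small stand-alone
simulation. The sketch below does not call any Kafka API; the class and
method names are hypothetical, and it assumes every thread gets the same
number of messages and picks a partition uniformly at random for each one.
Under those assumptions the per-partition counts should come out close to
equal, so a large skew would point at the partitioning or work distribution
rather than at the number of threads.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical stand-alone simulation; it does not use Kafka's Producer.
public class PartitionBalanceSketch {

    // Each of `threads` workers "sends" `messagesPerThread` messages and
    // picks a partition uniformly at random for every message.
    public static long[] simulate(int threads, int partitions, int messagesPerThread) {
        AtomicLongArray counts = new AtomicLongArray(partitions);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < messagesPerThread; i++) {
                    int p = ThreadLocalRandom.current().nextInt(partitions);
                    counts.incrementAndGet(p);
                }
            });
        }
        pool.shutdown();
        try {
            // Wait for all workers so the counts are complete before reading.
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        long[] result = new long[partitions];
        for (int p = 0; p < partitions; p++) {
            result[p] = counts.get(p);
        }
        return result;
    }

    public static void main(String[] args) {
        // 16 threads and 4 partitions, as in the setup described in the question.
        long[] counts = simulate(16, 4, 10_000);
        for (int p = 0; p < counts.length; p++) {
            System.out.println("partition " + p + ": " + counts[p]);
        }
    }
}
```

Comparing these counts with what your brokers actually receive may help
narrow down where the skew is introduced.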

Thanks,
Neha

On Wednesday, April 17, 2013, Helin Xiang wrote:

> Hi,
> We are using Kafka 0.7.2.
>
> The situation is a little complicated:
>
> 1. We use the Java API with multiple threads (about 16) to send logs to
> Kafka. Each thread has its own kafka.javaapi.producer.Producer object.
> 2. There is one topic with 4 partitions, and we send using random
> partitioning.
> 3. We generate messages for this topic at about 100 per second, so each
> thread only gets a few logs per second.
>
> But we find that the 4 partitions receive unbalanced data. Partition 0 gets
> about 10 times more logs than partitions 1, 2, and 3, which each get nearly
> equal numbers of messages.
>
> After that, we set the thread count to 1, and the imbalance vanished.
>
> We are not sure what happens inside the Java API of Producer.
> Could anyone explain it?
> Or is it necessary to create a new kafka.javaapi.producer.Producer object
> in each thread? I hear the kafka.javaapi.producer.Producer class is
> thread-safe, but I don't know whether one producer object can handle high
> throughput.
>
>
> THANKS
>
>
> --
> *Best Regards
>
> Xiang Helin*
>

 
王国栋 2013-04-17, 14:48