Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Kafka, mail # user - Partitions and Streams

Copy link to this message
Partitions and Streams
Milind Parikh 2012-02-22, 08:06
I have a situation where one consumer cannot consume the data fast enough
from the producer.

So in the broker, I create two partitions for the topic. I then create two
consumers in two seperate jvms. Both consumers have topicCountMap = 2 and
partition 0 for consumer1 and partition 1 for consumer2. Both are running
before the producer starts.

Now when I run the default producer (in java example), I can see that one
of the consumer doesn't get all of the messages (which it shouldn't because
it should get roughly half). Now the second consumer is seemingly stuck
until I do CTRL^C on the first consumer. Then the pentup flush happens on
the second consumer and then the second consumer gets all the data that the
first one did not get.

public void run() {
    Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
*    topicCountMap.put(topic, new Integer(2));
*    Map<String, List<KafkaMessageStream<Message>>> consumerMap consumer.createMessageStreams(topicCountMap);
    KafkaMessageStream<Message> stream =  consumerMap.get(topic).get(*
    ConsumerIterator<Message> it = stream.iterator();
Shouldn't the consuming happen in parallel without needing to due a CRTL^C
(or an equivalent of a timeout) ? In other words, how can I get it to
parallel process. Is there something to do with