Kafka >> mail # user >> consuming only half the messages produced


Re: consuming only half the messages produced
In this case, you have a replication factor of 1. So, each partition has
only 1 replica.

There are two possibilities: (1) The producer didn't send all messages. (2)
The producer successfully sent all messages, but the consumer didn't
receive all of them. Could you first check the producer log and make sure
there are no errors? If there are none, check the consumer log. You should
see something like "Consumer ???? selected partitions : ". See if all
partitions are selected. Also, check whether there are any errors in the
consumer log.

To understand leader and isr, you can read the replication design in
http://kafka.apache.org/08/design.html
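
As a rough aid for reading the partition listing quoted below, here is a minimal sketch in plain Java (no Kafka dependency). The class name PartitionListing and its helpers are hypothetical, not part of Kafka, and the row format is assumed to be exactly the one pasted in this thread, with comma-separated broker ids when there is more than one replica. It treats "leader" as the broker id serving the partition, "replicas" as the broker ids holding a copy, and "isr" as the replicas currently in sync.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading rows such as:
//   topic: unittest-test-msg  partition: 0  leader: 0  replicas: 0  isr: 0
public class PartitionListing {

    // One parsed row of the listing.
    public static final class Info {
        public final String topic;
        public final int partition;
        public final int leader;             // broker id of the leader
        public final List<Integer> replicas; // broker ids holding a copy
        public final List<Integer> isr;      // broker ids currently in sync

        Info(String topic, int partition, int leader,
             List<Integer> replicas, List<Integer> isr) {
            this.topic = topic;
            this.partition = partition;
            this.leader = leader;
            this.replicas = replicas;
            this.isr = isr;
        }

        // Replication factor = number of brokers listed under "replicas".
        public int replicationFactor() {
            return replicas.size();
        }

        // True when the leader's broker id appears in the isr list.
        public boolean leaderInSync() {
            return isr.contains(leader);
        }
    }

    // Assumed row layout; broker id lists are assumed to be comma-separated.
    private static final Pattern ROW = Pattern.compile(
        "topic:\\s*(\\S+)\\s+partition:\\s*(\\d+)\\s+leader:\\s*(-?\\d+)"
        + "\\s+replicas:\\s*([\\d,]*)\\s+isr:\\s*([\\d,]*)");

    public static Info parse(String line) {
        Matcher m = ROW.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("unrecognized row: " + line);
        }
        return new Info(m.group(1),
                        Integer.parseInt(m.group(2)),
                        Integer.parseInt(m.group(3)),
                        ids(m.group(4)),
                        ids(m.group(5)));
    }

    // Splits "0,1" into [0, 1]; an empty string yields an empty list.
    private static List<Integer> ids(String csv) {
        List<Integer> out = new ArrayList<>();
        for (String s : csv.split(",")) {
            if (!s.isEmpty()) {
                out.add(Integer.parseInt(s));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] rows = {
            "topic: unittest-test-msg  partition: 0  leader: 0  replicas: 0  isr: 0",
            "topic: unittest-test-msg  partition: 1  leader: 1  replicas: 1  isr: 1",
            "topic: unittest-test-msg  partition: 2  leader: 0  replicas: 0  isr: 0",
            "topic: unittest-test-msg  partition: 3  leader: 1  replicas: 1  isr: 1",
        };
        for (String row : rows) {
            Info p = parse(row);
            System.out.println("partition " + p.partition
                + ": leader=broker " + p.leader
                + ", replicationFactor=" + p.replicationFactor()
                + ", leaderInSync=" + p.leaderInSync());
        }
    }
}
```

Each of the four rows from this thread lists a single broker id under "replicas", so the sketch reports a replication factor of 1 for every partition, with brokers 0 and 1 each leading two partitions.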

Thanks,

Jun
On Thu, May 2, 2013 at 6:32 AM, Rob Withers <[EMAIL PROTECTED]> wrote:

> Regarding the partitions created:
> >         topic: unittest-test-msg        partition: 0    leader: 0    replicas: 0     isr: 0
> >         topic: unittest-test-msg        partition: 1    leader: 1    replicas: 1     isr: 1
> >         topic: unittest-test-msg        partition: 2    leader: 0    replicas: 0     isr: 0
> >         topic: unittest-test-msg        partition: 3    leader: 1    replicas: 1     isr: 1
>
> Is this saying that partitions 1 and 3, both on the same broker, are both
> the Leader partitions AND the Replica partitions?  I do not understand the
> significance of ISR, either.   How should this be read?
>
> Keeping in mind that these were auto-created, unsure of the implications, I
> will be carefully reading the following, next:
> https://cwiki.apache.org/KAFKA/kafka-replication.html.
>
> thanks,
> rob
>
> > -----Original Message-----
> > From: Rob Withers [mailto:[EMAIL PROTECTED]]
> > Sent: Thursday, May 02, 2013 7:20 AM
> > To: '[EMAIL PROTECTED]'
> > Subject: RE: consuming only half the messages produced
> >
> > Yes, I mean we can only consume half the messages produced.  I followed the
> > high-level consumer example here:
> > https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example.
> >
> > Let me give a more complete scenario:
> >
> > - We run 3 zookeepers
> > - We run 2 brokers
> > - We do not have a topic defined, but we have enabled topic auto-creation
> > (with a replication factor of 2? must check this)
> > - We connect the producer to both brokers (pocmsg5:9092,pocmsg6:9092)
> > - We stuff the topic into the KeyedMessage key with no Partitioner.  I was not
> > aware of the use of the key until last night.
> > - We generate 10 messages
> > - Topic auto-creation results in the following partitions:
> >         topic: unittest-test-msg        partition: 0    leader: 0    replicas: 0     isr: 0
> >         topic: unittest-test-msg        partition: 1    leader: 1    replicas: 1     isr: 1
> >         topic: unittest-test-msg        partition: 2    leader: 0    replicas: 0     isr: 0
> >         topic: unittest-test-msg        partition: 3    leader: 1    replicas: 1     isr: 1
> > - We construct a single Kafka stream by calling createStreams with a
> > zookeeper (pocmsg5:2181) and one thread
> >         public <K,V> Map<String, List<KafkaStream<K,V>>>
> > createMessageStreams(
> >                         Map<String, Integer> topicCountMap,
> >                         Decoder<K> keyDecoder,
> >                         Decoder<V> valueDecoder)
> > - We consume only half the messages
> > - It looks as if partitions 0 and 2 are on pocmsg5, while partitions 1 and 3 are
> > on pocmsg6.
> >
> > Is it best to view the situation as 2 partitions, each a leader, with a replica
> > follower for each?
> > Which partitions are leaders and which are replicas?
> > What happened with auto-creation and production and partitioning?
> > Which partition(s) is the zookeeper pointing the high-level consumer to read
> > from?
> >
> > thanks,
> > rob
> >
> > > -----Original Message-----
> > > From: Jun Rao [mailto:[EMAIL PROTECTED]]
> > > Sent: Wednesday, May 01, 2013 11:15 PM
> > > To: [EMAIL PROTECTED]
> > > Subject: Re: consuming only half the messages produced
> > >
> > > Partition is different from replicas. A topic can have one or more

 