Kafka >> mail # user >> High level consumer Blocked when there is still message in topic


李帅 2013-11-13, 04:42
Jun Rao 2013-11-13, 05:28
hsy541@...> 2013-11-13, 19:54
Re: High level consumer Blocked when there is still message in topic
On Wed, Nov 13, 2013 at 11:54:07AM -0800, [EMAIL PROTECTED] wrote:
> Since you have a cluster, why not distribute the consumers across different
> nodes instead of threads? I think that's the only way to scale up with
> Kafka.

Depending on your CPU specs, you should be able to add threads to scale
out. But yes, if you want to scale out even further, you would want more
instances on more nodes, provided there are enough partitions to
load-balance across them.
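
The "sufficient partitions" caveat can be made concrete with a small model.
The sketch below is an illustration only, not Kafka's rebalance code: it
assumes a range-style split in which partitions are divided into contiguous
blocks across the sorted consumer threads. The class and method names are
hypothetical. It shows that with 10 partitions, any thread beyond the tenth
ends up owning nothing.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model only; names are hypothetical and this is not Kafka's code. */
final class RangeAssignmentSketch {

    /**
     * Splits partitions 0..numPartitions-1 into contiguous ranges, one per
     * consumer thread. The first (numPartitions % numThreads) threads each
     * receive one extra partition. Assumes threads are already sorted by id.
     */
    static List<List<Integer>> assign(int numPartitions, int numThreads) {
        if (numPartitions < 0 || numThreads <= 0) {
            throw new IllegalArgumentException("need numPartitions >= 0 and numThreads > 0");
        }
        int perThread = numPartitions / numThreads;
        int extra = numPartitions % numThreads;
        List<List<Integer>> result = new ArrayList<>();
        for (int t = 0; t < numThreads; t++) {
            int start = perThread * t + Math.min(t, extra);
            int count = perThread + (t < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int p = start; p < start + count; p++) {
                owned.add(p);
            }
            result.add(owned);
        }
        return result;
    }

    public static void main(String[] args) {
        // 10 partitions, as in the original question; thread counts are examples.
        int[] threadCounts = {4, 10, 12};
        for (int threads : threadCounts) {
            System.out.println(threads + " threads -> " + assign(10, threads));
        }
    }
}
```

Under this model, adding threads or nodes helps only up to the partition
count; past that point the extra consumers sit idle until the topic gets
more partitions.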

> Question here: if there are more and more high-level consumers, is there a
> bottleneck on the zookeeper?

The high-level consumer depends heavily on ZooKeeper, so yes, there is a
bottleneck there, especially if each consumer consumes a lot of topics.
This will be addressed to a large degree by the client rewrite, namely
the consumer coordinator approach
(https://cwiki.apache.org/confluence/display/KAFKA/Client+Rewrite#ClientRewrite-ConsumerAPI)
and in-built offset management
(https://cwiki.apache.org/confluence/display/KAFKA/Inbuilt+Consumer+Offset+Management).
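
To get a rough feel for the ZooKeeper load from offset commits alone, here
is a back-of-the-envelope sketch. It assumes, as a simplification, that each
owned partition's offset is written to ZooKeeper once per
auto.commit.interval.ms; it ignores rebalancing, watches, and session
traffic. The class name and the larger partition count are hypothetical.

```java
/** Rough estimate only; the one-write-per-partition-per-interval model is an assumption. */
final class ZkOffsetWriteEstimate {

    /**
     * Estimated ZooKeeper offset writes per second when every owned partition
     * commits its offset once per auto-commit interval.
     */
    static double writesPerSecond(int totalOwnedPartitions, long autoCommitIntervalMs) {
        if (totalOwnedPartitions < 0 || autoCommitIntervalMs <= 0) {
            throw new IllegalArgumentException("need partitions >= 0 and interval > 0");
        }
        return totalOwnedPartitions * 1000.0 / autoCommitIntervalMs;
    }

    public static void main(String[] args) {
        // The setup from this thread: 10 partitions, auto.commit.interval.ms = 1000.
        System.out.println("10 partitions, 1s interval: " + writesPerSecond(10, 1000));
        // A hypothetical larger deployment: 500 partitions across many topics.
        System.out.println("500 partitions, 1s interval: " + writesPerSecond(500, 1000));
        // The same 500 partitions with a 60s commit interval.
        System.out.println("500 partitions, 60s interval: " + writesPerSecond(500, 60000));
    }
}
```

Under this model the write rate grows linearly with the number of
partitions consumed, which is why many topics per consumer make the
bottleneck more visible; a longer commit interval lowers the rate at the
cost of more re-delivered messages after a crash.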

Joel
>
>
> On Tue, Nov 12, 2013 at 9:27 PM, Jun Rao <[EMAIL PROTECTED]> wrote:
>
> > What's the max lag (reported in JMX) in the consumer? Can the consumer keep
> > up with the incoming data rate?
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Tue, Nov 12, 2013 at 7:19 PM, 李帅 <[EMAIL PROTECTED]> wrote:
> >
> > > Hi,
> > >
> > >    I use the Kafka 0.8 high-level consumer to read messages from a topic
> > > stream with 3 replicas and 10 partitions.
> > >
> > >    When I read the stream with 10 threads and let it run for some time
> > > (an hour or a day), some threads block at
> > > m_stream.iterator().hasNext(), even though the partition still has lots
> > > of messages.
> > >
> > >    I checked the consumer's fetch.message.max.bytes and the broker's
> > > message.max.bytes; no message is larger than these values.
> > >
> > >    The consumer configuration is:
> > >    props.put("zookeeper.session.timeout.ms", "4000");
> > >    props.put("zookeeper.sync.time.ms", "200");
> > >    props.put("auto.commit.interval.ms", "1000");
> > >
> > >
> > >    Please give me some advice on how to avoid the consumer blocking.
> > >
> > >    Is there a configuration parameter that can fix this problem?
> > >
> > > Thanks!
> > >
> > > XiaoTian
> > >
> >
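
For readers debugging a similar stall: the 0.8 consumer setting
consumer.timeout.ms (default -1, meaning block indefinitely) makes
hasNext() throw a ConsumerTimeoutException when no message arrives within
the given interval. This does not fix whatever is causing the stall, but it
lets a thread log the event instead of hanging silently. The sketch below
only builds the Properties object, so it runs without the Kafka jar; the
class name, ZooKeeper address, group id, and the 5000 ms value are
hypothetical.

```java
import java.util.Properties;

/** Builds consumer properties only; it does not connect to Kafka. */
final class ConsumerPropsSketch {

    static Properties build(String zkConnect, String groupId, long consumerTimeoutMs) {
        Properties props = new Properties();
        props.put("zookeeper.connect", zkConnect);
        props.put("group.id", groupId);
        // Settings quoted in the original question.
        props.put("zookeeper.session.timeout.ms", "4000");
        props.put("zookeeper.sync.time.ms", "200");
        props.put("auto.commit.interval.ms", "1000");
        // Not in the original configuration: bounds how long hasNext() may block.
        props.put("consumer.timeout.ms", Long.toString(consumerTimeoutMs));
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical connection values, for illustration only.
        Properties props = build("localhost:2181", "example-group", 5000L);
        System.out.println("consumer.timeout.ms = " + props.getProperty("consumer.timeout.ms"));
        System.out.println("auto.commit.interval.ms = " + props.getProperty("auto.commit.interval.ms"));
    }
}
```

With the Kafka 0.8 client on the classpath, these properties would be passed
to a ConsumerConfig, and the consuming loop would catch
ConsumerTimeoutException around hasNext() to log the stall and the thread's
partition before retrying.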
 