Kafka >> mail # user >> Re: one consumerConnector or many?


Jun Rao 2013-05-30, 03:36
Rob Withers 2013-05-29, 13:14
Chris Curtin 2013-05-29, 13:36
Jun Rao 2013-05-29, 15:58
Re: one consumerConnector or many?
That's a good question about the number of sockets opened when a single
consumer connects. I'll let someone from LinkedIn comment on whether each
consumer has a socket per topic/partition or per Broker, since I'm not
familiar with that part of the code.
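
To make the two possibilities concrete, here is a small arithmetic sketch of the socket counts each model would imply for the scenarios quoted below. It does not say which model Kafka's consumer actually follows; that is the open question. The `Cluster` type and `socket_estimates` function are hypothetical names, and the sketch assumes "partitions per topic" means the total across all brokers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    topics: int
    brokers: int
    partitions_per_topic: int  # assumed to be the total across all brokers


def socket_estimates(c: Cluster) -> dict:
    """Return the socket count implied by each candidate model.

    These are illustrative models only, not a statement of Kafka's behavior.
    """
    return {
        # One connection per broker, shared by all topics and partitions.
        "per_broker": c.brokers,
        # One connection per (topic, broker) pair; a topic cannot span
        # more brokers than it has partitions.
        "per_topic_and_broker": c.topics * min(c.brokers, c.partitions_per_topic),
        # One connection per topic/partition.
        "per_topic_partition": c.topics * c.partitions_per_topic,
    }


small = socket_estimates(Cluster(topics=3, brokers=3, partitions_per_topic=3))
large = socket_estimates(Cluster(topics=100, brokers=16, partitions_per_topic=200))

print(small)  # {'per_broker': 3, 'per_topic_and_broker': 9, 'per_topic_partition': 9}
print(large)  # {'per_broker': 16, 'per_topic_and_broker': 1600, 'per_topic_partition': 20000}
```

The 1600 and 20000 figures in the question correspond to the per-(topic, broker) and per-topic/partition models respectively; a purely per-Broker model would give 16.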

On Wed, May 29, 2013 at 9:53 AM, Withers, Robert <[EMAIL PROTECTED]> wrote:

> Thanks for the info.  Are you saying that even with a single connector,
> given, say, 3 topics, 3 threads per topic, and 3 brokers with 3 partitions
> for all 3 topics on all 3 brokers, a consumer box would have 9 sockets
> open?  What if there are 6 partitions per topic: would that be 18 open
> sockets?
>
> I have read somewhere that a high number of partitions per topic is
> desirable, both to scale out the consumers and to be prepared to
> dynamically scale out consumption during a traffic spike.  Is that so?
> Would 100 topics, with 16 brokers, 200 partitions per topic, and 1 consumer
> connector (just hypothetically) mean 1600 sockets or 20000 sockets?
>
> For sure these boxes have plenty of ports.  I am just thinking through
> possible failures and what flexibility we have in mapping
> producers/consumers to topics.  Really, the question is about best
> practices in this area.  A producer server handling 100+ msg types could
> also open quite a few connections.  So perhaps it is best to restrict each
> producer and consumer server to a restricted "class" of types.  That
> certainly applies if the producer is also hosting a web server; it is
> perhaps less critical on the consumer side.
>
> thanks,
> rob
> ________________________________________
> From: Chris Curtin [[EMAIL PROTECTED]]
> Sent: Wednesday, May 29, 2013 7:36 AM
> To: users
> Subject: Re: one consumerConnector or many?
>
> I'd look at a variation of #2. Can your messages be grouped into 'classes'
> (for lack of a better term) that are consumed together? For example, a
> 'class' of 'auditing events' or 'sensor events'. The idea would then be to
> have one topic per 'class'.
>
> A couple of benefits to this:
> - you can allocate resources to each 'class' according to its value. So the
> 'audit' topic may only get a 2-threaded consumer while the 'sensor' class
> gets a 10-threaded consumer.
> - you can stop processing a 'class' of messages if you need to without
> taking all the consumers offline (assuming you have different processors,
> or a way to alter the number of threads per topic while running).
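
As an illustration of the two benefits above, here is a minimal sketch that gives each 'class' its own worker pool, sized independently, so one class can be stopped while the others keep running. It uses only the Python standard library and no Kafka client; the class names, thread counts, and the `ClassWorkers` helper are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sizing: a low-value class gets few threads, a busy one gets more.
THREADS_PER_CLASS = {"audit": 2, "sensor": 10}


class ClassWorkers:
    """One independent thread pool per message class."""

    def __init__(self, threads_per_class):
        self._pools = {
            name: ThreadPoolExecutor(max_workers=n, thread_name_prefix=name)
            for name, n in threads_per_class.items()
        }

    def submit(self, message_class, handler, message):
        # Raises KeyError if the class was never configured or has been stopped.
        return self._pools[message_class].submit(handler, message)

    def stop(self, message_class):
        # Drain and stop one class without touching the others.
        self._pools.pop(message_class).shutdown(wait=True)

    def stop_all(self):
        for name in list(self._pools):
            self.stop(name)

    def running(self):
        return sorted(self._pools)


workers = ClassWorkers(THREADS_PER_CLASS)
future = workers.submit("sensor", lambda m: f"processed {m}", "reading-1")
print(future.result())    # processed reading-1
workers.stop("audit")     # e.g. to deploy new audit-handling code
print(workers.running())  # ['sensor']
workers.stop_all()
```

In a real consumer the pool sizes would correspond to the number of consumer threads requested per topic; how that maps onto the connector is left out here on purpose.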
>
> Since it sounds like you may be frequently adding new message types, this
> approach also lets you shut down only part of your processing when you
> add the new code to handle a message.
>
> Finally, why the concern about socket use? A well-configured Windows or
> Linux machine can have thousands of open sockets without problems. Since
> 0.8.0 only connects to the Broker that hosts the topic/partition, you end
> up with 1 socket per topic/partition and consumer.
>
> Hope this helps,
>
> Chris
>
>
> On Wed, May 29, 2013 at 9:13 AM, Rob Withers <[EMAIL PROTECTED]> wrote:
>
> > In thinking about the design of consumption, we have in mind a generic
> > consumer server which would consume more than one message type.  The
> > handling of each type of message would be different.  I suppose we could
> > eventually have upwards of, say, 50 different message types, maybe 100+.
> > Which of the following designs would be best, and why would the other
> > options be bad?
> >
> >
> >
> > 1)      Have all message types go through one topic and use a dispatcher
> > pattern to select the correct handler.  Use one consumerConnector.
> >
> > 2)      Use a different topic for each message type, but still use one
> > consumerConnector and a dispatcher pattern.
> >
> > 3)      Use a different topic for each message type and have a separate
> > consumerConnector for each topic.
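
The dispatcher pattern mentioned in options 1 and 2 can be sketched as a lookup from a key to a handler. This is a minimal, Kafka-free illustration: in option 1 the key would be a type field carried with each message, in option 2 it would be the topic name. The `Dispatcher` class, the handler functions, and the key names are all hypothetical.

```python
from typing import Callable, Dict

Handler = Callable[[bytes], str]


class Dispatcher:
    """Route each payload to the handler registered for its key."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, key: str, handler: Handler) -> None:
        self._handlers[key] = handler

    def dispatch(self, key: str, payload: bytes) -> str:
        handler = self._handlers.get(key)
        if handler is None:
            raise KeyError(f"no handler registered for {key!r}")
        return handler(payload)


dispatcher = Dispatcher()
dispatcher.register("audit", lambda p: "audit:" + p.decode())
dispatcher.register("sensor", lambda p: "sensor:" + p.decode())

# Option 1 style: one topic, the message itself carries its type.
print(dispatcher.dispatch("audit", b"login"))  # audit:login

# Option 2 style: the topic name the message arrived on is the key.
print(dispatcher.dispatch("sensor", b"42.5"))  # sensor:42.5
```

Either way the dispatch step is the same; the options differ in where the key comes from and in how many topics and connectors sit in front of it.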
> >
> >
> >
> > I am struggling with whether my assumptions are correct.  It seems that a
> > single connector for a topic would establish one socket to each broker,

 