Kafka, mail # user - one consumerConnector or many?


Re: one consumerConnector or many?
Chris Curtin 2013-05-29, 14:53
That's a good question about # of sockets when a single consumer is
connecting. I'll let someone from LinkedIn comment if each consumer has a
socket per topic/partition or if it is per Broker, since I'm not familiar
with that part of the code.
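
To make the possibilities concrete, here is a minimal Python sketch of the arithmetic, using the figures from the quoted message below. The function names are hypothetical, and none of the three models is confirmed here as what the 0.8 consumer actually does; the sketch only shows how far apart the counts are.

```python
def sockets_per_topic_partition(topics: int, partitions_per_topic: int) -> int:
    """Count if every topic/partition gets its own socket."""
    return topics * partitions_per_topic


def sockets_per_topic_and_broker(topics: int, brokers: int) -> int:
    """Count if each topic opens one socket to each broker."""
    return topics * brokers


def sockets_per_broker(brokers: int) -> int:
    """Count if one socket per broker is shared by all topics."""
    return brokers


# Small case from the quoted message: 3 topics, 3 brokers, 3 or 6 partitions.
small_3 = sockets_per_topic_partition(topics=3, partitions_per_topic=3)
small_6 = sockets_per_topic_partition(topics=3, partitions_per_topic=6)

# Large hypothetical case: 100 topics, 16 brokers, 200 partitions per topic.
large_by_partition = sockets_per_topic_partition(topics=100, partitions_per_topic=200)
large_by_topic_broker = sockets_per_topic_and_broker(topics=100, brokers=16)
large_by_broker = sockets_per_broker(brokers=16)

print(small_3, small_6)
print(large_by_partition, large_by_topic_broker, large_by_broker)
```

The spread between the models in the large case is what makes the per-partition versus per-broker question worth answering.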

On Wed, May 29, 2013 at 9:53 AM, Withers, Robert <[EMAIL PROTECTED]> wrote:

> Thanks for the info.  Are you saying that even with a single connector,
> with say 3 topics and 3 threads per topic and 3 brokers with 3 partitions
> for all 3 topics on all 3 brokers, that a consumer box would have 9 sockets
> open?  What if there are 6 partitions per topic, would that be 18 open
> sockets?
>
> I have read somewhere that a high partition number, per topic, is
> desirable, to scale out the consumers and to be prepared to dynamically
> scale out consumption during a traffic spike.  Is it so?  100 topics, with
> 16 brokers and 200 partitions per topic with 1 consumer connector (just
> hypothetically so) would be 1600 sockets or 20000 sockets?
>
> For sure these boxes have plenty of ports.  I am just thinking through
> possible failures and what flexibility we have in configuration of
> producers/consumers to topics.  Really the question is best practices in
> this area.  A producer server handling 100+ msg types could also connect
> quite a bit.  So, perhaps it is best to restrict producer and consumer
> servers to process a restricted "class" of types.  Certainly if the
> producer is also hosting a web server, but perhaps not as dire on the
> consumer side.
>
> thanks,
> rob
> ________________________________________
> From: Chris Curtin [[EMAIL PROTECTED]]
> Sent: Wednesday, May 29, 2013 7:36 AM
> To: users
> Subject: Re: one consumerConnector or many?
>
> I'd look at a variation of #2. Can your messages be grouped into a 'class'
> (for lack of a better term) of messages that are consumed together? For
> example, a 'class' of 'auditing events' or 'sensor events'. The idea would
> then be to have a topic per 'class'.
>
> A couple of benefits to this:
> - you can size the resources spent consuming each 'class' by its value. So
> the 'audit' topic may only get a 2-threaded consumer while the 'sensor'
> class gets a 10-threaded consumer.
> - you can stop processing a 'class' of messages if you need to without
> taking all the consumers off line (Assuming you have different processors
> or a way while running to alter your number of threads per topic.)
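
The two benefits above can be illustrated with a minimal Python sketch. It does not use Kafka at all: the thread counts, the `process` function, and the pool-per-class layout are assumptions made for illustration, standing in for one consumer per 'class' topic.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical thread budget per message 'class' (one topic per class).
THREADS_PER_CLASS = {"audit": 2, "sensor": 10}

# One independent worker pool per class, sized by its budget.
pools = {
    name: ThreadPoolExecutor(max_workers=count)
    for name, count in THREADS_PER_CLASS.items()
}


def process(event_class: str, message: str) -> str:
    """Placeholder for the real per-class handling logic."""
    return f"{event_class}:{message}"


first = pools["sensor"].submit(process, "sensor", "temp=21")
print(first.result())

# Stop only the 'audit' class; the 'sensor' pool keeps accepting work.
pools["audit"].shutdown(wait=True)
second = pools["sensor"].submit(process, "sensor", "temp=22")
print(second.result())

pools["sensor"].shutdown(wait=True)
```

Because each class owns its pool, shutting one down (for example, to deploy new handler code) leaves the others running.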
>
> Since it sounds like you may be frequently adding new message types, this
> approach also allows you to shut down only part of your processing when you
> add the new code to handle the message.
>
> Finally, why the concern about socket use? A well-configured Windows or
> Linux machine can have thousands of open sockets without problems. Since
> 0.8.0 only connects to the Broker that holds the topic/partition, you end
> up with 1 socket per topic/partition per consumer.
>
> Hope this helps,
>
> Chris
>
>
> On Wed, May 29, 2013 at 9:13 AM, Rob Withers <[EMAIL PROTECTED]> wrote:
>
> > In thinking about the design of consumption, we have in mind a generic
> > consumer server which would consume from more than one message type.  The
> > handling of each type of message would be different.  I suppose we could
> > have upwards of say 50 different message types, eventually, maybe 100+
> > different types.  Which of the following designs would be best and why
> > would the other options be bad?
> >
> >
> >
> > 1)      Have all message types go through one topic and use a dispatcher
> > pattern to select the correct handler.  Use one consumerConnector.
> >
> > 2)      Use a different topic for each message type, but still use one
> > consumerConnector and a dispatcher pattern.
> >
> > 3)      Use a different topic for each message type and have a separate
> > consumerConnector for each topic.
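
The dispatcher pattern named in options 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Dispatcher` class, the handler names, and the string keys are hypothetical, and no Kafka API is involved. In option 1 the key would be a type field carried inside each message; in option 2 it would be the topic name.

```python
from typing import Callable, Dict

Handler = Callable[[bytes], str]


class Dispatcher:
    """Routes a payload to the handler registered for its key."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, key: str, handler: Handler) -> None:
        self._handlers[key] = handler

    def dispatch(self, key: str, payload: bytes) -> str:
        handler = self._handlers.get(key)
        if handler is None:
            # Unknown types fail loudly instead of being dropped silently.
            raise ValueError(f"no handler registered for {key!r}")
        return handler(payload)


def handle_audit(payload: bytes) -> str:
    return "audited " + payload.decode("utf-8")


def handle_sensor(payload: bytes) -> str:
    return "recorded " + payload.decode("utf-8")


dispatcher = Dispatcher()
dispatcher.register("audit", handle_audit)
dispatcher.register("sensor", handle_sensor)

print(dispatcher.dispatch("audit", b"login"))
print(dispatcher.dispatch("sensor", b"temp=21"))
```

Adding a new message type then means registering one more handler, whichever of the three designs supplies the key.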
> >
> >
> >
> > I am struggling with whether my assumptions are correct.  It seems that a
> > single connector for a topic would establish one socket to each broker,