Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Kafka, mail # user - Non-blocking Kafka stream iterators


+
Ryan LeCompte 2013-01-21, 03:15
+
Jun Rao 2013-01-21, 06:16
+
Ryan LeCompte 2013-01-21, 06:18
+
navneet sharma 2013-01-21, 07:03
+
Jun Rao 2013-01-21, 16:06
+
Jay Kreps 2013-01-21, 17:27
+
Evan Chan 2013-01-22, 17:38
+
Jay Kreps 2013-01-22, 18:16
+
Evan Chan 2013-01-22, 18:53
+
Jay Kreps 2013-01-22, 20:47
+
Evan Chan 2013-01-22, 20:57
Copy link to this message
-
Re: Non-blocking Kafka stream iterators
Chris Riccomini 2013-01-22, 22:35
Hey Guys,

One other potentially large benefit is to decouple broker dependencies
from consumer/producer dependencies. This makes upgrading the
consumer/producer and managing jar conflicts a lot less of a hassle.
Putting the consumer and producer in their own packages might hopefully
alleviate some of this. I'm not sure how much the broker is pulling in
that the consumer/producer aren't using, but it might be worth a look, if
there are a lot of jars that only the broker is using.

Cheers,
Chris

On 1/22/13 12:57 PM, "Evan Chan" <[EMAIL PROTECTED]> wrote:

>Hi Jay,
>
>Actually, it's mostly the ability to easily cross-build;   also the ease
>of
>understanding the code (less code to grok) and implementing alternatives
>(I
>guess all of those falls under cleanliness).
>
>thanks,
>Evan
>
>
>On Tue, Jan 22, 2013 at 12:47 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>
>> Hi Evan,
>>
>> Makes sense. Is your goal in separating the client shrinking the jar
>>size?
>> or just general cleanliness?
>>
>> -Jay
>>
>>
>> On Tue, Jan 22, 2013 at 10:53 AM, Evan Chan <[EMAIL PROTECTED]> wrote:
>>
>> > Jay,
>> >
>> > Comments inlined.
>> >
>> > On Tue, Jan 22, 2013 at 10:15 AM, Jay Kreps <[EMAIL PROTECTED]>
>>wrote:
>> >
>> > > Hey Evan,
>> > >
>> > > Great points, some comments:
>> > > - Not sure if I understand what you mean by separating consumer and
>> main
>> > > logic.
>> > >
>> >
>> > I just meant having a separate Scala/Java client jar, so it's more
>> > lightweight and easier to build independently.... kind of like the
>> > consumers for the other languages.
>> >
>> >
>> > > - Yes, cross-building, I think this is in progress now for kafka as
>>a
>> > whole
>> > > so it should be in either 0.8 or 0.8.1
>> > > - Yes, forgot to mention offset initialization, but that is
>>definitely
>> > > needed.
>> > >
>> > > For the hasNext functionality, even that is not very good since if
>>you
>> > have
>> > > two streams and want to take the next message from either you would
>> have
>> > to
>> > > busy wait calling hasNext on both in a loop.
>> > >
>> > > An alternative would be something like
>> > > val client = new ConsumerClient(topics, config)
>> > > client.select(timeout: Long): Iterator[MessageAndMetadata]
>> > >
>> > > This method would have no internal threading. It would
>>scatter-gather
>> > over
>> > > the topic/partitions assigned to this consumer (whether they are
>> > statically
>> > > or dynamically assigned would be specified in the config). The
>>select
>> > call
>> > > would internally just do an epoll/select on all the connections and
>> > return
>> > > the first message set it gets back or an empty list if it hits the
>> > timeout
>> > > and no one has responded.
>> > >
>> >
>> > Hm, I like that API actually.  It would definitely be more flexible.
>> >
>> >
>> > >
>> > > This api is less intuitive then the blocking iterator, but more
>> flexible
>> > > and enables a better, faster implementation. There would be no
>>threads
>> > > aside from the client's thread. It allows non-blocking or blocking
>> > > consumption. And it generalizes easily to consuming from many
>> > > topics/partitions simultaneously.
>> > >
>> > > We could implement an iterator like wrapper for this to ease the
>> > transition
>> > > that just used this api under the covers.
>> > >
>> > > Anyhow this is a ways out, and we haven't really had any proposals
>>or
>> > > discussions on it, but this is what I was thinking.
>> > >
>> > > -Jay
>> > >
>> > >
>> > >
>> > >
>> > > On Tue, Jan 22, 2013 at 9:37 AM, Evan Chan <[EMAIL PROTECTED]> wrote:
>> > >
>> > > > Jay,
>> > > >
>> > > > For the consumer:
>> > > > - Separation of the consumer logic from the main logic
>> > > > - Making it easier to build the consumer for different versions of
>> > Scala
>> > > > (say 2.10)
>> > > > - Make it easier to read from any offset you want, while being
>>able
>> to
>> > > keep
>> > > > partition management features
>> > > > - Better support for Akka and other non-blocking / event-based
>> > frameworks
 
+
Neha Narkhede 2013-01-23, 05:30
+
Ryan LeCompte 2013-01-22, 18:55