Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Kafka >> mail # user >> Non-blocking Kafka stream iterators


+
Ryan LeCompte 2013-01-21, 03:15
+
Jun Rao 2013-01-21, 06:16
+
Ryan LeCompte 2013-01-21, 06:18
+
navneet sharma 2013-01-21, 07:03
+
Jun Rao 2013-01-21, 16:06
+
Jay Kreps 2013-01-21, 17:27
+
Evan Chan 2013-01-22, 17:38
Copy link to this message
-
Re: Non-blocking Kafka stream iterators
Hey Evan,

Great points, some comments:
- Not sure if I understand what you mean by separating consumer and main
logic.
- Yes, cross-building, I think this is in progress now for kafka as a whole
so it should be in either 0.8 or 0.8.1
- Yes, forgot to mention offset initialization, but that is definitely
needed.

For the hasNext functionality, even that is not very good since if you have
two streams and want to take the next message from either you would have to
busy wait calling hasNext on both in a loop.

An alternative would be something like
val client = new ConsumerClient(topics, config)
client.select(timeout: Long): Iterator[MessageAndMetadata]

This method would have no internal threading. It would scatter-gather over
the topic/partitions assigned to this consumer (whether they are statically
or dynamically assigned would be specified in the config). The select call
would internally just do an epoll/select on all the connections and return
the first message set it gets back or an empty list if it hits the timeout
and no one has responded.

This api is less intuitive then the blocking iterator, but more flexible
and enables a better, faster implementation. There would be no threads
aside from the client's thread. It allows non-blocking or blocking
consumption. And it generalizes easily to consuming from many
topics/partitions simultaneously.

We could implement an iterator like wrapper for this to ease the transition
that just used this api under the covers.

Anyhow this is a ways out, and we haven't really had any proposals or
discussions on it, but this is what I was thinking.

-Jay
On Tue, Jan 22, 2013 at 9:37 AM, Evan Chan <[EMAIL PROTECTED]> wrote:

> Jay,
>
> For the consumer:
> - Separation of the consumer logic from the main logic
> - Making it easier to build the consumer for different versions of Scala
> (say 2.10)
> - Make it easier to read from any offset you want, while being able to keep
> partition management features
> - Better support for Akka and other non-blocking / event-based frameworks
> (instead of a timeout, implement true hasNext functionality, for example)
>
> thanks,
> Evan
>
>
> On Mon, Jan 21, 2013 at 9:27 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>
> > It's worth mentioning that we are interested in exploring potential
> > generalizations of the producer and consumer API, but as a practical
> matter
> > most of the committers are working on getting a stable 0.8 release out
> the
> > door. So an improved consumer and producer api would be a 0.9 feature.
> >
> > If you have a concrete thing you are trying to do now that is awkward it
> > would be great to hear about the use case.
> >
> > Possible goals of improving the apis and client impls would include the
> > following:
> >
> > Producer:
> > 1. Include the offset in the information returned to the producer
> > 2. Pipeline producer requests to improve throughput for synchronous
> > production
> >
> > Consumer
> > 1. Simplify api while supporting various advanced use cases like
> > multi-stream consumption
> > 2. Make partition assignment optional and server-side (this is currently
> > the difference between the zk consumer and the simple consumer)
> > 3. Make offset management optional
> > 4. Remove threading from the consumer
> > 5. Simplify consumer memory management
> >
> > -Jay
> >
> >
> >
> >
> > On Mon, Jan 21, 2013 at 8:05 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
> >
> > > No, but you can implement it in your application.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Sun, Jan 20, 2013 at 11:02 PM, navneet sharma <
> > > [EMAIL PROTECTED]> wrote:
> > >
> > > > Is there any property to make consumer work for lets say only 10 mins
> > (ie
> > > > some kind of timer)
> > > > So, i want to close the consumer after 10 mins reading from broker..
> > > >
> > > > Thanks,
> > > > Navneet Sharma
> > > >
> > > >
> > > > On Mon, Jan 21, 2013 at 11:48 AM, Ryan LeCompte <[EMAIL PROTECTED]>
> > > > wrote:
> > > >
> > > > > Perfect. Thanks Jun!

 
+
Evan Chan 2013-01-22, 18:53
+
Jay Kreps 2013-01-22, 20:47
+
Evan Chan 2013-01-22, 20:57
+
Chris Riccomini 2013-01-22, 22:35
+
Neha Narkhede 2013-01-23, 05:30
+
Ryan LeCompte 2013-01-22, 18:55