Re: Non-blocking Kafka stream iterators
Great points, some comments:
- Not sure if I understand what you mean by separating consumer and main
- Yes, cross-building, I think this is in progress now for kafka as a whole
so it should be in either 0.8 or 0.8.1
- Yes, forgot to mention offset initialization, but that is definitely
For the hasNext functionality, even that is not very good since if you have
two streams and want to take the next message from either you would have to
busy wait calling hasNext on both in a loop.
An alternative would be something like
val client = new ConsumerClient(topics, config)
client.select(timeout: Long): Iterator[MessageAndMetadata]
This method would have no internal threading. It would scatter-gather over
the topic/partitions assigned to this consumer (whether they are statically
or dynamically assigned would be specified in the config). The select call
would internally just do an epoll/select on all the connections and return
the first message set it gets back or an empty list if it hits the timeout
and no one has responded.
This api is less intuitive then the blocking iterator, but more flexible
and enables a better, faster implementation. There would be no threads
aside from the client's thread. It allows non-blocking or blocking
consumption. And it generalizes easily to consuming from many
We could implement an iterator like wrapper for this to ease the transition
that just used this api under the covers.
Anyhow this is a ways out, and we haven't really had any proposals or
discussions on it, but this is what I was thinking.
On Tue, Jan 22, 2013 at 9:37 AM, Evan Chan <[EMAIL PROTECTED]> wrote: