Kafka, mail # user - Re: processing a batch of messages in a "transaction" - 2013-11-22, 04:06
Re: processing a batch of messages in a "transaction"
Imran,

Remember, too, that different threads will always be processing different
sets of partitions.  No two threads will ever own the same partition
simultaneously.

A consumer connector can own many partitions (split among its threads),
each with its own offset.  So yes, as you say, it is complicated to commit
coherently when you want to commit batches of messages while using
multiple threads.

In this case, you would need to make sure that a commit happens only after
all threads have successfully processed a batch of messages (but no more),
and are all waiting for a single commit to start and finish.
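
A minimal sketch of that coordination, using a plain java.util.concurrent
CyclicBarrier rather than anything Kafka-specific.  The class name, the run()
helper and the processBatch/commit callbacks are hypothetical names made up
for illustration; with the 0.8 high-level consumer the commit callback would
presumably wrap ConsumerConnector.commitOffsets(), but that wiring is an
assumption and is not shown.  Failure handling is deliberately left out: if a
worker dies mid-batch, the others would wait at the barrier forever, so real
code would need a timeout or similar.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

class BatchCommitSketch {

    /**
     * Starts `threads` workers that each process `batches` batches.
     * After every batch, each worker waits at a barrier; the commit
     * callback runs once per round, only when all workers have arrived.
     * Returns the number of commits performed.
     */
    static int run(int threads, int batches,
                   Runnable processBatch, Runnable commit)
            throws InterruptedException {
        AtomicInteger commits = new AtomicInteger();

        // The barrier action runs exactly once per round, in the last
        // thread to arrive, while every other worker is still blocked
        // in await(). So no worker can run ahead during the commit.
        CyclicBarrier barrier = new CyclicBarrier(threads, () -> {
            commit.run();
            commits.incrementAndGet();
        });

        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    for (int b = 0; b < batches; b++) {
                        processBatch.run(); // one batch, but no more
                        barrier.await();    // wait for the shared commit
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        return commits.get();
    }

    public static void main(String[] args) throws InterruptedException {
        int n = run(3, 4,
                () -> { /* stand-in for processing one batch */ },
                () -> System.out.println("commit"));
        System.out.println("commits performed: " + n);
    }
}
```

The point of the barrier is that the commit sees a consistent state: every
thread has finished exactly the same number of batches, and none of them
consumes further messages until the commit has finished.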

So it may be easier to avoid multiple threads altogether and instead limit
the number of partitions/topics that a single thread works on.

It is all pretty complicated, but I think that is in the name of high
throughput and performance.  There is clearly room for refactoring, though
(coming in 0.9!).

Jason
On Thu, Nov 21, 2013 at 2:51 PM, Imran Rashid <[EMAIL PROTECTED]> wrote:
 