Kafka >> mail # user >> Partition election on consumer


Partition election on consumer
Hi,

we are currently facing a "problem" on our consumer cluster that may have a
simple solution. Nevertheless, I have not been able to find it yet.

Description of the problem:
1 Kafka topic with 24 partitions (Kafka version 0.8 Beta1).
2 or more consumers in the same consumer group. Each consumer processes its
partitions by aggregating topic data into a relational database. Each
consumer keeps the aggregation data in a local hash until it is committed to
the relational database. After the commit to the database, the
consumerConnector commits the offsets to Kafka.

The problem is: if I connect a new consumer, the consumerConnector
recalculates which partitions each consumer instance reads from. Because of
the local aggregation within the consumer, this currently causes our system
to process topic data twice.

Is there any way to catch the new-partition-selection event in the
consumerConnector, so that we can commit the offsets and data before
reconnecting to the new partitions?
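
To make the question concrete, here is a minimal sketch of the behaviour we
are after, written in Python purely for illustration (our real consumers use
the consumerConnector). Everything in it is an assumption: the class
RebalanceAwareAggregator, the callables store_batch and commit_offsets, and
above all the hook on_partitions_revoked are hypothetical names. Whether the
consumerConnector in 0.8 Beta1 can trigger such a hook is exactly what is
being asked.

```python
class RebalanceAwareAggregator:
    """Buffers aggregates locally and flushes them before partitions move.

    Hypothetical sketch: store_batch writes a batch to the relational
    database, commit_offsets commits consumer offsets. Both are supplied
    by the caller and are not part of any Kafka API.
    """

    def __init__(self, store_batch, commit_offsets):
        self._store_batch = store_batch
        self._commit_offsets = commit_offsets
        self._totals = {}   # (partition, key) -> locally aggregated value
        self._offsets = {}  # partition -> offset of the last aggregated message

    def add(self, partition, offset, key, value):
        """Fold one consumed message into the local aggregate."""
        slot = (partition, key)
        self._totals[slot] = self._totals.get(slot, 0) + value
        self._offsets[partition] = offset

    def flush(self):
        """Write the data first, then commit the offsets, then reset."""
        if not self._offsets:
            return  # nothing buffered, nothing to commit
        self._store_batch(dict(self._totals))
        self._commit_offsets(dict(self._offsets))
        self._totals.clear()
        self._offsets.clear()

    def on_partitions_revoked(self):
        """Hypothetical hook, called before partitions are reassigned."""
        self.flush()
```

If something like on_partitions_revoked could be invoked before the
partitions are reassigned, the buffered aggregates and their offsets would be
committed together, and the consumer that takes over a partition would not
re-read data that has already been aggregated.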

Thanks in advance
Markus

--
Markus Roder
Distelweg 4
97318 Kitzingen
Mail: [EMAIL PROTECTED]
Profile: http://gplus.to/markusroder

 