
Kafka-Blur Consumer for real time indexing Kafka messages
Hi Aaron,

I recently wrote a Kafka-Blur Consumer for indexing real-time streams
into a Blur cluster, and I have just pushed it to GitHub
(https://github.com/dibbhatt/kafka-blur-consumer).

This utility pulls messages from a Kafka cluster and indexes them into a
target Apache Blur cluster. The Kafka-Blur Consumer detects the number of
partitions for a Kafka topic and spawns that many threads, so that the
partitions are indexed in parallel into the target Blur table. It uses the
Blur Thrift client's enqueueMutate call. It also stores the latest offset
of the indexed messages in ZooKeeper, which lets it recover after a
failure. Let me know your views. Would it be possible to push this into
Blur contrib?
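
To make the thread-per-partition and offset-checkpoint idea above concrete, here is a minimal Java sketch. It is not the code from the repository: `OffsetStore`, `Indexer`, and `consume` are hypothetical in-memory stand-ins for ZooKeeper, the Blur table, and the Kafka fetch loop, so the example runs without any cluster.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: in-memory stand-ins replace Kafka, Blur, and ZooKeeper.
class PartitionIndexerSketch {

    // Stand-in for the ZooKeeper offset store: partition -> next offset to read.
    static final class OffsetStore {
        private final Map<Integer, Long> offsets = new ConcurrentHashMap<>();
        long load(int partition) { return offsets.getOrDefault(partition, 0L); }
        void save(int partition, long nextOffset) { offsets.put(partition, nextOffset); }
    }

    // Stand-in for the Blur table; a real consumer would enqueue a mutation here.
    static final class Indexer {
        private final List<String> indexed = Collections.synchronizedList(new ArrayList<>());
        void index(String message) { indexed.add(message); }
        int count() { return indexed.size(); }
    }

    // Starts one worker per partition; each worker resumes from its stored offset.
    static void consume(List<List<String>> partitions, OffsetStore store, Indexer indexer) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        for (int p = 0; p < partitions.size(); p++) {
            final int partition = p;
            pool.execute(() -> {
                List<String> log = partitions.get(partition);
                for (long offset = store.load(partition); offset < log.size(); offset++) {
                    indexer.index(log.get((int) offset));
                    // Checkpoint only after the message has been handed to the indexer.
                    store.save(partition, offset + 1);
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        List<List<String>> topic = List.of(
                List.of("a0", "a1", "a2"),
                List.of("b0", "b1"));
        OffsetStore store = new OffsetStore();
        store.save(0, 2); // pretend partition 0 was indexed up to offset 2 before a crash
        Indexer indexer = new Indexer();
        consume(topic, store, indexer);
        System.out.println("indexed=" + indexer.count()
                + " p0=" + store.load(0) + " p1=" + store.load(1));
    }
}
```

Because the offset is saved only after a message has been indexed, a crash between the two steps in this sketch would re-index that one message on restart, so it gives at-least-once rather than exactly-once behaviour.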

Is there any progress on the persistent (HDFS-backed) queue that you
started working on some time back?


On Fri, Mar 14, 2014 at 7:15 AM, Aaron McCurry <[EMAIL PROTECTED]> wrote: