Blur >> mail # user >> Kafka-Blur Consumer for real time indexing Kafka messages


Kafka-Blur Consumer for real time indexing Kafka messages
Hi Aaron,

I recently wrote a Kafka-Blur Consumer for indexing real-time streams
into a Blur cluster, and have just pushed it to GitHub
(https://github.com/dibbhatt/kafka-blur-consumer).

This utility pulls messages from a Kafka cluster and indexes them into a
target Apache Blur cluster. The Kafka-Blur Consumer detects the number of
partitions for a Kafka topic and spawns that many threads, so the
partitions are indexed in parallel into the target Blur table. It indexes
through the Blur Thrift client's enqueueMutate, and it stores the latest
offset of the indexed messages in ZooKeeper, which lets it recover after a
failure. Let me know your views. Would it also be possible to push this
into Blur contrib?
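
As a rough illustration of the pattern described above (one worker thread
per partition, with the last indexed offset checkpointed so a restart
resumes where it left off), here is a minimal sketch. It is not the code
from the repository: the in-memory partition lists, the `OffsetStore` class
standing in for ZooKeeper, and the `index_fn` callback standing in for
Blur's enqueueMutate are all hypothetical names used only for illustration.

```python
import threading


class OffsetStore:
    """Hypothetical in-memory stand-in for the ZooKeeper offset store."""

    def __init__(self):
        self._lock = threading.Lock()
        self._offsets = {}

    def get(self, partition):
        # Next offset to read; 0 if the partition was never checkpointed.
        with self._lock:
            return self._offsets.get(partition, 0)

    def commit(self, partition, next_offset):
        with self._lock:
            self._offsets[partition] = next_offset


def consume_partition(partition, messages, store, index_fn):
    """Index one partition's messages, resuming from the stored offset."""
    offset = store.get(partition)
    while offset < len(messages):
        # index_fn is a stand-in for the real indexing call (assumption).
        index_fn(partition, offset, messages[offset])
        offset += 1
        # Checkpoint only after the message has been handed to the indexer.
        store.commit(partition, offset)


def run_consumer(topic, store, index_fn):
    """Spawn one thread per partition and wait for all of them to finish."""
    threads = [
        threading.Thread(target=consume_partition,
                         args=(partition, messages, store, index_fn))
        for partition, messages in topic.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


# Usage: a topic with two partitions, modelled as plain lists (assumption).
topic = {0: ["a", "b", "c"], 1: ["d", "e"]}
store = OffsetStore()
indexed = []
indexed_lock = threading.Lock()


def index_fn(partition, offset, message):
    with indexed_lock:
        indexed.append((partition, offset, message))


run_consumer(topic, store, index_fn)
```

Because the offset is committed after the indexing call, a crash between
the two steps in this sketch would replay that one message on restart, so
the indexing side would need to tolerate duplicates.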

Is there any progress on the persistent (HDFS-backed) queue that you
started working on some time back?

Regards,
Dibyendu

On Fri, Mar 14, 2014 at 7:15 AM, Aaron McCurry <[EMAIL PROTECTED]> wrote:
 