Blur, mail # user - Kafka-Blur Consumer for real time indexing Kafka messages - 2014-04-13, 17:27

Kafka-Blur Consumer for real time indexing Kafka messages
Hi Aaron,

I recently wrote a Kafka-Blur consumer for indexing real-time streams
into a Blur cluster, and have just pushed it to GitHub
(https://github.com/dibbhatt/kafka-blur-consumer).

This utility pulls messages from a Kafka cluster and indexes them into a
target Apache Blur cluster. The consumer detects the number of partitions
for a Kafka topic and spawns that many threads, so the partitions are
indexed in parallel into the target Blur table. It uses the Blur Thrift
client's enqueueMutate call, and it stores the offset of the latest indexed
message in ZooKeeper so that it can recover after a failure. Let me know
what you think. Would it be possible to push this into Blur contrib?
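
To illustrate the design described above (one indexing thread per partition, with the next offset saved after each message so a restart resumes where it left off), here is a minimal self-contained sketch. It is not the code from the repository and does not call Kafka, Blur, or ZooKeeper: OffsetStore, Indexer, and InMemoryOffsetStore are hypothetical stand-ins for the ZooKeeper offset storage and the Blur enqueueMutate path, and each partition is modelled as a plain in-memory list.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PartitionIndexerSketch {

    /** Hypothetical stand-in for the ZooKeeper-backed offset storage. */
    interface OffsetStore {
        long load(int partition);                   // next offset to read, 0 if none saved
        void save(int partition, long nextOffset);  // record progress for recovery
    }

    /** Hypothetical stand-in for the Blur side (e.g. a call that enqueues a mutation). */
    interface Indexer {
        void index(int partition, long offset, String message);
    }

    /** In-memory offset store, used here only so the sketch runs on its own. */
    static class InMemoryOffsetStore implements OffsetStore {
        private final Map<Integer, Long> offsets = new ConcurrentHashMap<>();

        public long load(int partition) {
            return offsets.getOrDefault(partition, 0L);
        }

        public void save(int partition, long nextOffset) {
            offsets.put(partition, nextOffset);
        }
    }

    /** Spawns one worker per partition; each worker resumes from its saved offset. */
    static void run(List<List<String>> partitions, OffsetStore store, Indexer indexer)
            throws InterruptedException {
        if (partitions.isEmpty()) {
            return; // a fixed thread pool cannot be created with zero threads
        }
        ExecutorService pool = Executors.newFixedThreadPool(partitions.size());
        for (int p = 0; p < partitions.size(); p++) {
            final int partition = p;
            final List<String> log = partitions.get(p);
            pool.submit(() -> {
                for (long offset = store.load(partition); offset < log.size(); offset++) {
                    indexer.index(partition, offset, log.get((int) offset));
                    // Save only after indexing, so a crash replays at most one message.
                    store.save(partition, offset + 1);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b"), Arrays.asList("c"));
        List<String> indexed = Collections.synchronizedList(new ArrayList<>());
        OffsetStore store = new InMemoryOffsetStore();
        run(partitions, store, (p, o, m) -> indexed.add(m));
        run(partitions, store, (p, o, m) -> indexed.add(m)); // second run finds nothing new
        System.out.println(indexed.size() + " messages indexed");
    }
}
```

Saving the offset only after the message has been handed to the indexer gives at-least-once behaviour in this sketch: a failure between the two steps replays one message rather than skipping it.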

Is there any progress on the HDFS-backed persistent queue that you started
working on some time back?

Regards,
Dibyendu

On Fri, Mar 14, 2014 at 7:15 AM, Aaron McCurry <[EMAIL PROTECTED]> wrote: