I recently wrote a Kafka-Blur consumer for indexing real-time streams into a
Blur cluster, and have just pushed it to git.
This utility pulls messages from a Kafka cluster and indexes them into a
target Apache Blur cluster. The Kafka-Blur consumer detects the number of
partitions for a Kafka topic and spawns that many threads, so that the
partitions are indexed in parallel into the target Blur table. It uses the
Blur Thrift client's enqueueMutate call. It stores the latest offset of the
indexed messages in ZooKeeper, which lets it recover after a failure.
Let me know your views, and whether it would be possible to push this into
Blur contrib.
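
To make the threading and offset-checkpoint flow described above easier to picture, here is a minimal, self-contained Java sketch. It is not the code from the consumer itself: the Kafka partitions, the Blur table and the ZooKeeper offset store are all replaced by in-memory stand-ins, and every name in it (PartitionIndexerSketch, offsetStore, indexed, indexPartition, run) is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PartitionIndexerSketch {

    // Assumption: in-memory stand-in for the ZooKeeper offset store.
    // Maps partition id -> offset of the next message to index.
    static final Map<Integer, Long> offsetStore = new ConcurrentHashMap<>();

    // Assumption: in-memory stand-in for the target Blur table.
    static final List<String> indexed =
            Collections.synchronizedList(new ArrayList<>());

    // Index one partition, resuming from the last stored offset.
    static void indexPartition(int partition, List<String> messages) {
        long start = offsetStore.getOrDefault(partition, 0L);
        for (long offset = start; offset < messages.size(); offset++) {
            // A real consumer would hand the message to the Blur client here.
            indexed.add(partition + ":" + messages.get((int) offset));
            // Checkpoint only after the message has been handed off.
            offsetStore.put(partition, offset + 1);
        }
    }

    // Spawn one worker thread per partition and wait for all of them.
    static void run(Map<Integer, List<String>> topic) {
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, topic.size()));
        for (Map.Entry<Integer, List<String>> e : topic.entrySet()) {
            pool.submit(() -> indexPartition(e.getKey(), e.getValue()));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Hypothetical topic with two partitions.
        Map<Integer, List<String>> topic = new HashMap<>();
        topic.put(0, Arrays.asList("a", "b", "c"));
        topic.put(1, Arrays.asList("d", "e"));

        // Pretend an earlier run already indexed the first message of partition 0.
        offsetStore.put(0, 1L);

        run(topic);
        System.out.println("indexed messages: " + indexed.size());
        System.out.println("stored offsets: " + offsetStore);
    }
}
```

In this sketch the offset is written only after a message has been handed off, so a crash between the two steps would replay at most that one message on restart. Whether that trade-off matches the actual consumer is something to check against its source.
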
Is there any progress on the persistent (HDFS-backed) queue that you started
working on some time back?
On Fri, Mar 14, 2014 at 7:15 AM, Aaron McCurry <[EMAIL PROTECTED]> wrote: