Re: Kafka/Hadoop consumers and producers
Vadim,

The advantages of Camus over the contrib consumer are the following
(though I may be forgetting some):

   - The ability to fetch all/many topics in one job (MapReduce can
   otherwise introduce a lot of overhead for small topics).
   - Smarter load balancing of topic partitions across tasks.
   - Built-in error detection and logging.
   - Support for speculative execution.
   - Automatic and complete handling of incremental imports (the contribs
   need a bit of hand-holding).
   - Various configuration parameters for bucket sizes, etc.
   - Automatic discovery of new topics (if you use the external Avro schema
   repo).
   - Automatic reporting of metrics (if you use Kafka Audit).
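
To make the load-balancing point above a bit more concrete, here is a
minimal sketch of one way partitions could be spread across tasks. This is
not Camus's actual code; the function name, the backlog numbers, and the
greedy "largest backlog first" strategy are all assumptions chosen purely
for illustration.

```python
import heapq


def assign_partitions(pending, num_tasks):
    """Greedily assign partitions to tasks, largest backlog first.

    pending:   dict mapping (topic, partition_id) -> number of unread
               messages (hypothetical input, not a Camus data structure).
    num_tasks: number of map tasks to spread the work over.
    Returns a list with one list of partitions per task.
    """
    # Min-heap of (current load, task index): the least-loaded task
    # is always popped first.
    heap = [(0, task) for task in range(num_tasks)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_tasks)]

    # Visit partitions from largest to smallest backlog; ties are broken
    # by the partition key so the result is deterministic.
    ordered = sorted(pending.items(), key=lambda kv: (-kv[1], kv[0]))
    for partition, backlog in ordered:
        load, task = heapq.heappop(heap)
        assignment[task].append(partition)
        heapq.heappush(heap, (load + backlog, task))
    return assignment


# Illustrative backlog sizes (made up for this example).
pending = {
    ("clicks", 0): 900,
    ("clicks", 1): 850,
    ("logins", 0): 100,
    ("signups", 0): 50,
}
tasks = assign_partitions(pending, 2)
```

With these made-up numbers each of the two tasks ends up with one large
and one small partition, rather than one task taking both large ones. A
real job would also have to decide how to estimate the backlog per
partition, which this sketch simply takes as given.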

However, Camus is currently fairly tightly coupled to Avro, and to a lesser
extent to certain conventions within Avro schemas, whereas the contrib is
pretty much raw.

Hopefully, that answers your question (?)

Felix
On Wed, Jul 3, 2013 at 4:20 AM, Vadim Keylis <[EMAIL PROTECTED]> wrote:
 