The advantages of Camus compared to the contrib consumer are the following
(but perhaps I'm forgetting some):
- The ability to fetch all/many topics in one job (MapReduce can
otherwise introduce a lot of overhead for small topics).
- Smarter load balancing of topic partitions across tasks.
- Built-in error detection and logging.
- Support for speculative execution.
- Automatic and complete handling of incremental imports (the contrib
consumers need a bit of hand-holding).
- Various configuration parameters for bucket sizes, etc.
- Automatic discovery of new topics (if you use the external Avro schema
- Automatic reporting of metrics (if you use Kafka Audit).
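
To make the load-balancing point a bit more concrete, here is a minimal
sketch of one way topic partitions could be spread across tasks according
to how much data each has pending. This is not Camus's actual code, just a
generic greedy "largest first, least-loaded task" heuristic; the function
name, partition names, and sizes are all made up for illustration.

```python
import heapq


def assign_partitions(partition_sizes, num_tasks):
    """Greedy assignment: largest partition first, to the least-loaded task.

    partition_sizes: dict mapping a (hypothetical) partition name to the
                     estimated amount of pending data, in any consistent unit.
    num_tasks:       number of tasks to spread the partitions across.
    Returns a list of lists; element i holds the partitions for task i.
    """
    # Min-heap of (current load, task index), so the least-loaded task pops first.
    heap = [(0, task) for task in range(num_tasks)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_tasks)]

    # Place the biggest partitions first; ties are broken by name for determinism.
    ordered = sorted(partition_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, size in ordered:
        load, task = heapq.heappop(heap)
        assignment[task].append(name)
        heapq.heappush(heap, (load + size, task))
    return assignment


if __name__ == "__main__":
    # Illustrative sizes only: one large topic partition and three small ones.
    sizes = {"clicks-0": 100, "views-0": 60, "errors-0": 30, "audit-0": 10}
    print(assign_partitions(sizes, 2))
```

With the made-up sizes above, the large partition gets a task to itself and
the three small ones share the other, so both tasks end up with about the
same amount of work instead of one task per topic.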
However, Camus is currently fairly tightly coupled with Avro, and to a lesser
extent with certain conventions within Avro schemas, whereas the contrib
consumer is pretty much raw.
Hopefully, that answers your question (?)
On Wed, Jul 3, 2013 at 4:20 AM, Vadim Keylis <[EMAIL PROTECTED]> wrote: