On Jan 6, 2013, at 11:11pm, Guy Doulberg wrote:
Interesting - we build ETLs on top of Hadoop using Cascading (an open source workflow API), which has a lot of what it calls "Taps" for connecting to data sources and sinks.
But I haven't heard of a Kafka Tap. Should be possible to implement, though.
One issue is that Hadoop is batch-oriented, so there's a bit of an impedance mismatch when you've got a streaming data source, but in my experience it's possible to make that work.
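
One common way around that mismatch is to cut the stream into bounded chunks and hand each chunk to a batch job. A rough sketch of the idea in plain Python follows; it is not Cascading or Kafka code, and `micro_batches` and `max_records` are made-up names for illustration only:

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def micro_batches(stream: Iterable[T], max_records: int) -> Iterator[List[T]]:
    """Cut a (possibly unbounded) stream into bounded lists of records.

    Hypothetical helper: each yielded list stands in for one unit of
    work that a batch-oriented job could process on its own.
    """
    if max_records < 1:
        raise ValueError("max_records must be at least 1")
    it = iter(stream)
    while True:
        # Pull at most max_records items; an empty result means the
        # stream is exhausted, so stop.
        batch = list(islice(it, max_records))
        if not batch:
            return
        yield batch
```

In a real setup each batch would presumably be written out as a file and picked up by the next scheduled Hadoop run; a time-based cutoff in addition to the record count is a likely refinement that this sketch leaves out.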
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr