It is for when you need your data streams loaded incrementally into Hadoop
for offline batch processing and/or ad-hoc querying, since some things
cannot be computed in real time (or are too expensive to). So you have a
Hadoop job that runs periodically, consumes whatever the Kafka stream has
accumulated since the last run, potentially formats the data, and saves it
into HDFS.
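
To make the incremental part concrete, here is a minimal sketch of the idea
in Python. It is not the actual Kafka Hadoop consumer and uses no Kafka or
HDFS API: the stream is simulated as an in-memory list, HDFS as a local
directory, and the names run_batch, state_path and next_offset are made up
for illustration. Each run picks up from the offset saved by the previous
run, writes the new messages to a fresh file, and then records where it
stopped.

```python
import json
import os
import tempfile


def run_batch(log, state_path, out_dir):
    """One batch run: copy messages not yet loaded, then checkpoint the offset.

    `log` stands in for a Kafka partition, `out_dir` for an HDFS directory,
    and `state_path` for wherever the job keeps its last-read offset.
    All three are assumptions made for this sketch.
    """
    # Resume from the offset saved by the previous run (0 on the first run).
    start = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            start = json.load(f)["next_offset"]

    # Take a snapshot of the end of the stream; later messages wait for the next run.
    end = len(log)
    if end == start:
        return None  # nothing new since the last run

    out_path = os.path.join(out_dir, f"part-{start:08d}-{end:08d}.txt")
    with open(out_path, "w") as out:
        for msg in log[start:end]:
            out.write(msg.strip() + "\n")  # the "potentially formats the data" step

    # Save the checkpoint only after the output file has been written.
    with open(state_path, "w") as f:
        json.dump({"next_offset": end}, f)
    return out_path


if __name__ == "__main__":
    work_dir = tempfile.mkdtemp()
    state = os.path.join(work_dir, "offset.json")
    stream = ["click user=1", "click user=2"]
    print(run_batch(stream, state, work_dir))  # first run loads both messages
    stream.append("click user=3")
    print(run_batch(stream, state, work_dir))  # second run loads only the new one
```

So the job itself is not continuously running. It is scheduled (say, hourly),
and the saved offset is what lets each run continue where the last one ended.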
On 30 October 2012 23:28, Hussein Baghdadi <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Kafka comes with a support for Hadoop. I'm not sure what does this mean.
> Kafka is a publish-subscribe messaging system. What are some of the
> typical usage of Kafka-support for Hadoop producers and consumers?
>
> Well, producers are easy to digest. MapReduce job emitting data to Kafka.
> But what about Hadoop consumers?
>
> Hadoop is a batching system, not a continuous running system (as Storm
> or Dempsy). Say Kafka gets some data, what will happen?
>
> Thanks for help and time.