Kafka >> mail # user >> Re: Kafka/Hadoop consumers and producers


Oleg Ruchovets 2013-08-07, 19:44
Oleg Ruchovets 2013-08-09, 17:27
Re: Kafka/Hadoop consumers and producers
I think the answer is that there is currently no strong community-backed
solution to consume non-Avro data from Kafka to HDFS.

A lot of people do it, but I think most people adapted and expanded the
contrib code to fit their needs.

--
Felix
On Fri, Aug 9, 2013 at 1:27 PM, Oleg Ruchovets <[EMAIL PROTECTED]> wrote:

> Yes, I am definitely interested in such capabilities. We are also using
> Kafka 0.7.
> Guys, I already asked, but nobody answered: what is the community using to
> consume from Kafka to HDFS?
> My assumption was that if Camus supports only Avro, it will not be suitable
> for everyone, but people transfer data from Kafka to Hadoop somehow. So the
> question is: what are the alternatives to Camus for transferring messages
> from Kafka to HDFS?
> Thanks
> Oleg.
>
>
> On Fri, Aug 9, 2013 at 6:21 AM, Andrew Psaltis <[EMAIL PROTECTED]> wrote:
>
> > Felix,
> > The Camus route is the direction I have headed for a lot of the reasons
> > that you described. The only wrinkle is we are still on Kafka 0.7.3, so I
> > am in the process of backporting this patch:
> > https://github.com/linkedin/camus/commit/87917a2aea46da9d21c8f67129f6463af52f7aa8
> > that is described here:
> > https://groups.google.com/forum/#!topic/camus_etl/VcETxkYhzg8 -- so that
> > we can handle reading and writing non-avro'ized (if that is a word) data.
> >
> > I hope to have that done sometime in the morning and would be happy to
> > share it if others can benefit from it.
> >
> > Thanks,
> > Andrew
> >
> >
> > On Thursday, August 8, 2013 7:18:27 PM UTC-6, Felix GV wrote:
> >
> >> The contrib code is simple and probably wouldn't require too much work to
> >> fix, but it's a lot less robust than Camus, so you would ideally need to do
> >> some work to make it solid against all edge cases, failure scenarios and
> >> performance bottlenecks...
> >>
> >> I would definitely recommend investing in Camus instead, since it already
> >> covers a lot of the challenges I'm mentioning above, and also has more
> >> community support behind it at the moment (as far as I can tell, anyway),
> >> so it is more likely to keep getting improvements than the contrib code.
> >>
> >> --
> >> Felix
> >>
> >>
> >> On Thu, Aug 8, 2013 at 9:28 AM, <[EMAIL PROTECTED]> wrote:
> >>
> >>> We also have a need today to ETL from Kafka into Hadoop, and we do not
> >>> currently use Avro, nor do we have any plans to.
> >>>
> >>> So is the official direction based on this discussion to ditch the Kafka
> >>> contrib code and direct people to use Camus without Avro as Ken described,
> >>> or are both solutions going to survive?
> >>>
> >>> I can put time into the contrib code and/or work on documenting the
> >>> tutorial on how to make Camus work without Avro.
> >>>
> >>> Which is the preferred route, for the long term?
> >>>
> >>> Thanks,
> >>> Andrew
> >>>
> >>> On Wednesday, August 7, 2013 10:50:53 PM UTC-6, Ken Goodhope wrote:
> >>> > Hi Andrew,
> >>> >
> >>> >
> >>> >
> >>> > Camus can be made to work without Avro. You will need to implement a
> >>> > message decoder and a data writer. We need to add a better tutorial
> >>> > on how to do this, but it isn't that difficult. If you decide to go down
> >>> > this path, you can always ask questions on this list. I try to make sure
> >>> > each email gets answered. But it can take me a day or two.
> >>> >
> >>> >
> >>> >
> >>> > -Ken
> >>> >
> >>> >
> >>> >
> >>> > On Aug 7, 2013, at 9:33 AM, [EMAIL PROTECTED] wrote:
> >>> >
> >>> >
> >>> >
> >>> > > Hi all,
> >>> > >
> >>> > > Over at the Wikimedia Foundation, we're trying to figure out the
> >>> > > best way to do our ETL from Kafka into Hadoop. We don't currently use
> >>> > > Avro and I'm not sure if we are going to. I came across this post.
> >>> > >
> >>> > > If the plan is to remove the hadoop-consumer from Kafka contrib, do
> >>> > > you think we should not consider it as one of our viable options?
> >>> > >
> >>> > > Thanks!
> >>> >
> >>>
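
For readers following Ken's suggestion in the quoted thread (implement a message decoder and a data writer to use Camus without Avro), here is a rough, language-neutral sketch of what those two pieces have to do. It is not the Camus API: Camus is a Java project, and every name below (`DecodedRecord`, `decode_message`, `partition_path`, `record_to_line`) is hypothetical. It also assumes, purely for illustration, that messages are UTF-8 JSON carrying a `timestamp` field in epoch milliseconds.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class DecodedRecord:
    """Hypothetical container: the decoded payload plus its event time."""
    timestamp_ms: int
    payload: dict


def decode_message(raw: bytes) -> DecodedRecord:
    """Decoder step: turn raw Kafka message bytes into a record.

    Assumption (illustration only): the message is UTF-8 JSON with a
    'timestamp' field holding epoch milliseconds.
    """
    payload = json.loads(raw.decode("utf-8"))
    return DecodedRecord(timestamp_ms=int(payload["timestamp"]), payload=payload)


def partition_path(topic: str, record: DecodedRecord) -> str:
    """Choose an hourly output directory from the record's event time (UTC)."""
    ts = datetime.fromtimestamp(record.timestamp_ms / 1000, tz=timezone.utc)
    return f"{topic}/{ts:%Y/%m/%d/%H}"


def record_to_line(record: DecodedRecord) -> str:
    """Writer step: format one record as a line of plain-text JSON (no Avro)."""
    return json.dumps(record.payload, sort_keys=True) + "\n"
```

In a real job, the writer would append each line from `record_to_line` to a file under the directory returned by `partition_path`. Partitioning by the event time embedded in the message, rather than by arrival time, is one common choice; whether it suits a given pipeline depends on the data.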
 
Andrew Otto 2013-08-09, 18:49