Kafka, mail # dev - Kafka/Hadoop consumers and producers


Re: Kafka/Hadoop consumers and producers
Kam Kasravi 2013-08-13, 17:34
Thanks Andrew - I like the shell wrapper - very clean and simple. 
What installs all the kafka dependencies under /usr/share/java?
________________________________
 From: Andrew Otto <[EMAIL PROTECTED]>
To: Kam Kasravi <[EMAIL PROTECTED]>
Cc: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; Ken Goodhope <[EMAIL PROTECTED]>; Andrew Psaltis <[EMAIL PROTECTED]>; "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; Felix GV <[EMAIL PROTECTED]>; Cosmin Lehene <[EMAIL PROTECTED]>; "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Monday, August 12, 2013 7:00 PM
Subject: Re: Kafka/Hadoop consumers and producers
 

We've done a bit of work over at Wikimedia to debianize Kafka and make it behave like a regular service.

https://github.com/wikimedia/operations-debs-kafka/blob/debian/debian

Most relevant, Ken, is an init script for Kafka:
  https://github.com/wikimedia/operations-debs-kafka/blob/debian/debian/kafka.init

And a bin/kafka shell wrapper for the kafka/bin/*.sh scripts:
  https://github.com/wikimedia/operations-debs-kafka/blob/debian/debian/bin/kafka

I'm about to add an init script for MirrorMaker as well, so mirroring can be daemonized and run as a service.
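For a rough idea of what such an init script looks like, here is a minimal sysvinit skeleton (a hedged sketch, not the actual Wikimedia kafka.init; the daemon path, pidfile, and user are assumptions):

```shell
#!/bin/sh
### BEGIN INIT INFO
# Provides:          kafka
# Required-Start:    $remote_fs $network
# Required-Stop:     $remote_fs $network
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Apache Kafka broker
### END INIT INFO

# Hypothetical paths; a real packaging will differ.
DAEMON=/usr/sbin/kafka
PIDFILE=/var/run/kafka.pid
RUN_AS=kafka

case "$1" in
  start)
    start-stop-daemon --start --background --make-pidfile \
      --pidfile "$PIDFILE" --chuid "$RUN_AS" --exec "$DAEMON"
    ;;
  stop)
    start-stop-daemon --stop --pidfile "$PIDFILE" --retry 10
    ;;
  status)
    start-stop-daemon --status --pidfile "$PIDFILE"
    ;;
  *)
    echo "Usage: $0 {start|stop|status}" >&2
    exit 1
    ;;
esac
```

The same skeleton works for MirrorMaker by pointing DAEMON at the mirroring entry point and giving it its own pidfile.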
On Aug 12, 2013, at 8:16 PM, Kam Kasravi <[EMAIL PROTECTED]> wrote:

> I would like to do this refactoring since I did a high level consumer a while ago.
> A few weeks ago I opened KAFKA-949 (Kafka on Yarn), which I was also hoping to contribute.
> It's almost done. KAFKA-949 is paired with BIGTOP-989, which adds Kafka 0.8 to the Bigtop distribution.
> KAFKA-949 basically allows Kafka brokers to be started up as sysvinit services and would ease some of the
> startup/configuration issues that newbies have when getting started with Kafka. Ideally I would like to
> fold a number of the kafka/bin/* commands into the kafka service. Andrew, please let me know if you would
> like to pick this up instead. Thanks!
>
> Kam
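The sysvinit packaging described above would let operators manage a broker like any other system service, roughly along these lines (illustrative commands; the service name "kafka" and exact options depend on how KAFKA-949/BIGTOP-989 land):

```shell
# Manage a Kafka broker via sysvinit (hypothetical service name "kafka").
sudo service kafka start      # launch the broker
sudo service kafka status     # check whether it is running
sudo service kafka stop       # shut it down

# Debian-style: register the broker to start at boot
sudo update-rc.d kafka defaults
```

This is the "ease of getting started" point: a newbie runs one `service` command instead of locating and invoking the right kafka/bin script with the right config path.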
>
> From: Jay Kreps <[EMAIL PROTECTED]>
> To: Ken Goodhope <[EMAIL PROTECTED]>
> Cc: Andrew Psaltis <[EMAIL PROTECTED]>; [EMAIL PROTECTED]; "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; Felix GV <[EMAIL PROTECTED]>; Cosmin Lehene <[EMAIL PROTECTED]>; "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Sent: Saturday, August 10, 2013 3:30 PM
> Subject: Re: Kafka/Hadoop consumers and producers
>
> So guys, just to throw my 2 cents in:
>
> 1. We aren't deprecating anything. I just noticed that the Hadoop contrib
> package wasn't getting as much attention as it should.
>
> 2. Andrew or anyone--if there is anyone using the contrib package who would
> be willing to volunteer to kind of adopt it that would be great. I am happy
> to help in whatever way I can. The practical issue is that most of the
> committers are either using Camus or not using Hadoop at all so we just
> haven't been doing a good job of documenting, bug fixing, and supporting
> the contrib packages.
>
> 3. Ken, if you could document how to use Camus that would likely make it a
> lot more useful to people. I think most people would want a full-fledged
> ETL solution and would likely prefer Camus, but very few people are using
> Avro.
>
> -Jay
>
>
> On Fri, Aug 9, 2013 at 12:27 PM, Ken Goodhope <[EMAIL PROTECTED]> wrote:
>
> > I just checked and that patch is in the 0.8 branch.  Thanks for working on
> > backporting it, Andrew.  We'd be happy to commit that work to master.
> >
> > As for the kafka contrib project vs Camus, they are similar but not quite
> > identical.  Camus is intended to be a high throughput ETL for bulk
> > ingestion of Kafka data into HDFS, whereas what we have in contrib is
> > more of a simple KafkaInputFormat.  Neither can really replace the other.
> > If you had a complex hadoop workflow and wanted to introduce some Kafka