Kafka >> mail # dev >> Kafka/Hadoop consumers and producers

Re: Kafka/Hadoop consumers and producers
I would like to do this refactoring since I did a high-level consumer a while ago.
A few weeks ago I opened KAFKA-949 (Kafka on Yarn), which I was also hoping to contribute.
It's almost done. KAFKA-949 is paired with BIGTOP-989, which adds Kafka 0.8 to the Bigtop distribution.
KAFKA-949 basically allows Kafka brokers to be started up using sysvinit services and would ease some of the
startup/configuration issues that newbies have when getting started with Kafka. Ideally I would like to
fold a number of kafka/bin/* commands into the kafka service. Andrew, please let me know if you would like to
pick this up instead. Thanks!

From: Jay Kreps <[EMAIL PROTECTED]>
To: Ken Goodhope <[EMAIL PROTECTED]>
Sent: Saturday, August 10, 2013 3:30 PM
Subject: Re: Kafka/Hadoop consumers and producers

So guys, just to throw my 2 cents in:

1. We aren't deprecating anything. I just noticed that the Hadoop contrib
package wasn't getting as much attention as it should.

2. Andrew or anyone--if there is anyone using the contrib package who would
be willing to volunteer to kind of adopt it, that would be great. I am happy
to help in whatever way I can. The practical issue is that most of the
committers are either using Camus or not using Hadoop at all so we just
haven't been doing a good job of documenting, bug fixing, and supporting
the contrib packages.

3. Ken, if you could document how to use Camus that would likely make it a
lot more useful to people. I think most people would want a full-fledged
ETL solution and would likely prefer Camus, but very few people are using

On Fri, Aug 9, 2013 at 12:27 PM, Ken Goodhope <[EMAIL PROTECTED]> wrote:

> I just checked and that patch is in the 0.8 branch.  Thanks for working on
> backporting it, Andrew.  We'd be happy to commit that work to master.
>
> As for the kafka contrib project vs Camus, they are similar but not quite
> identical.  Camus is intended to be a high-throughput ETL for bulk
> ingestion of Kafka data into HDFS, whereas what we have in contrib is
> more of a simple KafkaInputFormat.  Neither can really replace the other.
> If you had a complex Hadoop workflow and wanted to introduce some Kafka
> data into that workflow, using Camus would be gigantic overkill and a
> pain to set up.  On the flip side, if what you want is frequent, reliable
> ingest of Kafka data into HDFS, a simple InputFormat doesn't provide you
> with that.
>
> I think it would be preferable to simplify the existing contrib
> Input/OutputFormats by refactoring them to use the more stable higher-level
> Kafka APIs.  Currently they use the lower-level APIs.  This should make
> them easier to maintain, and user-friendly enough to avoid the need for
> extensive documentation.
>
> Ken
>
> On Fri, Aug 9, 2013 at 8:52 AM, Andrew Psaltis <[EMAIL PROTECTED]> wrote:
>> Dibyendu,
>> According to the pull request (https://github.com/linkedin/camus/pull/15), it was merged into the camus-kafka-0.8
>> branch. I have not checked whether the code was subsequently removed; however,
>> at least one of the important files from this patch (camus-api/src/main/java/com/linkedin/camus/etl/RecordWriterProvider.java)
>> is still present.
>> Thanks,
>> Andrew
>> On Fri, Aug 9, 2013 at 9:39 AM, <[EMAIL PROTECTED]> wrote:
>>> Hi Ken,
>>> I am also working on making Camus fit for non-Avro messages for our
>>> requirement.
>>> I see you mentioned this patch (
>>> https://github.com/linkedin/camus/commit/87917a2aea46da9d21c8f67129f6463af52f7aa8)
>>> which supports a custom data writer for Camus. But this patch is not pulled