Kafka >> mail # dev >> Kafka/Hadoop consumers and producers


Jay Kreps 2013-07-03, 00:02
sudheer1414.m@... 2013-10-02, 14:26
Cosmin Lehene 2013-07-03, 09:57
Felix GV 2013-07-03, 17:57
Ken Goodhope 2013-07-04, 04:29
Jay Kreps 2013-07-03, 23:49
aotto@... 2013-08-07, 16:42
Ken 2013-08-08, 04:51
psaltis.andrew@... 2013-08-08, 14:11
Felix GV 2013-08-09, 01:19
Andrew Psaltis 2013-08-09, 03:22
dibyendu.bhattachary...@... 2013-08-10, 04:02
Andrew Psaltis 2013-08-09, 15:52
Ken Goodhope 2013-08-09, 19:27
Jay Kreps 2013-08-10, 22:30

Re: Kafka/Hadoop consumers and producers
I would like to do this refactoring, since I wrote a high-level consumer a while ago.
A few weeks ago I opened KAFKA-949 (Kafka on Yarn), which I was also hoping to contribute.
It's almost done. KAFKA-949 is paired with BIGTOP-989, which adds Kafka 0.8 to the Bigtop distribution.
KAFKA-949 basically allows Kafka brokers to be started using sysvinit services and would ease some of the
startup/configuration issues that newbies have when getting started with Kafka. Ideally I would like to
fold a number of the kafka/bin/* commands into the Kafka service. Andrew, please let me know if you would like to
pick this up instead. Thanks!

Kam
________________________________
 From: Jay Kreps <[EMAIL PROTECTED]>
To: Ken Goodhope <[EMAIL PROTECTED]>
Cc: Andrew Psaltis <[EMAIL PROTECTED]>; [EMAIL PROTECTED]; "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; Felix GV <[EMAIL PROTECTED]>; Cosmin Lehene <[EMAIL PROTECTED]>; "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Saturday, August 10, 2013 3:30 PM
Subject: Re: Kafka/Hadoop consumers and producers
 

So guys, just to throw my 2 cents in:

1. We aren't deprecating anything. I just noticed that the Hadoop contrib
package wasn't getting as much attention as it should.

2. Andrew or anyone--if there is anyone using the contrib package who would
be willing to volunteer to kind of adopt it that would be great. I am happy
to help in whatever way I can. The practical issue is that most of the
committers are either using Camus or not using Hadoop at all so we just
haven't been doing a good job of documenting, bug fixing, and supporting
the contrib packages.

3. Ken, if you could document how to use Camus that would likely make it a
lot more useful to people. I think most people would want a full-fledged
ETL solution and would likely prefer Camus, but very few people are using
Avro.

-Jay
On Fri, Aug 9, 2013 at 12:27 PM, Ken Goodhope <[EMAIL PROTECTED]> wrote:

> I just checked and that patch is in the 0.8 branch.  Thanks for working on
> backporting it, Andrew.  We'd be happy to commit that work to master.
>
> As for the Kafka contrib project vs Camus, they are similar but not quite
> identical.  Camus is intended to be a high-throughput ETL for bulk
> ingestion of Kafka data into HDFS, whereas what we have in contrib is
> more of a simple KafkaInputFormat.  Neither can really replace the other.
> If you had a complex Hadoop workflow and wanted to introduce some Kafka
> data into that workflow, using Camus would be gigantic overkill and a
> pain to set up.  On the flip side, if what you want is frequent, reliable
> ingest of Kafka data into HDFS, a simple InputFormat doesn't provide you
> with that.
>
> I think it would be preferable to simplify the existing contrib
> Input/OutputFormats by refactoring them to use the more stable higher level
> Kafka APIs.  Currently they use the lower level APIs.  This should make
> them easier to maintain, and user friendly enough to avoid the need for
> extensive documentation.
>
> Ken
>
>
> On Fri, Aug 9, 2013 at 8:52 AM, Andrew Psaltis <[EMAIL PROTECTED]> wrote:
>
>> Dibyendu,
>> According to the pull request (https://github.com/linkedin/camus/pull/15), it was merged into the camus-kafka-0.8
>> branch. I have not checked whether the code was subsequently removed; however,
>> at least one of the important files from this patch (camus-api/src/main/java/com/linkedin/camus/etl/RecordWriterProvider.java)
>> is still present.
>>
>> Thanks,
>> Andrew
>>
>>
>>  On Fri, Aug 9, 2013 at 9:39 AM, <[EMAIL PROTECTED]> wrote:
>>
>>>  Hi Ken,
>>>
>>> I am also working on making Camus fit for non-Avro messages for our
>>> requirement.
>>>
>>> I see you mentioned this patch (
>>> https://github.com/linkedin/camus/commit/87917a2aea46da9d21c8f67129f6463af52f7aa8)
>>> which supports a custom data writer for Camus. But this patch is not pulled
 
Andrew Otto 2013-08-13, 02:01
Kam Kasravi 2013-08-13, 17:34
Andrew Otto 2013-08-13, 20:03
Kam Kasravi 2013-08-13, 22:03
Andrew Otto 2013-08-13, 23:17
Andrew Psaltis 2013-08-13, 03:20
Kam Kasravi 2013-08-13, 17:27
Andrew Otto 2013-08-13, 14:47
Abhi Basu 2013-11-22, 17:33