Kafka >> mail # user >> ETL with Kafka


Guy Doulberg 2013-01-06, 07:49
David Arthur 2013-01-06, 22:29
Russell Jurney 2013-01-07, 07:00
Guy Doulberg 2013-01-07, 07:12
Ken Krugler 2013-01-07, 17:57
Re: ETL with Kafka
Just to be clear - a Kafka 'Tap' of sorts exists in contrib: it scans
Hadoop records, which may be ETL'd first, and emits new Kafka events.
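
As a rough illustration of that flow (scan records, optionally transform them, emit one event per record), here is a minimal Python sketch. It is not the contrib code: `InMemoryProducer`, `emit_records`, the `(key, value)` record shape and the tab-separated payload are all assumptions made up for this example, and a real job would hand the payload to an actual Kafka producer instead.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Record = Tuple[str, str]  # (key, value); a stand-in for a Hadoop record


class InMemoryProducer:
    """Hypothetical stand-in for a Kafka producer; it only collects messages."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, bytes]] = []

    def send(self, topic: str, payload: bytes) -> None:
        self.sent.append((topic, payload))


def emit_records(
    records: Iterable[Record],
    producer: InMemoryProducer,
    topic: str,
    transform: Optional[Callable[[Record], Optional[Record]]] = None,
) -> int:
    """Scan records, apply an optional ETL step, and emit one event per record.

    A transform may return None to drop a record. Returns the number of
    events emitted.
    """
    count = 0
    for record in records:
        if transform is not None:
            record = transform(record)
            if record is None:
                continue  # the ETL step filtered this record out
        key, value = record
        # Assumed wire format: key and value joined by a tab, UTF-8 encoded.
        producer.send(topic, f"{key}\t{value}".encode("utf-8"))
        count += 1
    return count
```

Calling `emit_records(records, producer, "events")` with no transform emits every record unchanged; passing a transform is where any ETL done before emitting would plug in.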
On Mon, Jan 7, 2013 at 9:57 AM, Ken Krugler <[EMAIL PROTECTED]> wrote:

> Hi Guy,
>
> On Jan 6, 2013, at 11:11pm, Guy Doulberg wrote:
>
> > Hi,
> > Thanks David,
> >
> > I am looking for a product (open source or not), something like Talend
> > or Pentaho, in which I can design the ETL (from and to Kafka) and run
> > the ETL in Storm/IronCount, or maybe even in Hadoop
> > Map/Reduce.
>
> Interesting - we build ETLs on top of Hadoop using Cascading (open source
> workflow API), which has a lot of what it calls "Taps" for connecting to
> data sources and sinks.
>
> But I haven't heard of a Kafka Tap. Should be possible to implement,
> though.
>
> One issue is that Hadoop is batch oriented, so there's a bit of an
> impedance mismatch when you've got a streaming data source, but from
> experience it's possible to get that to work.
>
> -- Ken
>
> > The product should be complete and support many connections to many
> > data sources and targets. In that sense, if you know of a connection to
> > Talend or Pentaho, that would be great.
> >
> > Thanks again.
> >
> >
> > On 01/07/2013 12:28 AM, David Arthur wrote:
> >> Storm has support for Kafka, if that's the sort of thing you're looking
> >> for. Maybe you could describe your use case a bit more?
> >>
> >> On Sunday, January 6, 2013, Guy Doulberg wrote:
> >>
> >>> Hi
> >>>
> >>> I am looking for an ETL tool that can connect to Kafka, as a consumer
> >>> and as a producer.
> >>>
> >>> Have you heard of such a tool?
> >>>
> >>> Thanks
> >>> Guy
> >>>
> >>>
> >
>
> --------------------------
> Ken Krugler
> +1 530-210-6378
> http://www.scaleunlimited.com
> custom big data solutions & training
> Hadoop, Cascading, Cassandra & Solr
>
>
>
>
>
>
--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
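
On the batch versus streaming mismatch raised in the quoted message: one possible way to bridge it (not necessarily what was done in the experience described there) is to cut the stream into bounded batches and run one batch job per chunk. The sketch below shows only the chunking step; `micro_batches` and `batch_size` are hypothetical names for this example, and a real setup would more likely cut batches by time window or by consumer offset.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def micro_batches(stream: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive bounded batches from a possibly endless stream."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(stream)
    while True:
        # Pull at most batch_size items; an empty list means the stream ended.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Each yielded list could then be written out as the input of a single batch run, which keeps the batch side unaware that its source never ends.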

 
Ken Krugler 2013-01-07, 21:51
Russell Jurney 2013-01-07, 22:06
Ken Krugler 2013-01-07, 22:21