Kafka >> mail # user >> S3 Consumer for Super Duper Blog Post!


Russell Jurney 2012-08-18, 04:49
Matthew Rathbone 2012-08-18, 16:20
Re: S3 Consumer for Super Duper Blog Post!
Thanks for your response, and glad to hear you need this as well and are
working on it.

Does using an s3n:// file path require that you have a Hadoop cluster running?
I use S3 and EMR, so my Hadoop clusters are temporary.  I do use Hadoop
with S3 to consume the data Kafka produces, so I am fine with Hadoop as a
library-level dependency, but not if a cluster must persist for the Kafka
S3 consumer to work.
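
To make concrete what I mean by a consumer that needs no persistent cluster, here is a rough sketch of the batching half of such a sink. Everything in it is an assumption for illustration only: the S3BatchSink class, the key layout, and the upload hook are hypothetical names, and no real Kafka or S3 client is called. The upload callable stands in for whatever S3 client would actually write the object.

```python
from typing import Callable, List, Tuple


class S3BatchSink:
    """Buffers (offset, payload) pairs for one topic partition and flushes
    them as a single object once the buffer reaches max_messages."""

    def __init__(self, topic: str, partition: int, max_messages: int,
                 upload: Callable[[str, bytes], None]) -> None:
        self.topic = topic
        self.partition = partition
        self.max_messages = max_messages
        self.upload = upload  # hypothetical hook: (object key, body) -> None
        self.buffer: List[Tuple[int, bytes]] = []

    def append(self, offset: int, payload: bytes) -> None:
        # Collect the message and flush automatically when the batch is full.
        self.buffer.append((offset, payload))
        if len(self.buffer) >= self.max_messages:
            self.flush()

    def flush(self) -> None:
        # Nothing to write if no messages arrived since the last flush.
        if not self.buffer:
            return
        first, last = self.buffer[0][0], self.buffer[-1][0]
        # Assumed key layout: topic/partition/firstOffset-lastOffset.log
        key = f"{self.topic}/{self.partition}/{first:020d}-{last:020d}.log"
        body = b"\n".join(p for _, p in self.buffer) + b"\n"
        self.upload(key, body)
        self.buffer = []


# Usage: a dict stands in for the bucket so the sketch runs without S3.
uploaded = {}
sink = S3BatchSink("events", 0, 2, uploaded.__setitem__)
for offset, payload in enumerate([b"a", b"b", b"c"]):
    sink.append(offset, payload)
sink.flush()  # write the final partial batch
```

Putting the first and last offsets in the object name is one way to let a restarted consumer work out where to resume by listing the bucket. That is a design guess on my part, not something Kafka provides.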

On Sat, Aug 18, 2012 at 9:20 AM, Matthew Rathbone <[EMAIL PROTECTED]> wrote:

> Hey Russell,
>
> We're actually about to start work on this exact thing here at foursquare
> as we're about to start prototyping kafka to replace our aging log
> infrastructure.
>
> We'd planned on just using the hadoop-consumer, but setting the output
> directory to an s3n:// file path.
>
> I'm assuming that you want to build a consumer that operates outside of
> hadoop?
>
>
>
> On Sat, Aug 18, 2012 at 12:49 AM, Russell Jurney
> <[EMAIL PROTECTED]> wrote:
>
> > Ok, this is the last time I'm gonna beg for an S3 sink for Kafka. I'm
> > not trolling, and this is Your Big Chance to help!
> >
> > I'm gonna blog about using Whirr to boot Zookeeper and then to boot
> > Kafka in the cloud and then create events in an application that get
> > sunk to Amazon S3, where they will be processed by
> > Pig/Hadoop/ElasticMapReduce, mined into gems and republished in some
> > esoteric NoSQL DB and then served in the very app that generated the
> > events in the first place.
> >
> > So, if someone else doesn't contribute an S3 consumer for Kafka in the
> > next month or so... so help me Bob, I'm gonna write it myself. Now,
> > some of you may not know me, but I am the 3rd best software engineer
> > in the world:
> > http://www.quora.com/Who-are-some-of-the-best-software-engineers-alive
> >
> > Those of you that have seen my code, however, are aware that as a
> > programmer, I am substandard. There's a gene that imparts exception
> > handling and algorithms, and it's missing from my genome.
> >
> > So let me be clear: you don't want me to write the S3 sink. A Kafka
> > committer or someone with a real job should write the S3 sink. As soon
> > as that thing is written and my blog post goes out, Kafka use will
> > spike and you'll all be famous.
> >
> > So this is a direct threat: I am writing an S3 consumer for Kafka
> > unless one of you steps up. And you will rue the day that piece of
> > crap ships.
> >
> > In return for your contribution, you will be named in my blog post as
> > open source citizen of the month, to be accompanied by a commemorative
> > plaque with a pixelated photo of me.
> >
> > Yours truly,
> >
> > Russell Jurney http://datasyndrome.com
> >
>
>
>
> --
> Matthew Rathbone
> Foursquare | Software Engineer | Server Engineering Team
> [EMAIL PROTECTED] | @rathboma <http://twitter.com/rathboma> |
> 4sq<http://foursquare.com/rathboma>
>

--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
Russell Jurney 2012-08-19, 14:10
Matthew Rathbone 2012-08-20, 16:10
Niek Sanders 2012-08-24, 02:46
Parviz deyhim 2012-10-05, 03:43