Kafka, mail # user - Kafka events to S3?


Re: Kafka events to S3?
S Ahmed 2012-05-23, 18:19
>Kafka handles
>scaling the consumption while making sure each consumer gets a subset of
>data.
Is there a writeup on the algorithm used to do that? Sounds interesting :)

Agreed, this sounds like more of a contrib.
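To make the "each consumer gets a subset" idea concrete, here is a minimal sketch of one way partitions could be divided among the consumers in a group: sort both lists and hand each consumer a contiguous range. This is an illustration only, not necessarily the algorithm Kafka uses; the function name `assign_partitions` and the consumer ids are made up for the example.

```python
def assign_partitions(partitions, consumers):
    """Split sorted partitions into contiguous ranges, one per sorted consumer.

    Hypothetical helper for illustration; not part of any Kafka API.
    """
    parts = sorted(partitions)
    members = sorted(consumers)
    if not members:
        return {}
    # Every consumer gets `base` partitions; the first `extra` get one more.
    base, extra = divmod(len(parts), len(members))
    assignment = {}
    start = 0
    for i, member in enumerate(members):
        count = base + (1 if i < extra else 0)
        assignment[member] = parts[start:start + count]
        start += count
    return assignment


if __name__ == "__main__":
    print(assign_partitions(range(5), ["consumer-b", "consumer-a"]))
```

Because every consumer sorts the same two lists, each one can compute the same assignment independently, and no partition ends up with two owners.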

On Wed, May 23, 2012 at 1:49 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:

> Basically it would just be a consumer that wrote to S3. Kafka handles
> scaling the consumption while making sure each consumer gets a subset of
> data. Probably we could make some command line tool. You would need some
> way to let the user control the format of the S3 data in a pluggable
> fashion. It could be a contrib package, or even just a separate github
> mini-project since it just works off the public api and would really just
> be used by people who want to get stuff into S3.
>
> -Jay
>
> On Wed, May 23, 2012 at 8:21 AM, S Ahmed <[EMAIL PROTECTED]> wrote:
>
> > What would be needed to do this?
> >
> > Just thinking off the top of my head:
> >
> > 1. create a zookeeper store to keep track of the last message offset
> > persisted to s3, and which messages each consumer is processing.
> >
> > 2. pull messages off and group in whatever grouping you want (per message,
> > 10 messages, etc.), and spin off an ExecutorService to push to s3, update
> > the zookeeper offset.
> >
> > I'm new to kafka, but I would have to investigate how multiple consumers
> > can pull messages and push to s3 without pulling the same messages.
> > I would also need to set up a zookeeper store to track progress
> > specifically for what has been pushed to s3.
> >
> >
> > On Wed, May 23, 2012 at 1:35 AM, Russell Jurney <[EMAIL PROTECTED]> wrote:
> >
> > > Yeah, no kidding. I keep waiting on one :)
> > >
> > > Russell Jurney http://datasyndrome.com
> > >
> > > On May 22, 2012, at 10:31 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
> > >
> > > > No. Patches accepted.
> > > >
> > > > -Jay
> > > >
> > > > > On Tue, May 22, 2012 at 10:23 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
> > > >
> > > >> Is there a simple way to dump Kafka events to S3 yet?
> > > >>
> > > >> Russell Jurney http://datasyndrome.com
> > > >>
> > >
> >
>
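The approach discussed in this thread (group messages into batches, write each batch to S3 in a user-controlled format, then record the last persisted offset) might be sketched roughly as below. Everything here is an assumption made for illustration: `BatchingSink`, `newline_delimited`, and the `events/<first>-<last>` key layout are invented names, `upload` stands in for an S3 put, and `commit_offset` stands in for a ZooKeeper write. No real Kafka, S3, or ZooKeeper client is used, and the upload is synchronous rather than handed to an executor.

```python
from typing import Callable, List, Tuple

Message = Tuple[int, bytes]  # (offset, payload)


def newline_delimited(payloads: List[bytes]) -> bytes:
    """Example pluggable formatter: one payload per line."""
    return b"\n".join(payloads) + b"\n"


class BatchingSink:
    """Buffers messages and flushes them as one object per batch."""

    def __init__(
        self,
        upload: Callable[[str, bytes], None],   # stand-in for an S3 put
        commit_offset: Callable[[int], None],   # stand-in for a ZooKeeper write
        formatter: Callable[[List[bytes]], bytes] = newline_delimited,
        batch_size: int = 10,
    ) -> None:
        self.upload = upload
        self.commit_offset = commit_offset
        self.formatter = formatter
        self.batch_size = batch_size
        self.pending: List[Message] = []

    def add(self, offset: int, payload: bytes) -> None:
        self.pending.append((offset, payload))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if not self.pending:
            return
        first, last = self.pending[0][0], self.pending[-1][0]
        body = self.formatter([payload for _, payload in self.pending])
        self.upload(f"events/{first}-{last}", body)
        # Commit only after the upload returns, so a crash in between
        # leads to a repeated upload rather than lost messages.
        self.commit_offset(last)
        self.pending = []


if __name__ == "__main__":
    store, committed = {}, []
    sink = BatchingSink(store.__setitem__, committed.append, batch_size=2)
    for offset, payload in enumerate([b"a", b"b", b"c"]):
        sink.add(offset, payload)
    sink.flush()  # push the final partial batch
    print(sorted(store), committed)
```

Committing the offset after the upload gives at-least-once delivery: a restart may rewrite the last batch under the same key, but it will not skip messages.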