-
Kafka in AWS? Gautam Singaraju 2012-03-20, 16:07
We have been considering Kafka for a new Data Platform. Has anyone
used Kafka in AWS? If so, could you please share your experiences with us?

Thank you!
---
Gautam
-
Re: Kafka in AWS? Elben Shira 2012-03-20, 16:10
There are some threads in the mailing list archives:

http://mail-archives.apache.org/mod_mbox/incubator-kafka-users/201202.mbox/%[EMAIL PROTECTED]%3E
http://mail-archives.apache.org/mod_mbox/incubator-kafka-users/201203.mbox/%3CCAFHvO5s2wHSiegUWGPsfV-eN45SK%3Djfh8ObsU1gHZczNdGg-gg%40mail.gmail.com%3E

Elben
-
Re: Kafka in AWS? Evan Chan 2012-03-20, 16:15
We deploy Kafka in AWS. The only negative I've found so far is that native
JMX is difficult or impossible to use with AWS, because it opens up secondary ports that you don't have control over. However, I've heard there are alternative JMX implementations that allow HTTP and other protocols, which may be more AWS-friendly.

--
Evan Chan
Senior Software Engineer | www.ooyala.com
-
Re: Kafka in AWS? Jun Rao 2012-03-20, 16:24
Evan,

We do have mx4j support. See kafka-78.

Thanks,
Jun
-
Re: Kafka in AWS? Patricio Echagüe 2012-03-20, 16:41
We are able to connect to JMX via console by creating a simple SSH tunnel.

In addition, jmxtrans installed on the same machine sends JMX info from Kafka to Ganglia.
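
A minimal sketch of one way such a tunnel can be set up. The thread does not say which kind of tunnel was used; this sketch assumes a SOCKS proxy opened with `ssh -D`, which sidesteps the secondary-port problem because every connection the JMX client opens goes through the proxy. The function name, host names, and port numbers are hypothetical, and the code only builds the command lines rather than running them.

```python
def jmx_tunnel_commands(ssh_target, broker_host, jmx_port, socks_port=9696):
    """Build argument lists for a SOCKS-proxied JMX session.

    ssh_target  -- user@host of a machine that can reach the broker (assumption)
    broker_host -- broker host name as seen from ssh_target (assumption)
    jmx_port    -- port the broker's JMX registry listens on (assumption)
    """
    # -N: run no remote command; -D: open a local SOCKS proxy on socks_port.
    ssh_cmd = ["ssh", "-N", "-D", str(socks_port), ssh_target]

    # Standard JMX-over-RMI service URL for the broker.
    jmx_url = "service:jmx:rmi:///jndi/rmi://%s:%d/jmxrmi" % (broker_host, jmx_port)

    # -J passes the option to the JVM running jconsole; socksProxyHost and
    # socksProxyPort are the standard Java networking properties.
    jconsole_cmd = [
        "jconsole",
        "-J-DsocksProxyHost=localhost",
        "-J-DsocksProxyPort=%d" % socks_port,
        jmx_url,
    ]
    return ssh_cmd, jconsole_cmd


if __name__ == "__main__":
    ssh_cmd, jconsole_cmd = jmx_tunnel_commands("ec2-user@bastion.example.com",
                                                "kafka-1.internal", 9999)
    print(" ".join(ssh_cmd))
    print(" ".join(jconsole_cmd))
```

Run the first command in one terminal and the second in another. A plain `ssh -L` forward of a single port can also work, but only if the broker's RMI port is pinned to a known value.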
-
Re: Kafka in AWS? Gautam Singaraju 2012-03-20, 17:10
Thank you all! We will let you know of our experiences soon.
---
Gautam
-
Re: Kafka in AWS? Dave Fayram 2012-03-20, 18:19
We've been successfully using Kafka on AWS as well, and JMX-wise we
just use an SSH tunnel.

In general, we've been very happy with the performance on AWS, which some people have reservations about due to the I/O situation on most Amazon boxes.

--
Dave Fayram
-
Re: Kafka in AWS? Russell Jurney 2012-03-20, 19:56
I wish someone would publish some source that writes events to S3.

Russell Jurney
twitter.com/rjurney
datasyndrome.com
-
Re: Kafka in AWS? Niek Sanders 2012-03-20, 22:04
Russell,

I'm actually in the process of writing some Java code to go from Kafka messages to S3. I might be able to rip out my application-specific parts and share something later tonight.

The biggest hassle is that you can't append to existing S3 files. So unless you're planning on uploading each message as a separate S3 object, you need message aggregation smarts on the Kafka consumer / S3 uploader side of things.

Best,
Niek
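
A rough sketch of the aggregation idea described above, written in Python rather than Java and not taken from anyone's actual code. It buffers message payloads in memory and hands one combined blob to an upload callback once a size threshold is crossed, so each S3 object is written exactly once. The class name, the `upload` callback, and the sequence numbering are all hypothetical; a real uploader would call an S3 client inside the callback and track Kafka offsets alongside.

```python
class BatchingUploader:
    """Accumulate payloads and upload them as one object per batch."""

    def __init__(self, upload, max_bytes):
        self._upload = upload        # callable(seq, body); hypothetical hook
        self._max_bytes = max_bytes  # flush once the buffer reaches this size
        self._chunks = []
        self._size = 0
        self._seq = 0                # sequence number of the next object

    def add(self, payload):
        """Buffer one message payload (bytes); flush when the batch is full."""
        self._chunks.append(payload)
        self._size += len(payload)
        if self._size >= self._max_bytes:
            self.flush()

    def flush(self):
        """Upload the current batch as a single object, if there is one."""
        if not self._chunks:
            return
        body = b"".join(self._chunks)
        # The buffer is cleared only after the upload returns, so a failed
        # upload (an exception) leaves the batch in place for a retry.
        self._upload(self._seq, body)
        self._seq += 1
        self._chunks = []
        self._size = 0


if __name__ == "__main__":
    uploaded = []
    batcher = BatchingUploader(lambda seq, body: uploaded.append((seq, body)),
                               max_bytes=10)
    for msg in (b"aaaa", b"bbbb", b"cccc", b"dd"):
        batcher.add(msg)
    batcher.flush()  # push out the final partial batch
    print(uploaded)
```

Calling `flush()` on shutdown or on a timer keeps a slow topic from holding data in memory indefinitely.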
-
Re: Kafka in AWS? Russell Jurney 2012-03-20, 22:23
Yeah, that is the part I am hoping someone will contribute :) I know I can
write that myself. I also know it will be buggy and that I will have lots of trouble.

If you contribute this code, it would be a huge boon to Kafka. It is imo the primary use case for Kafka atm... if only the code gets into git.

--
Russell Jurney
twitter.com/rjurney
datasyndrome.com
-
Re: Kafka in AWS? Felix GV 2012-03-20, 22:32
The primary use case for Kafka is to use it on AWS...???

Sorry if I put words you didn't intend in your mouth :P ... I just thought that sounded funny ;)

Sorry for being off-topic. Carry on :/ !

--
Felix
-
Re: Kafka in AWS? Russell Jurney 2012-03-20, 23:51
I think as soon as someone commits code that reliably sinks events to S3,
Kafka adoption will skyrocket. There is no good solution to this yet. MANY people want one.

Russ
-
Re: Kafka in AWS? Neha Narkhede 2012-03-21, 05:03
Russell,

By "sink events into S3", do you mean you want some plugin that will pull data out of your Kafka brokers and upload it to S3? Would you mind describing the use cases that require sending data to Kafka, then uploading it to S3, and then querying it in S3?

Thanks,
Neha
-
Re: Kafka in AWS? Russell Jurney 2012-03-21, 05:07
I want events in S3 to process them in Hadoop. I'd like to emit them in my app, and have them magically show up in 64MB chunks on S3. Like most everyone else.

Russell Jurney http://datasyndrome.com
-
Re: Kafka in AWS? Vaibhav Puranik 2012-03-21, 05:21
Neha,

My requirement is not related to Russell's, but I thought it would be helpful to describe what we need at GumGum <http://gumgum.com/>. I wasn't sure whether this is in Kafka's domain, since Kafka gives you a topic to pull data from and then it's up to you to do whatever you want with it.

But since we are talking about it, here is what we do every day (currently without Kafka):

We are an ad network. We write all of our impression and click data to various log files and upload them to S3. At night we run many MapReduce jobs to aggregate this data in various ways. We have an 'Autoscaled' cluster in AWS. Our webservers keep going up and down based on the load on the system.

Whenever a server shuts down we tend to lose data. Many times the file upload is not completed before the server shuts down. That is why we are looking at implementing Kafka to send events in real time to S3 without losing them.

If there were a 'sink' that transfers data to S3, our job would be a lot easier. But again, I am not sure whether Kafka is supposed to provide that or not.

Regards,
Vaibhav
-
Re: Kafka in AWS? Neha Narkhede 2012-03-21, 15:09
Vaibhav,

Thanks for explaining your use case. I think I see the requirement here. It seems like you need the data in S3 since you use Elastic MapReduce to process your data. I guess that's the reason the Hadoop input/output formats that Kafka provides are not directly useful.

I have some ideas on how this can be done. Will write them up on a wiki soon.

Thanks,
Neha
-
Re: Kafka in AWS?Niek Sanders 2012-03-21, 18:01
So what would you like the S3 files to actually look like?
One Kafka message body per line? Should the message topic be tossed in there too?

A tricky aspect is that the Kafka message body is an opaque byte array. For my own case I'm using JSON for the payload so it makes my requirements simpler.

- Niek

On Tue, Mar 20, 2012 at 10:07 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
> I want events in S3 to process them in Hadoop. I'd like to emit them in my app, and have them magically show up in 64MB chunks on S3. Like most everyone else.
>
> Russell Jurney http://datasyndrome.com
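As an illustration of the aggregation problem discussed above (S3 objects cannot be appended to, and message bodies are opaque byte arrays), here is a minimal sketch in Python. It is not code from anyone in this thread. The names ChunkBuffer and read_records are hypothetical, the flush callback stands in for whatever S3 upload call is actually used, and the 4-byte length prefix is an assumed framing chosen because newline-delimited records are unsafe for binary payloads.

```python
import struct
from typing import Callable, Iterator, List


class ChunkBuffer:
    """Hypothetical helper: collects opaque message bodies into large chunks."""

    def __init__(self, flush: Callable[[bytes], None],
                 max_bytes: int = 64 * 1024 * 1024) -> None:
        self._flush = flush          # assumed to upload one chunk as one S3 object
        self._max_bytes = max_bytes  # target chunk size, 64MB by default
        self._parts: List[bytes] = []
        self._size = 0

    def add(self, payload: bytes) -> None:
        # Frame each body with a 4-byte big-endian length so that binary
        # payloads (which may contain newlines) can be split apart again.
        record = struct.pack(">I", len(payload)) + payload
        # Flush first if this record would push the chunk over the limit.
        # A single record larger than the limit still gets its own chunk.
        if self._parts and self._size + len(record) > self._max_bytes:
            self.close()
        self._parts.append(record)
        self._size += len(record)

    def close(self) -> None:
        # Hand the buffered records to the flush callback as one chunk.
        if self._parts:
            self._flush(b"".join(self._parts))
            self._parts = []
            self._size = 0


def read_records(chunk: bytes) -> Iterator[bytes]:
    """Yield the original message bodies from a length-prefixed chunk."""
    offset = 0
    while offset < len(chunk):
        (length,) = struct.unpack_from(">I", chunk, offset)
        offset += 4
        yield chunk[offset:offset + length]
        offset += length
```

A consumer loop would call add() for each message body and close() on shutdown or at a time boundary. If every payload is known to be single-line text such as JSON, one body per line would work as well and is easier to read from Hadoop.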
-
Re: Kafka in AWS?Vaibhav Puranik 2012-03-21, 18:59
I would use the payload. I want the message to be exactly as it is. We want to name the files as per topic. (That's how we differentiate right now.)

Regards,
Vaibhav
-
Re: Kafka in AWS?Russell Jurney 2012-03-21, 19:46
I'm going to use thrift, avro or protobuf for serialization.
Russell Jurney http://datasyndrome.com
-
Re: Kafka in AWS?Tim Lossen 2012-03-21, 19:50
another good option would be messagepack -- flexible & schemaless like json, but binary.
-
Re: Kafka in AWS?Russell Jurney 2012-03-21, 19:52
I don't want avro to have an opinion on the serialization format.
Russell Jurney http://datasyndrome.com
-
Re: Kafka in AWS?Russell Jurney 2012-03-21, 20:33
I want the S3 files to be organized by type and date. Folders for types, subfolders for date down to the hour: year/month/day/hour. All payloads of a given type get written together.
It would be ideal if there was no integration with the end format, but in practice I'm not sure if all the serialization protocols mentioned can be written in this way.

Russell Jurney http://datasyndrome.com
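The layout described above (a folder per type, then year/month/day/hour) can be sketched as a small key-building function. This is only an illustration: the function name s3_key, the use of UTC, and the part-NNNNN object suffix are assumptions, not something specified in this thread.

```python
from datetime import datetime, timezone


def s3_key(topic: str, timestamp: datetime, sequence: int) -> str:
    """Build an object key of the form <type>/<year>/<month>/<day>/<hour>/part-NNNNN.

    `timestamp` is expected to be timezone-aware; it is normalized to UTC so
    that every producer agrees on which hourly folder an event belongs to.
    `sequence` is a hypothetical per-hour counter that keeps chunk names unique.
    """
    ts = timestamp.astimezone(timezone.utc)
    return f"{topic}/{ts:%Y/%m/%d/%H}/part-{sequence:05d}"


# Example: a chunk of "clicks" events written during the 20:00 UTC hour.
example = s3_key("clicks", datetime(2012, 3, 21, 20, 33, tzinfo=timezone.utc), 7)
```

Zero-padded date components keep keys in chronological order under a plain lexicographic listing, which makes it simple to select an hour, a day, or a month of input by key prefix.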
-
Re: Kafka in AWS?Vaibhav Puranik 2012-03-21, 20:37
We also have s3 files organized by date in the following fashion.
yyyy/MM/dd/hh

Our messages are in JSON.

Regards,
Vaibhav
-
Re: Kafka in AWS?Russell Jurney 2012-03-21, 20:44
You have code that puts records in bigger blocks on s3? Plz to share? :)
Russell Jurney http://datasyndrome.com
-
Re: Kafka in AWS?Vaibhav Puranik 2012-03-22, 05:01
Let me ask my boss what I can share. Let's talk off the mailing list.
Regards,
Vaibhav
-
Re: Kafka in AWS?Russell Jurney 2012-03-23, 02:15
bump
--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com