Kafka >> mail # user >> Kafka in AWS?


Re: Kafka in AWS?
bump

On Wed, Mar 21, 2012 at 10:01 PM, Vaibhav Puranik <[EMAIL PROTECTED]> wrote:

> Let me ask my boss what I can share. Let's talk off the mailing list.
>
> Regards,
> Vaibhav
>
> On Wed, Mar 21, 2012 at 1:44 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
>
> > You have code that puts records in bigger blocks on s3? Plz to share? :)
> >
> > Russell Jurney http://datasyndrome.com
> >
> > On Mar 21, 2012, at 1:37 PM, Vaibhav Puranik <[EMAIL PROTECTED]> wrote:
> >
> > > We also have s3 files organized by date in the following fashion.
> > >
> > > yyyy/MM/dd/hh
> > >
> > > Our messages are in JSON.
> > >
> > > Regards,
> > > Vaibhav
> > >
> > > On Wed, Mar 21, 2012 at 1:33 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
> > >
> > >> I want the S3 files to be organized by type and date. Folders for types,
> > >> subfolders for date down to the hour: year/month/day/hour. All payloads of
> > >> a given type get written together.
> > >>
> > >> It would be ideal if there was no integration with the end format, but in
> > >> practice I'm not sure if all the serialization protocols mentioned can be
> > >> written in this way.
> > >>
> > >> Russell Jurney http://datasyndrome.com
> > >>
> > >> On Mar 21, 2012, at 12:50 PM, Tim Lossen <[EMAIL PROTECTED]> wrote:
> > >>
> > >>> another good option would be messagepack -- flexible & schemaless like
> > >>> json, but binary.
> > >>>
> > >>> Sent from my iPhone
> > >>>
> > >>> On 21 Mar 2012, at 20:46, Russell Jurney <[EMAIL PROTECTED]> wrote:
> > >>>
> > >>>> I'm going to use thrift, avro or protobuf for serialization.
> > >>>>
> > >>>> Russell Jurney http://datasyndrome.com
> > >>>>
> > >>>> On Mar 21, 2012, at 11:59 AM, Vaibhav Puranik <[EMAIL PROTECTED]> wrote:
> > >>>>
> > >>>>> I would use the payload. I want the message to be exactly as it is. We want
> > >>>>> to name the files as per topic.
> > >>>>> (That's how we differentiate right now).
> > >>>>>
> > >>>>> Regards,
> > >>>>> Vaibhav
> > >>>>>
> > >>>>> On Wed, Mar 21, 2012 at 11:01 AM, Niek Sanders <[EMAIL PROTECTED]> wrote:
> > >>>>>
> > >>>>>> So what would you like the S3 files to actually look like?
> > >>>>>>
> > >>>>>> One Kafka message body per line?  Should the message topic be tossed
> > >>>>>> in there too?
> > >>>>>>
> > >>>>>> A tricky aspect is that the Kafka message body is an opaque byte
> > >>>>>> array.  For my own case I'm using JSON for the payload so it makes my
> > >>>>>> requirements simpler.
> > >>>>>>
> > >>>>>> - Niek
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On Tue, Mar 20, 2012 at 10:07 PM, Russell Jurney
> > >>>>>> <[EMAIL PROTECTED]> wrote:
> > >>>>>>> I want events in S3 to process them in Hadoop. I'd like to emit them in
> > >>>>>>> my app, and have them magically show up in 64MB chunks on S3. Like most
> > >>>>>>> everyone else.
> > >>>>>>>
> > >>>>>>> Russell Jurney http://datasyndrome.com
> > >>>>>>>
> > >>>>>>
> > >>
> >
>

--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
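
For readers of the archive, here is a rough sketch of the layout discussed in this thread: object keys of the form type/yyyy/MM/dd/hh, with payloads written one per line and grouped into chunks of roughly 64MB. It is not code from anyone on the list. The names `s3_key`, `ChunkBuffer`, and the `part-NNNNN.json` file naming are hypothetical, and the one-payload-per-line framing assumes single-line JSON messages. Opaque binary payloads (thrift, avro, protobuf, messagepack) may contain newline bytes and would need a different framing, such as a length prefix. The actual Kafka consumer and S3 upload are left out.

```python
from datetime import datetime, timezone

# Target chunk size mentioned in the thread (64MB).
CHUNK_BYTES = 64 * 1024 * 1024


def s3_key(topic: str, ts: datetime, seq: int) -> str:
    """Build a key such as clicks/2012/03/21/13/part-00000.json.

    The topic stands in for the "type" folder; the part-file suffix
    is a hypothetical naming choice, not something from the thread.
    """
    return f"{topic}/{ts:%Y/%m/%d/%H}/part-{seq:05d}.json"


class ChunkBuffer:
    """Accumulate newline-delimited payloads until a size limit is reached."""

    def __init__(self, limit: int = CHUNK_BYTES):
        self.limit = limit
        self.lines = []
        self.size = 0

    def add(self, payload: bytes):
        """Append one payload; return the finished chunk once the
        buffered size reaches the limit, otherwise None."""
        line = payload.rstrip(b"\n") + b"\n"
        self.lines.append(line)
        self.size += len(line)
        if self.size >= self.limit:
            return self.flush()
        return None

    def flush(self) -> bytes:
        """Return everything buffered so far and reset the buffer."""
        chunk = b"".join(self.lines)
        self.lines, self.size = [], 0
        return chunk
```

A caller would keep one buffer per topic, upload whatever `add` returns under `s3_key(topic, now, seq)`, and also call `flush` when the hour rolls over so that no chunk spans two hh folders. That rollover logic is not shown here.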