Kafka, mail # user - Kafka in AWS?


Re: Kafka in AWS?
Russell Jurney 2012-03-23, 02:15
bump

On Wed, Mar 21, 2012 at 10:01 PM, Vaibhav Puranik <[EMAIL PROTECTED]> wrote:

> Let me ask my boss what I can share. Let's talk off the mailing list.
>
> Regards,
> Vaibhav
>
> On Wed, Mar 21, 2012 at 1:44 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
>
> > You have code that puts records in bigger blocks on s3? Plz to share? :)
> >
> > Russell Jurney http://datasyndrome.com
> >
> > On Mar 21, 2012, at 1:37 PM, Vaibhav Puranik <[EMAIL PROTECTED]> wrote:
> >
> > > We also have s3 files organized by date in the following fashion.
> > >
> > > yyyy/MM/dd/hh
> > >
> > > Our messages are in JSON.
> > >
> > > Regards,
> > > Vaibhav
> > >
> > > On Wed, Mar 21, 2012 at 1:33 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
> > >
> > >> I want the S3 files to be organized by type and date. Folders for types,
> > >> subfolders for date down to the hour: year/month/day/hour. All payloads of
> > >> a given type get written together.
> > >>
> > >> It would be ideal if there was no integration with the end format, but in
> > >> practice I'm not sure if all the serialization protocols mentioned can be
> > >> written in this way.
> > >>
> > >> Russell Jurney http://datasyndrome.com
> > >>
> > >> On Mar 21, 2012, at 12:50 PM, Tim Lossen <[EMAIL PROTECTED]> wrote:
> > >>
> > >>> another good option would be messagepack -- flexible & schemaless like
> > >>> json, but binary.
> > >>>
> > >>> Sent from my iPhone
> > >>>
> > >>> On 21 Mar 2012, at 20:46, Russell Jurney <[EMAIL PROTECTED]> wrote:
> > >>>
> > >>>> I'm going to use thrift, avro or protobuf for serialization.
> > >>>>
> > >>>> Russell Jurney http://datasyndrome.com
> > >>>>
> > >>>> On Mar 21, 2012, at 11:59 AM, Vaibhav Puranik <[EMAIL PROTECTED]> wrote:
> > >>>>
> > >>>>> I would use the payload. I want the message to be exactly as it is. We want
> > >>>>> to name the files as per topic.
> > >>>>> (That's how we differentiate right now).
> > >>>>>
> > >>>>> Regards,
> > >>>>> Vaibhav
> > >>>>>
> > >>>>> On Wed, Mar 21, 2012 at 11:01 AM, Niek Sanders <[EMAIL PROTECTED]> wrote:
> > >>>>>
> > >>>>>> So what would you like the S3 files to actually look like?
> > >>>>>>
> > >>>>>> One Kafka message body per line?  Should the message topic be tossed
> > >>>>>> in there too?
> > >>>>>>
> > >>>>>> A tricky aspect is that the Kafka message body is an opaque byte
> > >>>>>> array.  For my own case I'm using JSON for the payload so it makes my
> > >>>>>> requirements simpler.
> > >>>>>>
> > >>>>>> - Niek
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On Tue, Mar 20, 2012 at 10:07 PM, Russell Jurney
> > >>>>>> <[EMAIL PROTECTED]> wrote:
> > >>>>>>> I want events in S3 to process them in Hadoop. I'd like to emit them in
> > >>>>>>> my app, and have them magically show up in 64MB chunks on S3. Like most
> > >>>>>>> everyone else.
> > >>>>>>>
> > >>>>>>> Russell Jurney http://datasyndrome.com
> > >>>>>>>
> > >>>>>>
> > >>
> >
>

--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
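
For readers following the thread, here is a minimal sketch of the layout discussed above: one folder per topic, subfolders down to the hour (yyyy/MM/dd/hh), one message body per line, and a new file once roughly 64MB has accumulated. Nobody in the thread posted code, so everything here is an assumption for illustration: the names s3_key and ChunkBuffer, the part-NNNNN.json file naming, and the use of UTC are all hypothetical. The newline framing only suits payloads that contain no raw newlines, such as single-line JSON; binary formats like Thrift, Avro, or Protobuf would need length-prefixed framing or a container file instead. Uploading to S3 and consuming from Kafka are deliberately left out.

```python
from datetime import datetime, timezone


def s3_key(topic: str, ts: datetime, part: int) -> str:
    """Build a key such as 'clicks/2012/03/21/20/part-00000.json'.

    The topic/yyyy/MM/dd/hh prefix follows the thread; the
    'part-NNNNN.json' suffix is a hypothetical naming choice.
    """
    return f"{topic}/{ts:%Y/%m/%d/%H}/part-{part:05d}.json"


class ChunkBuffer:
    """Accumulates message bodies for one (topic, hour) pair.

    Bodies are joined with newlines, so they must not contain raw
    newline bytes (true for single-line JSON, not for arbitrary binary).
    """

    def __init__(self, max_bytes: int = 64 * 1024 * 1024):
        self.max_bytes = max_bytes  # 64MB target chunk size from the thread
        self.lines = []
        self.size = 0

    def add(self, payload: bytes) -> bool:
        """Append one message body; return True once the chunk is full."""
        self.lines.append(payload)
        self.size += len(payload) + 1  # +1 for the trailing newline
        return self.size >= self.max_bytes

    def drain(self) -> bytes:
        """Return the buffered chunk as newline-delimited bytes and reset."""
        if not self.lines:
            return b""
        body = b"\n".join(self.lines) + b"\n"
        self.lines, self.size = [], 0
        return body


if __name__ == "__main__":
    # Tiny limit so the example flushes after two messages.
    buf = ChunkBuffer(max_bytes=10)
    ts = datetime(2012, 3, 21, 20, 46, tzinfo=timezone.utc)
    for msg in (b'{"a":1}', b'{"b":2}'):
        if buf.add(msg):
            # A real consumer would upload buf.drain() to this key.
            print(s3_key("clicks", ts, 0), len(buf.drain()))
```

A consumer would keep one such buffer per topic and hour, and would also want to flush on a timer so that a quiet topic does not hold an hour's data in memory indefinitely.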