Kafka, mail # user - Kafka in AWS?


Re: Kafka in AWS?
Vaibhav Puranik 2012-03-21, 20:37
We also have our S3 files organized by date, in the following fashion:

yyyy/MM/dd/hh

Our messages are in JSON.

Regards,
Vaibhav
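
A minimal sketch of how an object key following the yyyy/MM/dd/hh layout above could be built. The function name `s3_key`, the leading topic folder, and the file name are illustrative assumptions, not something stated in the thread; the sketch also assumes timestamps are timezone-aware and are normalized to UTC.

```python
from datetime import datetime, timezone


def s3_key(topic: str, ts: datetime, filename: str) -> str:
    """Build a key of the form <topic>/yyyy/MM/dd/hh/<filename>.

    `ts` is assumed to be timezone-aware; it is converted to UTC so that
    all writers agree on which hour folder a message belongs to.
    """
    ts = ts.astimezone(timezone.utc)
    # strftime-style codes: %Y = 4-digit year, %m = month, %d = day, %H = 24h hour
    return f"{topic}/{ts:%Y/%m/%d/%H}/{filename}"


if __name__ == "__main__":
    when = datetime(2012, 3, 21, 20, 37, tzinfo=timezone.utc)
    print(s3_key("clicks", when, "part-0000.json"))
```

Zero-padded components keep the keys lexicographically sortable, so listing a prefix such as `clicks/2012/03/` returns the hours in chronological order.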

On Wed, Mar 21, 2012 at 1:33 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:

> I want the S3 files to be organized by type and date. Folders for types,
> subfolders for date down to the hour: year/month/day/hour. All payloads of
> a given type get written together.
>
> It would be ideal if there was no integration with the end format, but in
> practice I'm not sure if all the serialization protocols mentioned can be
> written in this way.
>
> Russell Jurney http://datasyndrome.com
>
> On Mar 21, 2012, at 12:50 PM, Tim Lossen <[EMAIL PROTECTED]> wrote:
>
> > another good option would be messagepack -- flexible & schemaless like
> > json, but binary.
> >
> > Sent from my iPhone
> >
> > On 21 Mar 2012, at 20:46, Russell Jurney <[EMAIL PROTECTED]> wrote:
> >
> >> I'm going to use thrift, avro or protobuf for serialization.
> >>
> >> Russell Jurney http://datasyndrome.com
> >>
> >> On Mar 21, 2012, at 11:59 AM, Vaibhav Puranik <[EMAIL PROTECTED]> wrote:
> >>
> >>> I would use the payload. I want the message to be exactly as it is.
> >>> We want to name the files as per topic.
> >>> (That's how we differentiate right now).
> >>>
> >>> Regards,
> >>> Vaibhav
> >>>
> >>> On Wed, Mar 21, 2012 at 11:01 AM, Niek Sanders <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> So what would you like the S3 files to actually look like?
> >>>>
> >>>> One Kafka message body per line?  Should the message topic be tossed
> >>>> in there too?
> >>>>
> >>>> A tricky aspect is that the Kafka message body is an opaque byte
> >>>> array.  For my own case I'm using JSON for the payload so it makes my
> >>>> requirements simpler.
> >>>>
> >>>> - Niek
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Mar 20, 2012 at 10:07 PM, Russell Jurney
> >>>> <[EMAIL PROTECTED]> wrote:
> >>>>> I want events in S3 to process them in Hadoop. I'd like to emit them in
> >>>>> my app, and have them magically show up in 64MB chunks on S3. Like most
> >>>>> everyone else.
> >>>>>
> >>>>> Russell Jurney http://datasyndrome.com
> >>>>>
> >>>>
>
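
The "64MB chunks" and "one Kafka message body per line" ideas from the start of the thread could be combined roughly as below. This is only a sketch under stated assumptions: `ChunkBuffer` and its `flush` callback are hypothetical names, the callback stands in for whatever actually uploads a chunk to S3, and newline framing is only safe when payloads contain no raw newline bytes (true for single-line JSON; binary formats such as Thrift, Avro, Protobuf or MessagePack would need length-prefixed framing or a container file format instead).

```python
class ChunkBuffer:
    """Accumulate message payloads, one per line, and flush size-limited chunks."""

    def __init__(self, flush, max_bytes=64 * 1024 * 1024):
        self.flush = flush          # hypothetical callback, e.g. an S3 upload
        self.max_bytes = max_bytes  # 64 MB by default, as mentioned in the thread
        self.parts = []
        self.size = 0

    def add(self, payload: bytes) -> None:
        line = payload + b"\n"
        # Flush first if this line would push the current chunk over the limit.
        if self.parts and self.size + len(line) > self.max_bytes:
            self.close()
        self.parts.append(line)
        self.size += len(line)

    def close(self) -> None:
        """Flush whatever is buffered (call at shutdown or on an hour boundary)."""
        if self.parts:
            self.flush(b"".join(self.parts))
            self.parts, self.size = [], 0


if __name__ == "__main__":
    chunks = []
    buf = ChunkBuffer(chunks.append, max_bytes=16)  # tiny limit for demonstration
    for message in (b'{"a":1}', b'{"b":2}', b'{"c":3}'):
        buf.add(message)
    buf.close()
    print(len(chunks))
```

A single payload larger than `max_bytes` is still written as its own oversized chunk rather than being split, since splitting would break the one-message-per-line framing.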