Re: Kafka in AWS?
Let me ask my boss what I can share. Let's talk off the mailing list.

Regards,
Vaibhav

On Wed, Mar 21, 2012 at 1:44 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:

> You have code that puts records in bigger blocks on s3? Plz to share? :)
>
> Russell Jurney http://datasyndrome.com
>
> On Mar 21, 2012, at 1:37 PM, Vaibhav Puranik <[EMAIL PROTECTED]> wrote:
>
> > We also have s3 files organized by date in the following fashion.
> >
> > yyyy/MM/dd/hh
> >
> > Our messages are in JSON.
> >
> > Regards,
> > Vaibhav
> >
> > On Wed, Mar 21, 2012 at 1:33 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
> >
> >> I want the S3 files to be organized by type and date. Folders for types,
> >> subfolders for date down to the hour: year/month/day/hour. All payloads of
> >> a given type get written together.
> >>
> >> It would be ideal if there was no integration with the end format, but in
> >> practice I'm not sure if all the serialization protocols mentioned can be
> >> written in this way.
> >>
> >> Russell Jurney http://datasyndrome.com
> >>
> >> On Mar 21, 2012, at 12:50 PM, Tim Lossen <[EMAIL PROTECTED]> wrote:
> >>
> >>> another good option would be messagepack -- flexible & schemaless like
> >>> json, but binary.
> >>>
> >>> Sent from my iPhone
> >>>
> >>> On 21 Mar 2012, at 20:46, Russell Jurney <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> I'm going to use thrift, avro or protobuf for serialization.
> >>>>
> >>>> Russell Jurney http://datasyndrome.com
> >>>>
> >>>> On Mar 21, 2012, at 11:59 AM, Vaibhav Puranik <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>>> I would use the payload. I want the message to be exactly as it is. We want
> >>>>> to name the files as per topic.
> >>>>> (That's how we differentiate right now).
> >>>>>
> >>>>> Regards,
> >>>>> Vaibhav
> >>>>>
> >>>>> On Wed, Mar 21, 2012 at 11:01 AM, Niek Sanders <[EMAIL PROTECTED]> wrote:
> >>>>>
> >>>>>> So what would you like the S3 files to actually look like?
> >>>>>>
> >>>>>> One Kafka message body per line?  Should the message topic be tossed
> >>>>>> in there too?
> >>>>>>
> >>>>>> A tricky aspect is that the Kafka message body is an opaque byte
> >>>>>> array.  For my own case I'm using JSON for the payload so it makes my
> >>>>>> requirements simpler.
> >>>>>>
> >>>>>> - Niek
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Mar 20, 2012 at 10:07 PM, Russell Jurney
> >>>>>> <[EMAIL PROTECTED]> wrote:
> >>>>>>> I want events in S3 to process them in Hadoop. I'd like to emit them in
> >>>>>>> my app, and have them magically show up in 64MB chunks on S3. Like most
> >>>>>>> everyone else.
> >>>>>>>
> >>>>>>> Russell Jurney http://datasyndrome.com
> >>>>>>>
> >>>>>>
> >>
>
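
For readers skimming the thread, here is a minimal sketch of the layout discussed above: payloads grouped by topic, keys shaped like topic/yyyy/MM/dd/hh, and messages buffered until roughly 64MB before being written. It is not code from anyone on the thread. The names `s3_key`, `HourlyBatcher`, the `part-NNNNN.json` file naming, and the `upload` callback are all hypothetical, and the one-payload-per-line framing assumes JSON payloads without embedded newlines (a binary format such as Avro or protobuf would need different framing, since the Kafka message body is an opaque byte array).

```python
from datetime import datetime


def s3_key(topic, ts, part):
    """Build a key such as 'clicks/2012/03/21/13/part-00000.json'.

    The 'part-NNNNN.json' suffix is an assumed naming scheme.
    """
    return f"{topic}/{ts:%Y/%m/%d/%H}/part-{part:05d}.json"


class HourlyBatcher:
    """Buffer payloads for one topic and flush them as large objects.

    `upload` is a hypothetical callback taking (key, body_bytes); in a real
    setup it would wrap whatever S3 client is in use.
    """

    def __init__(self, topic, upload, max_bytes=64 * 1024 * 1024):
        self.topic = topic
        self.upload = upload
        self.max_bytes = max_bytes
        self.hour = None   # hour bucket of the buffered payloads
        self.lines = []    # buffered payloads, written unmodified
        self.size = 0      # buffered bytes, counting one newline each
        self.part = 0      # object counter within the current hour

    def add(self, payload, ts):
        hour = ts.replace(minute=0, second=0, microsecond=0)
        # A new hour starts a new "folder", so flush what is pending first.
        if self.hour is not None and hour != self.hour:
            self.flush()
            self.part = 0
        self.hour = hour
        self.lines.append(payload)
        self.size += len(payload) + 1
        if self.size >= self.max_bytes:
            self.flush()

    def flush(self):
        if not self.lines:
            return
        body = b"\n".join(self.lines) + b"\n"
        self.upload(s3_key(self.topic, self.hour, self.part), body)
        self.part += 1
        self.lines, self.size = [], 0
```

A consumer loop would call `add(message_bytes, timestamp)` for each message and `flush()` on shutdown. Flushing only on size or hour change means a quiet topic can hold data in memory for up to an hour, so a time-based flush may also be worth adding.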