Avro >> mail # user >> Is Avro right for me?

Re: Is Avro right for me?
What's more, there are examples and support for Kafka, but not so much for
On Mon, May 27, 2013 at 6:25 AM, Martin Kleppmann <[EMAIL PROTECTED]> wrote:

> I don't have experience with Flume, so I can't comment on that. At
> LinkedIn we ship logs around by sending Avro-encoded messages to Kafka (
> http://kafka.apache.org/). Kafka is nice, it scales very well and gives a
> great deal of flexibility — logs can be consumed by any number of
> independent consumers, consumers can catch up on a backlog if they're
> disconnected for a while, and it comes with Hadoop import out of the box.
> (RabbitMQ is designed more for use cases where each message corresponds to
> a task that needs to be performed by a worker. IMHO Kafka is a better fit
> for logs, which are more stream-like.)
> With any message broker, you'll need to somehow tag each message with the
> schema that was used to encode it. You could include the full schema with
> every message, but unless you have very large messages, that would be a
> huge overhead. Better to give each version of your schema a sequential
> version number, or hash the schema, and include the version number/hash in
> each message. You can then keep a repository of schemas for resolving those
> version numbers or hashes – simply in files that you distribute to all
> producers/consumers, or in a simple REST service like
> https://issues.apache.org/jira/browse/AVRO-1124
> Hope that helps,
> Martin
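Martin's version-tag idea can be sketched as a minimal envelope: hash the writer's schema, prefix each message with the hash, and let consumers resolve the full schema from a shared registry. This is a stdlib-only illustration, not Avro's own wire format; the 8-byte MD5 fingerprint, the in-memory `registry` dict, and the `LogEvent` schema are assumptions for the sake of the example (a real deployment would use distributed schema files or a REST service like the one in AVRO-1124).

```python
import hashlib
import json

def fingerprint(schema_json: str) -> bytes:
    """Truncated MD5 of the schema text, used as a compact schema tag."""
    return hashlib.md5(schema_json.encode("utf-8")).digest()[:8]

def wrap(schema_json: str, payload: bytes) -> bytes:
    """Envelope: 8-byte schema fingerprint followed by the encoded payload."""
    return fingerprint(schema_json) + payload

def unwrap(message: bytes, registry: dict) -> tuple[str, bytes]:
    """Split the envelope and resolve the writer's schema from the registry."""
    fp, payload = message[:8], message[8:]
    return registry[fp], payload

# Illustrative schema and registry; the payload would be Avro binary in practice.
schema = json.dumps({"type": "record", "name": "LogEvent",
                     "fields": [{"name": "msg", "type": "string"}]})
registry = {fingerprint(schema): schema}

msg = wrap(schema, b"\x06foo")
resolved, payload = unwrap(msg, registry)
assert resolved == schema and payload == b"\x06foo"
```

The fingerprint costs 8 bytes per message instead of the full schema text, which is the overhead trade-off Martin describes.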
> On 26 May 2013 17:39, Mark <[EMAIL PROTECTED]> wrote:
>> Yes our central server would be Hadoop.
>> Exactly how would this work with Flume? Would I write Avro to a file
>> source which Flume would then ship over to one of our collectors, or is
>> there a better/native way? Would I have to include the schema in each
>> event? FYI we would be doing this primarily from a Rails application.
>> Does anyone ever use Avro with a message bus like RabbitMQ?
>> On May 23, 2013, at 9:16 PM, Sean Busbey <[EMAIL PROTECTED]> wrote:
>> Yep. Avro would be great at that (provided your central consumer is Avro
>> friendly, like a Hadoop system).  Make sure that all of your schemas have
>> default values defined for fields so that schema evolution will be easier
>> in the future.
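Sean's point about defaults looks like this in practice: when a reader's schema has a field the writer's schema lacks, Avro falls back to that field's default during schema resolution, so old records stay readable after the schema grows. A hypothetical evolved schema (the record and field names here are illustrative):

```json
{
  "type": "record",
  "name": "LogEvent",
  "fields": [
    {"name": "message", "type": "string", "default": ""},
    {"name": "severity", "type": "string", "default": "INFO"}
  ]
}
```

If `severity` is added later with this default, records written before the change can still be decoded, with `severity` filled in as "INFO".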
>> On Thu, May 23, 2013 at 4:29 PM, Mark <[EMAIL PROTECTED]> wrote:
>>> We're thinking about generating logs and events with Avro and shipping
>>> them to a central collector service via Flume. Is this a valid use case?
>> --
>> Sean
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com