Avro >> mail # user >> Is Avro right for me?


Mark 2013-05-23, 20:29
Sean Busbey 2013-05-24, 04:16
Mark 2013-05-26, 16:39
Martin Kleppmann 2013-05-27, 13:25
Russell Jurney 2013-05-27, 18:08
Stefan Krawczyk 2013-05-27, 19:00
Re: Is Avro right for me?
On 27 May 2013 20:00, Stefan Krawczyk <[EMAIL PROTECTED]> wrote:

> So it's up to you what you stick into the body of that Avro event. It
> could just be json, or it could be your own serialized Avro event - and as
> far as I understand serialized Avro always has the schema with it (right?).
>

In an Avro data file, yes, because you just need to specify the schema
once, followed by (say) a million records that all use the same schema. And
in an RPC context, you can negotiate the schema once per connection. But
when using a message broker, you're serializing individual records and
don't have an end-to-end connection with the consumer, so you'd need to
include the schema with every single message.

It probably doesn't make sense to include the full schema with every one,
as a typical schema might be 2 kB whereas a serialized record might be less
than 100 bytes (numbers obviously vary wildly by application), so the
schema size would dominate. Hence my suggestion of including a schema
version number or hash with every message.
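
A minimal sketch of the hash-per-message idea, in Python with only the standard library. Everything named here is an assumption for illustration: the 8-byte truncated SHA-256 fingerprint, the `frame`/`unframe` helpers, and the in-memory `registry` dict standing in for whatever shared schema store producers and consumers would use. The canonicalisation is plain sorted-key JSON, not Avro's own canonical form, and the payload is treated as already-serialized bytes.

```python
import hashlib
import json

FINGERPRINT_LEN = 8  # bytes of the hash kept in each message (arbitrary choice)


def schema_fingerprint(schema: dict) -> bytes:
    """Hash a schema after normalising key order and whitespace."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).digest()[:FINGERPRINT_LEN]


def frame(schema: dict, payload: bytes) -> bytes:
    """Prefix an already-serialized record with its schema fingerprint."""
    return schema_fingerprint(schema) + payload


def unframe(message: bytes, registry: dict) -> tuple:
    """Split a message and look up the writer's schema by fingerprint."""
    fingerprint = message[:FINGERPRINT_LEN]
    payload = message[FINGERPRINT_LEN:]
    return registry[fingerprint], payload  # KeyError means an unknown schema


if __name__ == "__main__":
    # Hypothetical schema; the payload bytes stand in for a serialized record.
    schema = {
        "type": "record",
        "name": "Click",
        "fields": [{"name": "url", "type": "string"}],
    }
    registry = {schema_fingerprint(schema): schema}
    message = frame(schema, b"\x16example.com")
    writer_schema, payload = unframe(message, registry)
    print(len(message) - len(payload), "bytes of per-message overhead")
```

The per-message cost is then a fixed few bytes rather than the full schema, at the price of the consumer needing some way to resolve a fingerprint it has not seen before.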

> Be aware that Flume doesn't have great support for languages outside of the
> JVM.
>

The same caveat unfortunately applies with Kafka too. There are clients for
non-JVM languages, but they lack important features, so I would recommend
using the official JVM client (if your application is non-JVM you could
simply pipe your application's stdout into the Kafka producer, or vice
versa on the consumer side).
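
As a rough illustration of the stdout approach, the application side could look something like the Python sketch below. It assumes the downstream producer process reads stdin and treats each line as one message; the `emit` helper and the JSON-per-line format are hypothetical choices, not anything prescribed by Kafka. Binary payloads would need a line-safe encoding (for example base64) before being written this way.

```python
import json
import sys


def emit(record: dict, out=sys.stdout) -> None:
    """Write one record as a single line and flush so the pipe sees it promptly."""
    # json.dumps escapes newlines inside strings, so each record stays on one line.
    out.write(json.dumps(record, separators=(",", ":")) + "\n")
    out.flush()


if __name__ == "__main__":
    emit({"event": "page_view", "url": "/index.html"})
```

Flushing after each record matters because stdout is usually block-buffered when it is connected to a pipe, which would otherwise delay messages until the buffer fills.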

Martin
Mark 2013-05-28, 22:38
Martin Kleppmann 2013-05-29, 10:16
Mike Percy 2013-05-29, 00:02
Mark 2013-05-29, 16:30
Mike Percy 2013-05-30, 03:02
Mark 2013-06-05, 03:10
Felix GV 2013-06-06, 18:51
Felix GV 2013-06-06, 19:09
Mark 2013-06-04, 19:57