Avro >> mail # user >> Re: Avro vs Json

Tatu Saloranta 2012-08-12, 22:26
Russell Jurney 2012-08-12, 22:34
Bill Graham 2012-08-13, 02:42
Russell Jurney 2012-08-13, 03:03
Tatu Saloranta 2012-08-13, 17:50
Bill Graham 2012-08-13, 22:59
Russell Jurney 2012-08-14, 00:31
Tatu Saloranta 2012-08-14, 02:33
Re: Avro vs Json
On Sun, Aug 12, 2012 at 7:42 PM, Bill Graham <[EMAIL PROTECTED]> wrote:
> The benefit of having a schema associated with your data should not be
> understated. I think when debating whether to use JSON or some other data
> serialization format that has a schema (like Avro), you should choose the
> latter. The schema support alone will pay dividends over the long run.

I would argue it is one of those things that is overstated because it
is intuitively attractive.
It is worth keeping in mind that an explicit external schema is another
cost, not just in designing but also in maintaining the system. As such,
it is most useful for closely coupled internal systems, where one
controls both ends. This may be the case for computing pipelines that a
single team owns.

Put another way: both the benefits and the costs of schemas accumulate
over the long run, and their ratio ultimately determines which one wins.
And yet that ratio is very hard to forecast in advance.
What can be said is that maintaining no schema is cheaper than
maintaining a schema. The value of a schema, on the other hand, is much
harder to estimate a priori.
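
To make the maintenance point concrete, here is a minimal sketch in
Python. Everything in it is hypothetical: the SCHEMA_V1 dict, both
reader functions, and the sample records are made up for illustration,
and the strict check is hand-rolled. It is not how Avro resolves
schemas (Avro has its own resolution rules that tolerate some changes);
it only shows the kind of extra step a strictly enforced schema can
add when a writer introduces a field.

```python
import json

# Hypothetical schema for illustration: field name -> expected Python type.
SCHEMA_V1 = {"id": int, "name": str}


def read_without_schema(payload: str) -> dict:
    """Schema-less reader: accepts whatever fields the writer sent."""
    return json.loads(payload)


def read_with_schema(payload: str, schema: dict) -> dict:
    """Strict reader: rejects records whose fields differ from the schema."""
    record = json.loads(payload)
    if set(record) != set(schema):
        raise ValueError(
            f"fields {sorted(record)} do not match schema {sorted(schema)}"
        )
    for field, expected in schema.items():
        if not isinstance(record[field], expected):
            raise ValueError(f"{field!r} should be {expected.__name__}")
    return record


old = '{"id": 1, "name": "a"}'
# The writer starts sending one extra field.
new = '{"id": 2, "name": "b", "email": "b@example.com"}'

# The schema-less reader needs no change to accept the new record.
print(read_without_schema(new))

# The strict reader rejects it until someone updates the schema.
try:
    read_with_schema(new, SCHEMA_V1)
except ValueError as err:
    print("rejected:", err)

# Assumed fix: a second schema version that both ends must adopt.
SCHEMA_V2 = dict(SCHEMA_V1, email=str)
print(read_with_schema(new, SCHEMA_V2))
```

The sketch shows only the cost side. What the strict reader buys in
return (catching a malformed record early) is the part that is hard to
put a number on in advance.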

-+ Tatu +-
Knoke, Jeff 2012-08-13, 12:49
Knoke, Jeff 2012-08-13, 12:46