Avro, mail # user - Re: Avro vs Json


Re: Avro vs Json
Russell Jurney 2012-08-13, 03:03
To be fair, you can test types as you parse JSON. But only a few.

The Avro schemas even include comments... huge win.

Russell Jurney http://datasyndrome.com
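
As a rough illustration of the point about testing types while parsing JSON, here is a minimal sketch using only Python's standard `json` module. The record and its field names are hypothetical, not taken from the thread. Note that Python's parser splits JSON's single number type into `int` and `float`; anything richer, such as a timestamp, arrives as a plain string.

```python
import json

# Hypothetical record; the field names are made up for illustration.
raw = (
    '{"id": 42, "score": 3.5, "active": true, "name": "a", '
    '"tags": ["x"], "created": "2012-08-12T22:26:00Z", "parent": null}'
)

record = json.loads(raw)

# Map each field to the Python type the parser produced.
parsed_types = {key: type(value).__name__ for key, value in record.items()}

for key, type_name in parsed_types.items():
    print(f"{key}: {type_name}")
```

The `created` field is reported as `str`: nothing in the JSON text marks it as a timestamp, so that knowledge has to live in application code or in a convention shared by reader and writer.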

On Aug 12, 2012, at 7:42 PM, Bill Graham <[EMAIL PROTECTED]> wrote:

The benefit of having a schema associated with your data should not be
understated. I think when debating whether to use JSON or some other data
serialization format that has a schema (like Avro), you should choose the
latter. The schema support alone will pay dividends over the long run.
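
To make the schema point concrete, the sketch below builds a small Avro record schema as a Python dict and prints it as the JSON text that would go into an `.avsc` file. The record name, namespace, and fields are assumptions made up for illustration. The `doc` attributes are the schema "comments" mentioned earlier in the thread. Only the standard `json` module is used, so this shows what a schema looks like, not how an Avro library reads or writes data.

```python
import json

# Hypothetical schema: the record name, namespace and fields are made up.
schema = {
    "type": "record",
    "name": "Event",
    "namespace": "example.avro",
    "doc": "One event as received from the upstream JSON feed.",
    "fields": [
        {"name": "id", "type": "long", "doc": "Unique event id."},
        {"name": "payload", "type": "bytes", "doc": "Raw body, not text."},
        {
            "name": "note",
            "type": ["null", "string"],  # union type: the field is optional
            "default": None,
            "doc": "Optional free-text note.",
        },
    ],
}

# Avro schemas are themselves JSON, so json.dumps yields the schema text.
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```

Types such as `long` and `bytes`, and the nullable union, have no direct counterpart in plain JSON, which is where the annotation-or-convention workaround discussed in the thread comes in.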
On Sun, Aug 12, 2012 at 3:34 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:

> You'll need to compress JSON yourself; Avro can compress itself. Avro
> represents more types. With JSON, you'll need to serialize any types beyond
> what it supports using annotations or by convention. JSON is simpler.
>
> Short answer: use JSON if its types are expressive enough for your
> data, and if you don't mind compressing it yourself. Avro has more
> types, has the schema onboard, and compresses itself.
>
> Russell Jurney
>
> On Aug 12, 2012, at 3:27 PM, Tatu Saloranta <[EMAIL PROTECTED]> wrote:
>
> > I would ask a specific subset of users: those with actual
> > experience in using both, to compare approaches. If you ask someone
> > who has only used one, all you get to know is that both can be made to
> > work well enough. Which of course may be enough for your needs. :-)
> >
> > -+ Tatu +-
> >
> > On Sun, Aug 12, 2012 at 10:32 AM, Harsh J <[EMAIL PROTECTED]> wrote:
> >> Moving this to the user@avro lists. Please use the right lists for the
> >> best answers and the right people.
> >>
> >> I'd pick Avro out of the two - it is very well designed for typed data
> >> and has a very good implementation of the serializer/deserializer,
> >> aside from the schema advantages. FWIW, Avro has a tojson CLI tool to
> >> dump Avro binary format out as JSON structures, which would be of help
> >> if you seek readability and/or integration with apps/systems that
> >> already depend on JSON.
> >>
> >> On Sun, Aug 12, 2012 at 10:41 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> >>> We get data in JSON format. I was initially thinking of simply storing JSON
> >>> in HDFS for processing. I see there is Avro, which does a similar thing but
> >>> most likely stores it in a more optimized format. I wanted to get users'
> >>> opinions on which one is better.
> >>
> >>
> >>
> >> --
> >> Harsh J
>

--
*Note that I'm no longer using my Yahoo! email address. Please email me at
[EMAIL PROTECTED] going forward.*