To be fair, you can test types as you parse JSON. But only a few.
The Avro schemas even include comments... huge win.
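To make the type-testing point concrete, here is a minimal sketch in Python using only the standard library. The `check_record` helper, the `EXPECTED` layout and the field names are made up for illustration and come from no library. The parser only distinguishes strings, numbers, booleans, null, lists and objects, so anything finer (dates, bytes, enums) still has to be checked by convention.

```python
import json

# Hypothetical expected layout for one record; names are illustrative only.
EXPECTED = {"user": str, "clicks": int, "ratio": float, "tags": list}

def check_record(line):
    """Parse one JSON line and return the names of fields with the wrong type."""
    record = json.loads(line)
    bad = []
    for name, expected_type in EXPECTED.items():
        value = record.get(name)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            bad.append(name)
    return bad

good = '{"user": "alice", "clicks": 3, "ratio": 0.5, "tags": ["a"]}'
broken = '{"user": "alice", "clicks": "3", "ratio": 0.5, "tags": ["a"]}'
print(check_record(good))    # []
print(check_record(broken))  # ['clicks']
```

With Avro the equivalent checks come from the schema rather than from hand-written code like this.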
Russell Jurney http://datasyndrome.com
On Aug 12, 2012, at 7:42 PM, Bill Graham <[EMAIL PROTECTED]> wrote:
The benefit of having a schema associated with your data should not be
understated. I think when debating whether to use JSON or some other data
serialization format that has a schema (like Avro), you should choose the
latter. The schema support alone will pay dividends over the long run.
On Sun, Aug 12, 2012 at 3:34 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
> You'll need to compress JSON. Avro can compress itself. Avro
> represents more types; with JSON you'll need to serialize any types beyond
> what it supports with annotations or by convention. JSON is simpler.
> Short answer: use JSON if its types are expressive enough for your
> data, and if you don't mind compressing it yourself. Avro has more
> types, has the schema onboard and self-compresses.
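As a rough sketch of what "compressing it yourself" looks like for JSON, here is a Python example using only the standard `gzip` and `json` modules. The records are made up for illustration, and one JSON document per line is an assumed layout, not something the thread specifies.

```python
import gzip
import json

# Made-up records standing in for incoming JSON data.
records = [{"id": i, "event": "click", "source": "web"} for i in range(1000)]

# One JSON document per line (an assumed convention for bulk storage).
raw = "\n".join(json.dumps(r) for r in records).encode("utf-8")

# JSON has no built-in compression, so it is applied as a separate step.
packed = gzip.compress(raw)

# Reading reverses both steps: decompress, then parse line by line.
lines = gzip.decompress(packed).decode("utf-8").splitlines()
restored = [json.loads(line) for line in lines]

print(len(raw), len(packed))  # sizes before and after compression
print(restored == records)
```

With Avro data files the compression codec is chosen when the file is written, so the reader needs no separate decompression step.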
> Russell Jurney
> On Aug 12, 2012, at 3:27 PM, Tatu Saloranta <[EMAIL PROTECTED]> wrote:
> > I would ask questions of a specific subset of users: those with actual
> > experience in using both, to compare approaches. If you ask someone
> > who has only used one, all you get to know is that both can be made to
> > work well enough. Which of course may be enough for your needs. :-)
> > -+ Tatu +-
> > On Sun, Aug 12, 2012 at 10:32 AM, Harsh J <[EMAIL PROTECTED]> wrote:
> >> Moving this to the user@avro lists. Please use the right lists for the
> >> best answers and the right people.
> >> I'd pick Avro out of the two - it is very well designed for typed data
> >> and has a very good implementation of the serializer/deserializer,
> >> aside from the schema advantages. FWIW, Avro has a tojson CLI tool to
> >> dump Avro binary format out as JSON structures, which would be of help
> >> if you seek readability and/or integration with apps/systems that
> >> already depend on JSON.
> >> On Sun, Aug 12, 2012 at 10:41 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> >>> We get data in JSON format. I was initially thinking of simply storing it
> >>> in HDFS for processing. I see there is Avro, which does a similar thing
> >>> but most likely stores it in a more optimized format. I wanted to get
> >>> users' opinions on which one is better.
> >> --
> >> Harsh J
*Note that I'm no longer using my Yahoo! email address. Please email me at
[EMAIL PROTECTED] going forward.*