You'll need to compress JSON yourself; Avro can compress itself. Avro
represents more types, so with JSON you'll need to serialize any types
beyond what JSON supports, either with annotations or by convention.
JSON is simpler.
Short answer: use JSON if its types are expressive enough for your
data and you don't mind compressing it yourself. Avro has more
types, carries its schema with the data, and compresses itself.
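To make the JSON side of that trade-off concrete, here is a minimal sketch in Python using only the standard library. The `$bytes` / `$datetime` tagging convention and the function names `encode_record` and `decode_record` are assumptions made up for illustration, not part of any standard; the point is only that both the type convention and the compression step are yours to define and maintain.

```python
import base64
import gzip
import json
from datetime import datetime, timezone


def encode_record(record: dict) -> bytes:
    """Serialize a record to JSON, tag non-JSON types by convention, then gzip it."""

    def default(value):
        # Hypothetical convention: wrap each unsupported type in a
        # one-key object whose key names the type.
        if isinstance(value, bytes):
            return {"$bytes": base64.b64encode(value).decode("ascii")}
        if isinstance(value, datetime):
            return {"$datetime": value.isoformat()}
        raise TypeError(f"no JSON convention for {type(value).__name__}")

    text = json.dumps(record, default=default)
    # JSON does not compress itself, so compression is a separate step.
    return gzip.compress(text.encode("utf-8"))


def decode_record(blob: bytes) -> dict:
    """Reverse encode_record: gunzip, parse JSON, restore the tagged types."""

    def hook(obj):
        # Only one-key objects can be type tags under this convention.
        if len(obj) == 1:
            if "$bytes" in obj:
                return base64.b64decode(obj["$bytes"])
            if "$datetime" in obj:
                return datetime.fromisoformat(obj["$datetime"])
        return obj

    text = gzip.decompress(blob).decode("utf-8")
    return json.loads(text, object_hook=hook)


if __name__ == "__main__":
    record = {
        "id": 1,
        "payload": b"\x00\x01\x02",
        "seen": datetime(2012, 8, 12, 15, 27, tzinfo=timezone.utc),
    }
    blob = encode_record(record)
    print(decode_record(blob) == record)
```

Every reader and writer of the data has to agree on that convention out of band, which is the work Avro's schema would otherwise do for you.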
On Aug 12, 2012, at 3:27 PM, Tatu Saloranta <[EMAIL PROTECTED]> wrote:
> I would put the question to a specific subset of users: those with actual
> experience in using both, to compare approaches. If you ask someone
> who has only used one, all you learn is that both can be made to
> work well enough. Which of course may be enough for your needs. :-)
> -+ Tatu +-
> On Sun, Aug 12, 2012 at 10:32 AM, Harsh J <[EMAIL PROTECTED]> wrote:
>> Moving this to the user@avro lists. Please use the right lists for the
>> best answers and the right people.
>> I'd pick Avro out of the two - it is very well designed for typed data
>> and has a very good implementation of the serializer/deserializer,
>> aside from the schema advantages. FWIW, Avro has a tojson CLI tool to
>> dump Avro binary format out as JSON structures, which would be of help
>> if you seek readability and/or integration with apps/systems that
>> already depend on JSON.
>> On Sun, Aug 12, 2012 at 10:41 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
>>> We get data in JSON format. I was initially thinking of simply storing JSON
>>> in HDFS for processing. I see there is Avro, which does a similar thing but
>>> most likely stores it in a more optimized format. I wanted to get users'
>>> opinions on which one is better.
>> Harsh J