Hi..

Have you thought about HBase?

I would suggest that if you're using Hive or Pig, you look at taking these files and putting the JSON records into a sequence file, or a set of sequence files. (Then look at HBase to help index them.)
200KB per file is small by Hadoop standards.
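
To make the packing idea concrete, here is a rough sketch in Python. It is not the Hadoop SequenceFile format or API, just a stand-in with the same shape: many small JSON documents appended to one container as (key, value) records, with the original file name as the key. The names pack_records and read_records, the length-prefixed layout, and the sample files are all made up for illustration.

```python
import io
import json
import struct

# Assumption: a simple length-prefixed layout invented for this sketch.
# A real Hadoop SequenceFile has its own header, sync markers and codecs.
HEADER = ">II"  # key length, value length (big-endian unsigned ints)

def pack_records(records, out):
    """Append (key, JSON document) pairs to one binary stream."""
    for key, doc in records:
        k = key.encode("utf-8")
        v = json.dumps(doc).encode("utf-8")
        out.write(struct.pack(HEADER, len(k), len(v)))
        out.write(k)
        out.write(v)

def read_records(src):
    """Yield (key, parsed JSON document) pairs back from the stream."""
    header_size = struct.calcsize(HEADER)
    while True:
        header = src.read(header_size)
        if len(header) < header_size:
            break  # end of stream
        klen, vlen = struct.unpack(HEADER, header)
        key = src.read(klen).decode("utf-8")
        doc = json.loads(src.read(vlen).decode("utf-8"))
        yield key, doc

# Hypothetical small input files, keyed by file name.
small_files = [
    ("a.json", {"id": 1, "tags": ["x", "y"]}),
    ("b.json", {"id": 2, "tags": []}),
]

buf = io.BytesIO()  # stands in for the single large output file
pack_records(small_files, buf)
buf.seek(0)
restored = list(read_records(buf))
```

The point is only that the job then reads one large file of keyed records instead of opening thousands of tiny ones.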

That advice applies whether you use Pig or Hive.

In terms of SerDes, I've worked with Pig and ElephantBird, and it's pretty nice. And yes, you get each record as a row, but you can always flatten them as needed.
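
By "flatten" I mean roughly the following, shown here as a plain-Python sketch rather than Pig or ElephantBird code. The flatten helper and the record fields are made up for illustration; the idea is that a row holding a nested list becomes one row per list element, with the other fields repeated.

```python
def flatten(record, bag_field):
    """Yield one flat row per element of record[bag_field].

    Illustrative only: in this sketch a record whose list is empty
    (or missing) produces no rows at all.
    """
    rest = {k: v for k, v in record.items() if k != bag_field}
    for item in record.get(bag_field, []):
        row = dict(rest)       # copy the non-nested fields
        row[bag_field] = item  # replace the list with a single element
        yield row

# Hypothetical JSON record, as it would arrive in a single row.
record = {"user": "mike", "events": ["login", "search", "logout"]}
rows = list(flatten(record, "events"))
```

Here the single input row turns into three rows, one per event, each still carrying the user field.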

Hive?
I haven't worked with the latest SerDe, but maybe Dean Wampler or Edward Capriolo could give you a better answer.
Going from memory, I don't know of a good Hive SerDe that writes JSON; the ones I remember only read it.

IMHO Pig/ElephantBird is the best so far, but then again my information may be dated and I may be biased.

I think you're on the right track or at least train of thought.

HTH

-Mike
On Jun 12, 2013, at 7:57 PM, Tecno Brain <[EMAIL PROTECTED]> wrote:
