Have you thought about HBase?
If you're using Hive or Pig, I would suggest taking these files and putting the JSON records into a sequence file,
or a set of sequence files. (Then look at HBase to help index them.) 200KB per file is small, far below an HDFS block.
That would be the same for either Pig or Hive.
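
To make the packing idea concrete, here is a rough sketch in plain Python. It is not Hadoop code and it does not write the real SequenceFile format; it only mimics the key/value layout, using the original filename as the key and the JSON record as the value, one pair per line. The names `pack_json_files` and `read_packed` are made up for illustration.

```python
import json
import os


def pack_json_files(paths, out_path):
    """Pack many small JSON files into one file of 'key<TAB>json' lines.

    Stand-in for a sequence file: key = original filename, value = record.
    Assumes filenames contain no tab characters.
    """
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for path in paths:
            with open(path, encoding="utf-8") as f:
                record = json.load(f)
            key = os.path.basename(path)
            # json.dumps escapes newlines and tabs inside strings,
            # so each record stays on a single line.
            value = json.dumps(record, separators=(",", ":"))
            out.write(key + "\t" + value + "\n")
            count += 1
    return count


def read_packed(path):
    """Yield (key, record) pairs back out of the packed file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            key, value = line.rstrip("\n").split("\t", 1)
            yield key, json.loads(value)
```

The point is only that a job then reads a few large files instead of thousands of tiny ones, while each record can still be traced back to its source file through the key.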
In terms of SerDes, I've worked with Pig and ElephantBird; it's pretty nice. And yes, you get each record as a row; however, you can always flatten them as needed.
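
As a minimal sketch of what flattening means here, written in plain Python rather than Pig: a record that carries a nested list becomes one row per list element, with the other fields copied onto each row. The function name `flatten` and the sample field names are hypothetical, and this only approximates what Pig's FLATTEN does on a bag.

```python
def flatten(record, bag_field):
    """Yield one row per element of record[bag_field].

    The remaining top-level fields are copied onto every row.
    An empty or missing list yields no rows.
    """
    rest = {k: v for k, v in record.items() if k != bag_field}
    for item in record.get(bag_field, []):
        row = dict(rest)
        if isinstance(item, dict):
            # Nested object: merge its fields into the row.
            row.update(item)
        else:
            # Scalar element: keep it under the original field name.
            row[bag_field] = item
        yield row


# Hypothetical record: one JSON document with a nested list of events.
record = {"user": "alice", "events": [{"type": "click"}, {"type": "view"}]}
rows = list(flatten(record, "events"))
```

Here `rows` holds two rows, one per event, each carrying the `user` field.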
I haven't worked with the latest SerDe, but maybe Dean Wampler or Edward Capriolo could give you a better answer.
Going from memory, I don't know of a good Hive SerDe that would write JSON; the ones I recall only read it.
IMHO Pig/ElephantBird is the best so far, but then again my information may be dated and I may be biased.
I think you're on the right track or at least train of thought.
On Jun 12, 2013, at 7:57 PM, Tecno Brain <[EMAIL PROTECTED]> wrote: