Avro >> mail # user >> Problem while Converting from JSON=>Avro=>JSON


Re: Problem while Converting from JSON=>Avro=>JSON
To generate a file with a subset of fields you can specify a 'reader'
schema that contains only the desired fields.  For example, if you
have a schema like:

{"type":"record","name":"Event","fields":[
  {"name":"id","type":"int"},
  {"name":"url","type":"string"},
  {"name":"props","type":{"type":"array","items":{"type":"record","name":"Property","fields":[
      {"name":"key","type":"int"},
      {"name":"value","type":"string"}
  ]}}}
]}

and you only want the ids and property values, then you can specify
the following as the reader schema when you create your GenericDatumReader:

{"type":"record","name":"Event","fields":[
  {"name":"id","type":"int"},
  {"name":"props","type":{"type":"array","items":{"type":"record","name":"Property","fields":[
      {"name":"value","type":"string"}
  ]}}}
]}
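
To make the effect of the reader schema concrete, here is a rough sketch in plain Python of the projection it asks for. It does not use the Avro library: `project` is a hypothetical helper written only for this illustration, and it covers just records, arrays and primitives (no defaults, unions, aliases or type promotion, all of which real schema resolution handles). The sample event is made up.

```python
import json

# Writer schema: the full Event record, with closing brackets balanced.
WRITER_SCHEMA = json.loads("""
{"type":"record","name":"Event","fields":[
  {"name":"id","type":"int"},
  {"name":"url","type":"string"},
  {"name":"props","type":{"type":"array","items":{"type":"record","name":"Property","fields":[
      {"name":"key","type":"int"},
      {"name":"value","type":"string"}
  ]}}}
]}
""")

# Reader schema: same record names, but only the fields we want back.
READER_SCHEMA = json.loads("""
{"type":"record","name":"Event","fields":[
  {"name":"id","type":"int"},
  {"name":"props","type":{"type":"array","items":{"type":"record","name":"Property","fields":[
      {"name":"value","type":"string"}
  ]}}}
]}
""")


def project(datum, reader_schema):
    """Hypothetical helper: keep only the parts of datum named by reader_schema."""
    if isinstance(reader_schema, dict):
        kind = reader_schema["type"]
        if kind == "record":
            # Fields are matched by name; fields absent from the reader schema are dropped.
            return {
                field["name"]: project(datum[field["name"]], field["type"])
                for field in reader_schema["fields"]
            }
        if kind == "array":
            return [project(item, reader_schema["items"]) for item in datum]
    # Primitive types such as "int" and "string" pass through unchanged.
    return datum


# Made-up sample record that conforms to the writer schema.
event = {
    "id": 1,
    "url": "http://example.com/",
    "props": [{"key": 10, "value": "a"}, {"key": 20, "value": "b"}],
}

if __name__ == "__main__":
    print(json.dumps(project(event, READER_SCHEMA)))
    # prints {"id": 1, "props": [{"value": "a"}, {"value": "b"}]}
```

The url field and each property's key are gone, which is the result you should see per record when the second schema is given to the GenericDatumReader as the reader schema.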

Perhaps we should add a --schema parameter to the tojson command-line
tool that does this?

Doug

On Fri, Mar 14, 2014 at 1:30 AM, Saravanan Nagarajan
<[EMAIL PROTECTED]> wrote:

 