Avro >> mail # user >> issue with DataFileReader


issue with DataFileReader

Hi,
We have a record format defined in Avro IDL (avdl). One of the fields in the avdl has the type union {map<map<bytes>>, null}. The Avro file with this schema is used as the input and the output, respectively, of our two map/reduce jobs, which are based on the Avro Java API. The map/reduce jobs process the records of the file as Avro generic records, where values of type string are actually Utf8 objects. We have never encountered any issue with this approach. Recently, however, we tried to use Pig's AvroStorage() to read the Avro file and, unlike in the Avro map/reduce jobs, the value of the field with the above type definition appears to be incorrect. AvroStorage() uses the Avro class DataFileReader to process data. Does anybody know the difference in how DataFileReader and the Avro Map/Reduce API with generic records handle Avro data? Is this a bug in the class DataFileReader? Thanks.
Ey-Chih Chow            
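
One way to narrow this down is to read the file directly with DataFileReader and a GenericDatumReader, outside both Pig and map/reduce, and print what the field actually contains. The following is only a minimal sketch of that check, not the code AvroStorage() runs: the record name "Example", the field name "payload", the sample keys, and the class name are all made up for illustration; only the field type union {map<map<bytes>>, null} comes from the message above. It writes one record to a temporary file so that it is self-contained, and it assumes the Avro Java library is on the classpath.

```java
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class UnionMapRoundTrip {
    // Hypothetical record and field names; only the field type mirrors the post.
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Example\",\"fields\":["
      + "{\"name\":\"payload\",\"type\":[{\"type\":\"map\",\"values\":"
      + "{\"type\":\"map\",\"values\":\"bytes\"}},\"null\"]}]}");

    // Writes a single record whose field holds a map of maps of bytes.
    static void write(File file) throws IOException {
        Map<String, ByteBuffer> inner = new HashMap<>();
        inner.put("k", ByteBuffer.wrap(new byte[] {1, 2, 3}));
        Map<String, Map<String, ByteBuffer>> outer = new HashMap<>();
        outer.put("outer", inner);

        GenericRecord record = new GenericData.Record(SCHEMA);
        record.put("payload", outer);

        try (DataFileWriter<GenericRecord> writer =
                 new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
            writer.create(SCHEMA, file);
            writer.append(record);
        }
    }

    // Reads the file back with DataFileReader, prints each inner entry together
    // with the runtime class of its key, and returns the number of inner entries.
    static int read(File file) throws IOException {
        int innerEntries = 0;
        try (DataFileReader<GenericRecord> reader =
                 new DataFileReader<>(file, new GenericDatumReader<GenericRecord>())) {
            for (GenericRecord record : reader) {
                Object value = record.get("payload");
                if (value == null) {
                    continue; // the null branch of the union
                }
                Map<?, ?> outer = (Map<?, ?>) value;
                for (Map.Entry<?, ?> outerEntry : outer.entrySet()) {
                    Map<?, ?> inner = (Map<?, ?>) outerEntry.getValue();
                    for (Map.Entry<?, ?> innerEntry : inner.entrySet()) {
                        ByteBuffer bytes = (ByteBuffer) innerEntry.getValue();
                        System.out.println(outerEntry.getKey() + "/" + innerEntry.getKey()
                            + " (key class " + innerEntry.getKey().getClass().getName()
                            + "): " + bytes.remaining() + " bytes");
                        innerEntries++;
                    }
                }
            }
        }
        return innerEntries;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("union-map", ".avro");
        file.deleteOnExit();
        write(file);
        read(file);
    }
}
```

Pointing read() at the real file, with the real field name, would show whether DataFileReader itself returns the expected nested maps. If it does, the difference is more likely to lie in how the values are converted afterwards than in DataFileReader.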