Avro, mail # user - issue with DataFileReader


issue with DataFileReader
ey-chih chow 2012-12-22, 01:33

Hi,
We have a record format defined in an Avro IDL (.avdl) file.  One of the fields in the avdl is of type union {map<map<bytes>>, null}.  Avro files with this schema are used as the input and the output, respectively, of our two map/reduce jobs, which are based on the Avro Java API.  We process the records in these jobs as Avro generic records, in which string values are actually Utf8 objects.  We have never encountered any issue with this approach.  Recently, however, we tried to read the Avro file with Pig's AvroStorage() and, unlike in the Avro map/reduce jobs, the value of the field with the type above appears to be incorrect.  AvroStorage() uses the Avro class DataFileReader to read the data.  Does anybody know how DataFileReader and the Avro map/reduce API with generic records differ in handling Avro data?  Is this a bug in the class DataFileReader?  Thanks.
Ey-Chih Chow
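
One thing that may be worth ruling out, as a guess rather than a diagnosis: with the generic Java API, map keys are decoded as Utf8 (a CharSequence, not a String) and bytes values as ByteBuffer, so a consumer that looks keys up by String or expects byte[] can see a value that looks wrong. The post does not confirm that this is the cause. The sketch below uses only the JDK; the class name NestedMapNormalizer and its methods are hypothetical, and StringBuilder stands in for Utf8 so the example runs without Avro on the classpath.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Avro or Pig: copies a decoded
// map<map<bytes>> value into plain String keys and byte[] values.
final class NestedMapNormalizer {

    // Copies the readable bytes without moving the caller's buffer position.
    static byte[] toBytes(ByteBuffer buf) {
        ByteBuffer dup = buf.duplicate();
        byte[] out = new byte[dup.remaining()];
        dup.get(out);
        return out;
    }

    static Map<String, Map<String, byte[]>> normalize(
            Map<? extends CharSequence, ? extends Map<? extends CharSequence, ByteBuffer>> raw) {
        if (raw == null) {
            return null; // the null branch of union {map<map<bytes>>, null}
        }
        Map<String, Map<String, byte[]>> result = new LinkedHashMap<>();
        for (Map.Entry<? extends CharSequence, ? extends Map<? extends CharSequence, ByteBuffer>> outer
                : raw.entrySet()) {
            Map<String, byte[]> inner = new LinkedHashMap<>();
            for (Map.Entry<? extends CharSequence, ByteBuffer> e : outer.getValue().entrySet()) {
                inner.put(e.getKey().toString(), toBytes(e.getValue()));
            }
            result.put(outer.getKey().toString(), inner);
        }
        return result;
    }

    public static void main(String[] args) {
        // StringBuilder is an assumption-free stand-in for a non-String CharSequence key.
        Map<CharSequence, ByteBuffer> inner = new HashMap<>();
        inner.put(new StringBuilder("k"), ByteBuffer.wrap("v".getBytes(StandardCharsets.UTF_8)));
        Map<CharSequence, Map<CharSequence, ByteBuffer>> raw = new HashMap<>();
        raw.put(new StringBuilder("outer"), inner);

        // A String lookup misses, because the stored key is not a String.
        System.out.println(raw.get("outer") == null);

        Map<String, Map<String, byte[]>> fixed = normalize(raw);
        System.out.println(new String(fixed.get("outer").get("k"), StandardCharsets.UTF_8));
    }
}
```

With a real GenericRecord, the field value comes back as Object, so it would need an unchecked cast to a map type before being passed to normalize. If the normalized value matches what the map/reduce jobs see, the difference is more likely in how the consumer interprets the decoded types than in DataFileReader itself.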