
Pig >> mail # user >> pig native map-reduce and avro input format

Hi everyone,

I need some help running my map-reduce job from Pig.
I wrote a map-reduce job that takes an Avro file as input. If I run this job using the Hadoop jar command, everything works fine and the output of the map-reduce job is as expected.
Now I need this map-reduce job to work inside a Pig script. To test the mr-job I used this test script:

...register statements

A = LOAD 'mr-input' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
B = MAPREDUCE 'mr-job-0.0.1.jar' STORE A INTO 'mr-tmp' USING org.apache.pig.piggybank.storage.avro.AvroStorage('schema', '...')
                LOAD 'mr-result' AS (prefix: chararray, result: chararray)
                `com.mycompany.hadoop.Main mr-tmp mr-result ...more parameters`;

All the mappers fail with the error: LongWritable cannot be cast to AvroMapper.
The mapper definition looks like this (the superclass is Hadoop's org.apache.hadoop.mapreduce.Mapper):

public class Mapper extends org.apache.hadoop.mapreduce.Mapper<AvroWrapper<Record>, NullWritable, IntWritable, DocumentRepresentation> {

Any idea how to fix it?
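For reference, one likely cause (an assumption on my part, not confirmed in this thread): if the jar's driver never configures an Avro input format, Hadoop falls back to TextInputFormat, whose LongWritable keys cannot be cast to the Avro wrapper type the mapper declares. A minimal driver sketch using Avro's mapreduce bindings (org.apache.avro.mapreduce); the class name Main, the paths, and the inline schema string are illustrative only:

```java
import org.apache.avro.Schema;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.avro.mapreduce.AvroKeyInputFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Main {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "avro-input-job");
        job.setJarByClass(Main.class);

        // Without an explicit input format, Hadoop defaults to
        // TextInputFormat, whose LongWritable keys trigger exactly the
        // ClassCastException seen in the failing mappers.
        job.setInputFormatClass(AvroKeyInputFormat.class);

        // Schema of the records written by AvroStorage (placeholder here).
        Schema schema = new Schema.Parser().parse("{\"type\":\"record\", ...}");
        AvroJob.setInputKeySchema(job, schema);

        FileInputFormat.addInputPath(job, new Path(args[0]));   // mr-tmp
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // mr-result

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Note that AvroKeyInputFormat delivers AvroKey keys (a subclass of AvroWrapper), so depending on the Avro version the mapper's key type may need to change from AvroWrapper&lt;Record&gt; to AvroKey&lt;Record&gt;.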