Pig, mail # user - pig native map-reduce and avro input format


Jonas Hartwig 2013-01-31, 08:53
Re: pig native map-reduce and avro input format
Cheolsoo Park 2013-01-31, 18:40
Hi Jonas,

I had to do a similar job before. What I did was store a relation as Avro
files and bulkload them into HBase. You can see my example here:
https://github.com/piaozhexiu/hbase-bulkload-avro

It's hard to tell what went wrong without seeing your code. But your Pig
command seems correct to me.
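
One guess, and only a guess: LongWritable keys are what Hadoop's default
TextInputFormat emits, so the error would be consistent with the job not
reading its input through AvroKeyInputFormat when Pig launches it. Because
the mapper's key type is erased at compile time, a mismatch like that only
shows up at runtime as a ClassCastException. Below is a minimal sketch of
that mechanism in plain Java. LongKey, AvroLikeKey, SketchMapper and the
other names are hypothetical stand-ins, not Hadoop or Avro classes, and the
sketch does not reproduce your actual job.

```java
public class KeyTypeMismatchSketch {
    // Hypothetical stand-in for a byte-offset key such as Hadoop's LongWritable.
    static final class LongKey {
        final long offset;
        LongKey(long offset) { this.offset = offset; }
    }

    // Hypothetical stand-in for an Avro-wrapped record key.
    static final class AvroLikeKey {
        final String datum;
        AvroLikeKey(String datum) { this.datum = datum; }
    }

    // Hypothetical, heavily simplified mapper contract.
    interface SketchMapper<K> {
        String map(K key);
    }

    // A mapper that declares it only accepts Avro-like keys.
    static final class AvroMapper implements SketchMapper<AvroLikeKey> {
        @Override
        public String map(AvroLikeKey key) {
            return key.datum;
        }
    }

    // The "framework" side: it only knows it has some mapper and some key.
    // Generics are erased, so the key type is checked when map() is invoked.
    @SuppressWarnings("unchecked")
    static String runMapper(SketchMapper<?> mapper, Object key) {
        return ((SketchMapper<Object>) mapper).map(key);
    }

    // Returns true if feeding this key to this mapper fails with a cast error.
    static boolean failsWithClassCast(SketchMapper<?> mapper, Object key) {
        try {
            runMapper(mapper, key);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        SketchMapper<?> mapper = new AvroMapper();
        // Matching key type: the mapper runs normally.
        System.out.println(runMapper(mapper, new AvroLikeKey("record-1")));
        // Mismatched key type: the cast fails inside the call, at runtime.
        System.out.println(failsWithClassCast(mapper, new LongKey(0L)));
    }
}
```

If that is what is happening, it may be worth checking which input format
the job actually ends up with when it is started from the MAPREDUCE
statement.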

Thanks,
Cheolsoo
On Thu, Jan 31, 2013 at 12:53 AM, Jonas Hartwig <[EMAIL PROTECTED]> wrote:

> Hi everyone,
>
> I need some help to run my map-reduce job with pig.
> I wrote a map-reduce job that takes an avro file as input:
>               job.setJarByClass(Main.class);
>               job.setJobName("MapReduceJob");
>               job.setMapperClass(Mapper.class);
>               job.setReducerClass(Reducer.class);
>               job.setMapOutputKeyClass(IntWritable.class);
>               job.setMapOutputValueClass(DocumentRepresentation.class);
>               job.setOutputKeyClass(LongWritable.class);
>               job.setInputFormatClass(AvroKeyInputFormat.class);
> If I run this job using the hadoop jar command, everything works fine and
> the output of the map-reduce job is as expected.
> Now I need this map-reduce job to work inside a Pig script. To test the
> MR job I used this test script:
>
> ...register statements
>
> A = LOAD 'mr-input' USING
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> B = MAPREDUCE 'mr-job-0.0.1.jar' STORE A INTO 'mr-tmp' USING
> org.apache.pig.piggybank.storage.avro.AvroStorage('schema', '...')
>                 LOAD 'mr-result' AS (prefix: chararray, result: chararray)
>                 `com.mycompany.hadoop.Main mr-tmp mr-result ...more
> parameters`;
>
> All the mappers fail with the error: LongWritable cannot be cast to
> AvroMapper.
> The mapper definition looks like this:
>
> public class Mapper extends Mapper<AvroWrapper<Record>, NullWritable,
> IntWritable, DocumentRepresentation> {
>
> Any idea how to fix it?
>
> Jonas
>
Russell Jurney 2013-01-31, 19:05
Jonas Hartwig 2013-01-31, 19:25