Russell Jurney 2012-08-19, 14:20
Re: Problem writing LoadFunc - why can't I use a sub-class of FileInputFormat as my InputFormat?
Russell Jurney 2012-08-19, 19:30
Figured this out - I was ping-ponging between the mapred and mapreduce APIs.

package org.apache.pig.piggybank.storage.arc;

import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;
import org.apache.nutch.tools.arc.ArcInputFormat;
import org.apache.nutch.tools.arc.ArcRecordReader;

import java.io.IOException;

public class PigArcInputFormat extends FileInputFormat<Text, BytesWritable> {

    public PigArcInputFormat() {
    }

    public ArcInputFormat getInputFormat() throws IOException {
        return new ArcInputFormat();
    }

    public RecordReader<Text, BytesWritable> getRecordReader(InputSplit split, JobConf config, Reporter reporter) throws IOException {
        return new ArcRecordReader(config, (FileSplit)split);
    }
}

On Sun, Aug 19, 2012 at 7:20 AM, Russell Jurney <[EMAIL PROTECTED]> wrote:

> I am writing a LoadFunc called ArcFileReader to load Common Crawl data in
> ArcFile format. There is already a ArcRecord, ArcRecordReader and
> ArcInputFormat for Hadoop.
>
> ArcInputFormat extends Hadoop's FileInputFormat, which implements Hadoop's
> InputFormat interface. Why then can't I specify ArcInputFormat as my
> InputFormat in my LoadFunc?
>
>     @Override
>     public InputFormat getInputFormat() throws IOException {
>         return new ArcInputFormat();
>     }
>
> Java complains - attempting to use incompatible return type. What gives?
>
> --
> Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com

--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
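
For anyone who hits the same "incompatible return type" error, here is a minimal sketch of why the compiler complains. It assumes the usual layout of the two Hadoop API families: org.apache.hadoop.mapred.InputFormat (old API) is an interface, org.apache.hadoop.mapreduce.InputFormat (new API) is an unrelated abstract class, and Pig's LoadFunc.getInputFormat() is declared to return the new-API type. To keep it runnable without Hadoop on the classpath, every type below is a hypothetical stand-in with simplified signatures, not the real Hadoop, Pig, or Nutch class.

```java
// Stand-in types only: they mimic the shape of the two Hadoop API families.
// None of these are the real Hadoop, Pig, or Nutch classes.
public class ApiMismatchDemo {

    // Plays the role of org.apache.hadoop.mapred.InputFormat (old API, an interface).
    interface OldApiInputFormat<K, V> {
        String getRecordReader(String split);
    }

    // Plays the role of org.apache.hadoop.mapreduce.InputFormat (new API, an abstract class).
    static abstract class NewApiInputFormat<K, V> {
        abstract String createRecordReader(String split);
    }

    // Plays the role of a format written against the old API.
    static class OldArcFormat implements OldApiInputFormat<String, byte[]> {
        public String getRecordReader(String split) {
            return "old reader for " + split;
        }
    }

    // Plays the role of a format written against the new API.
    static class NewArcFormat extends NewApiInputFormat<String, byte[]> {
        String createRecordReader(String split) {
            return "new reader for " + split;
        }
    }

    // Plays the role of LoadFunc.getInputFormat(): the declared return type
    // is the new-API class, so only new-API formats are accepted.
    static NewApiInputFormat<String, byte[]> getInputFormat() {
        // return new OldArcFormat();  // does not compile: incompatible types
        return new NewArcFormat();
    }

    // True only for objects in the new-API hierarchy.
    static boolean isNewApi(Object format) {
        return format instanceof NewApiInputFormat;
    }

    public static void main(String[] args) {
        System.out.println(isNewApi(new OldArcFormat()));  // false
        System.out.println(isNewApi(getInputFormat()));    // true
    }
}
```

The two hierarchies share a simple name but have no common supertype, so importing InputFormat from one package and the concrete format from the other is enough to trigger the error. Checking which package each import comes from is usually the quickest way to spot the mix-up.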