Pig, mail # user - StoreFunc with Sequence file


Re: StoreFunc with Sequence file
Gianmarco De Francisci Mo... 2011-11-03, 17:24
Here is the Pig script (I hope the formatting survives).
I think I could reduce the script to a simple load/store and still see the
same problem, but I haven't had time to check (I would need to rewrite
the StoreFunc).
FYI, my StoreFunc tries to write a SequenceFile<NullWritable,
BytesWritable>:

    @Override
    public OutputFormat<NullWritable, BytesWritable> getOutputFormat()
            throws IOException {
        return new SequenceFileOutputFormat<NullWritable, BytesWritable>();
    }
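For SequenceFileOutputFormat to open its writer with the right types, the job's output key/value classes generally need to match what putNext appends. A hedged sketch of the companion setStoreLocation override (setStoreLocation is part of the StoreFunc API, but whether this is the fix for the trace below is an assumption, not something confirmed in this thread):

```java
// Sketch only: register the key/value classes the SequenceFile writer
// should expect. If these are left unset, the writer can be created with
// Pig's internal shuffle types (e.g. NullableText) instead, which would
// produce a "wrong key class" error like the one in the stack trace.
@Override
public void setStoreLocation(String location, Job job) throws IOException {
    job.setOutputKeyClass(NullWritable.class);
    job.setOutputValueClass(BytesWritable.class);
    FileOutputFormat.setOutputPath(job, new Path(location));
}
```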
rawtraces = LOAD '$log' AS (follower:chararray, action:int, time:long);
groupedtraces = GROUP rawtraces BY follower;
traces = FOREACH groupedtraces GENERATE group AS performer,
    rawtraces.(action, time) AS t;

rawsn = LOAD '$network' AS (parent:chararray, child:chararray);
groupedsn = GROUP rawsn BY parent;
sn = FOREACH groupedsn GENERATE group AS parent, rawsn.(child) AS children;

join1 = JOIN traces BY performer, sn BY parent;
cleanJ1 = FOREACH join1 GENERATE traces::performer AS parent,
    traces::t AS parentTraces, FLATTEN(sn::children) AS child;
groupedJ1 = GROUP cleanJ1 BY child;
intermediate = FOREACH groupedJ1 GENERATE group AS child,
    cleanJ1.(parent, parentTraces) AS legacy;

join2 = JOIN traces BY performer, intermediate BY child;
result = FOREACH join2 GENERATE traces::performer AS child,
    traces::t AS childTraces, intermediate::legacy AS legacy;

STORE result INTO '$output' USING mypackage.pig.BinStorage();

And here is the stack trace:

java.io.IOException: java.io.IOException: wrong key class:
org.apache.hadoop.io.NullWritable is not class
org.apache.pig.impl.io.NullableText
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:464)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:427)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:399)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:261)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
at org.apache.hadoop.mapred.Child$4.run(Child.java:261)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.mapred.Child.main(Child.java:255)
Caused by: java.io.IOException: wrong key class:
org.apache.hadoop.io.NullWritable is not class
org.apache.pig.impl.io.NullableText
at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:985)
at org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat$1.write(SequenceFileOutputFormat.java:74)
at mypackage.pig.BinStorage.putNext(BinStorage.java:75)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:587)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:462)
... 11 more
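As a side note on the value type: the BytesWritable appended by putNext carries raw bytes, so the (action, time) pairs could be packed with plain java.io streams before being wrapped. A self-contained, hypothetical sketch of that serialization step (TracePacker and its field layout are illustrations, not the actual mypackage.pig.BinStorage code):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch: pack one (action, time) pair into the raw byte
// payload a BytesWritable value could carry.
public class TracePacker {
    static byte[] pack(int action, long time) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(action);  // 4 bytes
        out.writeLong(time);   // 8 bytes
        out.flush();
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = pack(1, 1320340800000L);
        System.out.println(payload.length); // 12: one int + one long
    }
}
```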
Cheers,
--
Gianmarco

On Tue, Nov 1, 2011 at 01:55, Ashutosh Chauhan <[EMAIL PROTECTED]> wrote:

> Actually what I said was not entirely correct. Per Daniel, Pig's load/store
> func are designed to work with InputFormat/OutputFormat which works on
> <ComparableWritable,Writable> so what you are seeing is not expected. Can
> you paste the pig script you are using and the detailed stack trace. You