StoreFunc with Sequence file
Hi Pig users,
I implemented a custom StoreFunc to write some data in a binary format to a
Sequence File.

    private RecordWriter<NullWritable, BytesWritable> writer;
    private BytesWritable bytes;
    private DataOutputBuffer dob;

    @SuppressWarnings("rawtypes")
    @Override
    public OutputFormat getOutputFormat() throws IOException {
        return new SequenceFileOutputFormat<NullWritable, BytesWritable>();
    }

    @SuppressWarnings({ "rawtypes", "unchecked" })
    @Override
    public void prepareToWrite(RecordWriter writer) throws IOException {
        this.writer = writer;
        this.bytes = new BytesWritable();
        this.dob = new DataOutputBuffer();
    }

    @Override
    public void putNext(Tuple tuple) throws IOException {
        dob.reset();
        WritableUtils.writeCompressedString(dob, (String) tuple.get(0));
        DataBag childTracesBag = (DataBag) tuple.get(1);
        WritableUtils.writeVLong(dob, childTracesBag.size());
        for (Tuple t : childTracesBag) {
            WritableUtils.writeVInt(dob, (Integer) t.get(0));
            dob.writeLong((Long) t.get(1));
        }
        try {
            bytes.set(dob.getData(), 0, dob.getLength());
            writer.write(NullWritable.get(), bytes);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
But I get this exception:
ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Unable to
recreate exception from backed error: java.io.IOException:
java.io.IOException: wrong key class: org.apache.hadoop.io.NullWritable is
not class org.apache.pig.impl.io.NullableText

And if I use a NullableText instead of a NullWritable, I get this other
exception:
ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Unable to
recreate exception from backed error: java.io.IOException:
java.io.IOException: wrong value class: org.apache.hadoop.io.BytesWritable
is not class org.apache.pig.impl.io.NullableTuple

There must be something I am doing wrong in telling Pig the types of the
sequence file.

It must be a stupid problem but I don't see it.

Does anybody have a clue?
Thanks,
--
Gianmarco
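
Both errors are consistent with a writer that was configured with one pair of key/value classes (here NullableText and NullableTuple) and then handed objects of other classes. Changing only the key moves the failure from the key check to the value check, which matches the second exception. The sketch below imitates that check in plain Java so it runs without Hadoop on the classpath. ToyWriter, tryAppend and the four empty stand-in classes are hypothetical names for illustration, not the Hadoop or Pig types. If this reading is right, one thing to try (an assumption, not confirmed in this message) is to call job.setOutputKeyClass(NullWritable.class) and job.setOutputValueClass(BytesWritable.class) in the StoreFunc's setStoreLocation(String, Job).

```java
import java.io.IOException;

public class KeyClassCheckSketch {

    // Hypothetical stand-ins for the real Hadoop/Pig types; they exist only so
    // that this sketch compiles without Hadoop on the classpath.
    static final class NullWritable {}
    static final class BytesWritable {}
    static final class NullableText {}
    static final class NullableTuple {}

    // Toy writer that is told its key/value classes up front, the way a
    // sequence file writer is configured from the job's output classes.
    static final class ToyWriter {
        private final Class<?> keyClass;
        private final Class<?> valueClass;
        private int records = 0;

        ToyWriter(Class<?> keyClass, Class<?> valueClass) {
            this.keyClass = keyClass;
            this.valueClass = valueClass;
        }

        void append(Object key, Object value) throws IOException {
            // Exact class comparison; the key is checked before the value.
            if (key.getClass() != keyClass) {
                throw new IOException("wrong key class: "
                        + key.getClass().getName() + " is not " + keyClass);
            }
            if (value.getClass() != valueClass) {
                throw new IOException("wrong value class: "
                        + value.getClass().getName() + " is not " + valueClass);
            }
            records++;
        }

        int recordCount() {
            return records;
        }
    }

    // Returns the error message from one append, or null if it succeeded.
    static String tryAppend(ToyWriter w, Object key, Object value) {
        try {
            w.append(key, value);
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Writer configured with the classes named in the error messages.
        ToyWriter pigDefault = new ToyWriter(NullableText.class, NullableTuple.class);
        System.out.println(tryAppend(pigDefault, new NullWritable(), new BytesWritable()));
        System.out.println(tryAppend(pigDefault, new NullableText(), new BytesWritable()));

        // Writer configured with the classes that putNext actually emits.
        ToyWriter matching = new ToyWriter(NullWritable.class, BytesWritable.class);
        System.out.println(tryAppend(matching, new NullWritable(), new BytesWritable()));
    }
}
```

The first two calls fail on the key and then on the value, in the same order as the two exceptions above; the third succeeds because the configured classes match what is written.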
Ashutosh Chauhan 2011-10-28, 17:15
Gianmarco De Francisci Mo... 2011-10-31, 11:28
Ashutosh Chauhan 2011-11-01, 00:55
Gianmarco De Francisci Mo... 2011-11-03, 17:24