Pig >> mail # user >> how to load custom Writable class from sequence file?


Re: how to load custom Writable class from sequence file?
I think my custom type has toString(); well, at least as a Writable it can
be written to bytes. So supposedly, if I force it to bytes or a string, Pig
should be able to cast it, like

load ... AS (k:chararray, v:chararray);

but this actually fails.
On Mon, Sep 16, 2013 at 6:22 PM, Pradeep Gollakota <[EMAIL PROTECTED]> wrote:

> The problem is that pig only speaks its data types. So you need to tell it
> how to translate from your custom writable to a pig datatype.
>
> Apparently elephant-bird has some support for doing this type of thing...
> take a look at this SO post
>
> http://stackoverflow.com/questions/16540651/apache-pig-can-we-convert-a-custom-writable-object-to-pig-format
>
>
> On Mon, Sep 16, 2013 at 5:37 PM, Yang <[EMAIL PROTECTED]> wrote:
>
> > I tried to do a quick and dirty inspection of some of our data feeds,
> > which are encoded in gzipped SequenceFile.
> >
> > basically I did
> >
> > a = load 'myfile' using ......SequenceFileLoader() AS ( mykey, myvalue);
> >
> > but it gave me some error:
> > 2013-09-16 17:34:28,915 [Thread-5] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new decompressor
> > 2013-09-16 17:34:28,915 [Thread-5] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new decompressor
> > 2013-09-16 17:34:28,915 [Thread-5] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new decompressor
> > 2013-09-16 17:34:28,961 [Thread-5] WARN org.apache.pig.piggybank.storage.SequenceFileLoader - Unable to translate key class com.mycompany.model.VisitKey to a Pig datatype
> > 2013-09-16 17:34:28,962 [Thread-5] WARN org.apache.hadoop.mapred.FileOutputCommitter - Output path is null in cleanup
> > 2013-09-16 17:34:28,963 [Thread-5] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
> > org.apache.pig.backend.BackendException: ERROR 0: Unable to translate class com.mycompany.model.VisitKey to a Pig datatype
> >     at org.apache.pig.piggybank.storage.SequenceFileLoader.setKeyType(SequenceFileLoader.java:78)
> >     at org.apache.pig.piggybank.storage.SequenceFileLoader.getNext(SequenceFileLoader.java:133)
> >
> >
> > in the pig file, I have already REGISTERED the jar that contains the
> > class com.mycompany.model.VisitKey
> >
> >
> > if Pig doesn't work, the only other approach is probably to use some of
> > the newer "pseudo-scripting" languages like Cascalog or Scala
> > thanks
> > Yang
> >
>
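
For anyone finding this thread later, here is a rough sketch of the elephant-bird route suggested above. It has not been run against the data in this thread. It assumes the elephant-bird Pig jar is on hand (the jar paths are placeholders), that the value class is Text (the thread does not say), and that you have written your own converter for the key. The class com.mycompany.pig.VisitKeyConverter is hypothetical: it stands for something you would implement yourself against elephant-bird's WritableConverter interface to turn a VisitKey into a Pig chararray.

```pig
-- Placeholder jar paths; adjust to your environment.
REGISTER /path/to/elephant-bird-pig.jar;
REGISTER /path/to/mycompany-model.jar;

-- First argument configures the key converter, second the value converter.
-- VisitKeyConverter is a hypothetical class you would need to write;
-- TextConverter ships with elephant-bird and assumes the values are Text.
a = LOAD 'myfile' USING com.twitter.elephantbird.pig.load.SequenceFileLoader(
        '-c com.mycompany.pig.VisitKeyConverter',
        '-c com.twitter.elephantbird.pig.util.TextConverter'
    ) AS (mykey:chararray, myvalue:chararray);
```

This likely also explains why adding AS (k:chararray, v:chararray) to the piggybank loader does not help: the stack trace shows the exception is thrown inside the loader's getNext(), while it is still reading the record, so the declared schema never gets a chance to cast anything.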