Pig >> mail # user >> how to load custom Writable class from sequence file?


how to load custom Writable class from sequence file?
I tried to do a quick-and-dirty inspection of some of our data feeds, which
are stored as gzip-compressed SequenceFiles.

Basically, I did:

a = load 'myfile' using ......SequenceFileLoader() AS ( mykey, myvalue);

but it gave me this error:
2013-09-16 17:34:28,915 [Thread-5] INFO  org.apache.hadoop.io.compress.CodecPool - Got brand-new decompressor
2013-09-16 17:34:28,915 [Thread-5] INFO  org.apache.hadoop.io.compress.CodecPool - Got brand-new decompressor
2013-09-16 17:34:28,915 [Thread-5] INFO  org.apache.hadoop.io.compress.CodecPool - Got brand-new decompressor
2013-09-16 17:34:28,961 [Thread-5] WARN  org.apache.pig.piggybank.storage.SequenceFileLoader - Unable to translate key class com.mycompany.model.VisitKey to a Pig datatype
2013-09-16 17:34:28,962 [Thread-5] WARN  org.apache.hadoop.mapred.FileOutputCommitter - Output path is null in cleanup
2013-09-16 17:34:28,963 [Thread-5] WARN  org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
org.apache.pig.backend.BackendException: ERROR 0: Unable to translate class com.mycompany.model.VisitKey to a Pig datatype
    at org.apache.pig.piggybank.storage.SequenceFileLoader.setKeyType(SequenceFileLoader.java:78)
    at org.apache.pig.piggybank.storage.SequenceFileLoader.getNext(SequenceFileLoader.java:133)

In the Pig script, I have already REGISTERed the jar that contains the class
com.mycompany.model.VisitKey.
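
A minimal sketch of why registering the jar is probably not enough. It assumes the loader translates key and value classes through a fixed lookup of standard Hadoop Writable types, which is what the "Unable to translate class ... to a Pig datatype" message suggests. The class `WritableTypeLookup`, the method `toPigType`, and the exact contents of the table are hypothetical and only illustrate the idea; this is not the piggybank source.

```java
import java.util.Map;

public class WritableTypeLookup {
    // Hypothetical table (an assumption, not copied from piggybank):
    // standard Hadoop Writable class name -> Pig type name.
    static final Map<String, String> KNOWN = Map.of(
        "org.apache.hadoop.io.Text", "chararray",
        "org.apache.hadoop.io.IntWritable", "int",
        "org.apache.hadoop.io.LongWritable", "long",
        "org.apache.hadoop.io.FloatWritable", "float",
        "org.apache.hadoop.io.DoubleWritable", "double");

    // Returns the Pig type for a Writable class name, or fails the same way
    // the log above does when the class is not in the table.
    static String toPigType(String writableClassName) {
        String pigType = KNOWN.get(writableClassName);
        if (pigType == null) {
            // The class can be on the classpath and still have no mapping.
            throw new IllegalArgumentException(
                "Unable to translate class " + writableClassName + " to a Pig datatype");
        }
        return pigType;
    }

    public static void main(String[] args) {
        System.out.println(toPigType("org.apache.hadoop.io.Text"));
        try {
            toPigType("com.mycompany.model.VisitKey");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If that assumption holds, REGISTER only makes VisitKey loadable; something still has to convert it into Pig fields, for example a custom LoadFunc that reads the SequenceFile and builds a tuple from the key's fields.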

If Pig can't handle this, the only other approach is probably to use one of the
newer "pseudo-scripting" languages like Cascalog or Scala.

Thanks,
Yang