Read SequenceFile
I have a sequence file containing JSON, but when I run this script it
displays only the keys, not the bytearray values. How can I get the values
dumped? The first line of the sequence file shows the value class as
BytesWritable.

DEFINE SequenceFileLoader org.apache.pig.piggybank.storage.SequenceFileLoader();
A = LOAD '/flume_vol/flume/2012/09/07/20/SDGL04R8gtby1/web.1347076423344.snappy'
    USING SequenceFileLoader AS (key:long, document:bytearray);
DUMP A;

(1347076528612,)
(1347076528808,)
(1347076529021,)

But I see the JSON in the file:

hadoop fs -cat /flume_vol/flume/2012/09/07/20/SDGL04R8gtby1/web.1347076423344.snappy | more
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable)org.apache.hadoop.io.compress.SnappyCodec

at^LcityY%Ýry.ªMore--
"exit   !ü$,"previous2
79,"i1ý4assasaá¡bY^LauthQaùTashedSSN":"XXDDFF!ffDD2products¥att¾]m
sku":"biz)A%a}]}fK^L4125í8)°þKþKþKþKþKþKþKþKþKþKþKþKþKþKþKþKþKþKþKýKÜí¹ÉhþKþKþKþKþKþKáK)¡^LþKþKîKé296þKþKþKþKþK

        ,"priorityCode):!       b.]
E"tt¾]I sku":"biz)A%a8H}]}

HeighA&2860,@Width":768,"encodA :"utf-8",%é
                                           Agenos
`!,"device)r,},"customVar%ê,tag":"Financ!ddresA
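
The first line of that output is the SequenceFile header: the SEQ magic, a version byte, then the key class, value class and codec names, each preceded by a one-byte length (the "!", the double quote and the ")" before each name are those length bytes, 33, 34 and 41). A minimal Python sketch of reading those fields from raw bytes follows. It assumes a version 5 or later header and class names shorter than 128 bytes. The function names and the hand-built sample header are hypothetical, not taken from the real file, and the version byte and block-compression flag in the sample are guesses.

```python
def read_class_name(buf, pos):
    """Read one length-prefixed UTF-8 string; return (string, next position)."""
    length = buf[pos]
    if length > 127:
        # Longer strings use a multi-byte length that this sketch does not decode.
        raise ValueError("multi-byte length prefix not handled in this sketch")
    start = pos + 1
    return buf[start:start + length].decode("utf-8"), start + length


def parse_sequencefile_header(buf):
    """Extract version, key/value class names and codec from a header."""
    if buf[:3] != b"SEQ":
        raise ValueError("not a SequenceFile: missing SEQ magic")
    version = buf[3]
    pos = 4
    key_class, pos = read_class_name(buf, pos)
    value_class, pos = read_class_name(buf, pos)
    # Two one-byte booleans follow (assumes header version >= 5).
    compressed = buf[pos] != 0
    block_compressed = buf[pos + 1] != 0
    pos += 2
    codec = None
    if compressed:
        codec, pos = read_class_name(buf, pos)
    return {
        "version": version,
        "key_class": key_class,
        "value_class": value_class,
        "compressed": compressed,
        "block_compressed": block_compressed,
        "codec": codec,
    }


def _prefixed(name):
    """Helper for the sample only: one length byte followed by the name."""
    raw = name.encode("utf-8")
    return bytes([len(raw)]) + raw


# Hand-built sample mirroring the class names visible in the dump above.
# Version 6 and block_compressed = 0 are assumptions, not read from the file.
sample = (
    b"SEQ" + bytes([6])
    + _prefixed("org.apache.hadoop.io.LongWritable")
    + _prefixed("org.apache.hadoop.io.BytesWritable")
    + bytes([1, 0])
    + _prefixed("org.apache.hadoop.io.compress.SnappyCodec")
)
header = parse_sequencefile_header(sample)
print(header["key_class"], header["value_class"], header["codec"])
```

Run against the first bytes of the real file, this should report the same three class names, confirming that the values are stored as Snappy-compressed BytesWritable records rather than plain text.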