Pig >> mail # user >> Read SequenceFile


Read SequenceFile
I have a sequence file containing JSON, but when I run this script it
displays only the keys. The bytearray values are not displayed. How can I
get them dumped? In the first line of the sequence file I see the value
class is BytesWritable.

DEFINE SequenceFileLoader org.apache.pig.piggybank.storage.SequenceFileLoader();
A = LOAD '/flume_vol/flume/2012/09/07/20/SDGL04R8gtby1/web.1347076423344.snappy'
    USING SequenceFileLoader AS (key:long, document:bytearray);
DUMP A;

(1347076528612,)

(1347076528808,)

(1347076529021,)

But I see the JSON in the file:

hadoop fs -cat /flume_vol/flume/2012/09/07/20/SDGL04R8gtby1/web.1347076423344.snappy | more
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable)org.apache.hadoop.io.compress.SnappyCodec

at^LcityY%Ýry.ªMore--
"exit   !ü$,"previous2
79,"i1ý4assasaá¡bY^LauthQaùTashedSSN":"XXDDFF!ffDD2products¥att¾]m
sku":"biz)A%a}]}fK^L4125í8)°þKþKþKþKþKþKþKþKþKþKþKþKþKþKþKþKþKþKþKýKÜí¹ÉhþKþKþKþKþKþKáK)¡^LþKþKîKé296þKþKþKþKþK

        ,"priorityCode):!       b.]
E"tt¾]I sku":"biz)A%a8H}]}

HeighA&2860,@Width":768,"encodA :"utf-8",%é
                                           Agenos
`!,"device)r,},"customVar%ê,tag":"Financ!ddresA
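
The first line of the dump above is the SequenceFile header, and it can be read programmatically rather than by eye. The following is a minimal Python sketch, not part of the original post, that parses the header fields in the order Hadoop writes them, assuming header version 5 or later. The function names and the hand-built sample bytes are hypothetical. The version byte and the two compression flags are not visible in the dump, so the values used for them here are assumptions.

```python
import io


def read_vint(stream):
    """Decode a Hadoop variable-length integer (the WritableUtils VInt format)."""
    first = stream.read(1)[0]
    if first >= 128:
        first -= 256  # interpret as a signed byte, as Java does
    if first >= -112:
        return first  # small values are stored in a single byte
    negative = first < -120
    extra = -(first + 120) if negative else -(first + 112)
    value = 0
    for b in stream.read(extra):
        value = (value << 8) | b
    return ~value if negative else value


def read_text(stream):
    """Read a length-prefixed UTF-8 string (the format of Text.writeString)."""
    length = read_vint(stream)
    return stream.read(length).decode("utf-8")


def read_header(stream):
    """Parse the leading SequenceFile header fields (assumes version >= 5)."""
    if stream.read(3) != b"SEQ":
        raise ValueError("not a SequenceFile")
    version = stream.read(1)[0]
    if version < 5:
        raise ValueError("this sketch only handles header version 5 or later")
    key_class = read_text(stream)
    value_class = read_text(stream)
    compressed = stream.read(1) != b"\x00"
    block_compressed = stream.read(1) != b"\x00"
    codec = read_text(stream) if compressed else None
    return {
        "version": version,
        "key_class": key_class,
        "value_class": value_class,
        "compressed": compressed,
        "block_compressed": block_compressed,
        "codec": codec,
    }


# Hand-built sample mirroring the visible header text. The characters
# ! " ) in the dump are the length bytes 0x21, 0x22 and 0x29.
# Assumed values: version 6, compressed = true, block_compressed = false.
sample = (
    b"SEQ\x06"
    b"\x21org.apache.hadoop.io.LongWritable"
    b"\x22org.apache.hadoop.io.BytesWritable"
    b"\x01\x00"
    b"\x29org.apache.hadoop.io.compress.SnappyCodec"
)
header = read_header(io.BytesIO(sample))
print(header["key_class"], header["value_class"], header["codec"])
```

The header confirms that the values are BytesWritable and Snappy-compressed, which is why the JSON is only partly readable with a plain cat. One possible reason for the empty value column, worth checking against the piggybank source, is that the loader has no mapping from BytesWritable to a Pig type. That is an assumption, not something the output above proves.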