Re: Read SequenceFile
It looks like the SequenceFileLoader in piggybank doesn't support the
BytesWritable type. Is there a way to get this working in Pig?

On Fri, Sep 7, 2012 at 11:03 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:

> I have a sequence file containing JSON, but when I run this script it
> displays only the keys, not the bytearray values. How can I get the values
> dumped? The first line of the sequence file shows the value type as
> BytesWritable.
>
> DEFINE SequenceFileLoader org.apache.pig.piggybank.storage.SequenceFileLoader();
> A = LOAD '/flume_vol/flume/2012/09/07/20/SDGL04R8gtby1/web.1347076423344.snappy'
>     USING SequenceFileLoader AS (key:long, document:bytearray);
> DUMP A;
>
> (1347076528612,)
>
> (1347076528808,)
>
> (1347076529021,)
>
> But I see the JSON in the file:
>
> hadoop fs -cat /flume_vol/flume/2012/09/07/20/SDGL04R8gtby1/web.1347076423344.snappy | more
>
> SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable)org.apache.hadoop.io.compress.SnappyCodec
> Kþ
> at^LcityY%Ýry.ªMore--
> "exit   !ü$,"previous2
> 79,"i1ý4assasaá¡bY^LauthQaùTashedSSN":"XXDDFF!ffDD2products¥att¾]m
> sku":"biz)A%a}]}fK^L4125í8)°þKþKþKþKþKþKþKþKþKþKþKþKþKþKþKþKþKþKþKýKÜí¹ÉhþKþKþKþKþKþKáK)¡^LþKþKîKé296þKþKþKþKþK
>
>         ,"priorityCode):!       b.]
> E"tt¾]I sku":"biz)A%a8H}]}
>
> HeighA&2860,@Width":768,"encodA :"utf-8",%é
>                                            Agenos
> `!,"device)r,},"customVar%ê,tag":"Financ!ddresA
>
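
The first line of the dump above is the SequenceFile header, and the odd characters before each class name ("!", the double quote, ")") appear to be the one-byte length prefixes of those names (33, 34 and 41). As a rough sketch of how that header can be decoded to confirm the key and value types, here is a small Python parser. It assumes a version 5 or later header and class names shorter than 128 bytes. The sample bytes are built by hand to mimic the header shown in the post; the version byte (6) and the two compression flag bytes are assumptions, since they are not printable in the dump. The function names are made up for illustration.

```python
import io


def read_text(stream):
    """Read a Hadoop Text string: a length prefix followed by UTF-8 bytes.

    Sketch only: assumes the length is below 128, so the variable-length
    integer prefix occupies a single byte.
    """
    first = stream.read(1)
    if len(first) != 1:
        raise ValueError("unexpected end of header")
    length = first[0]
    if length > 127:
        raise ValueError("multi-byte length prefixes are not handled here")
    return stream.read(length).decode("utf-8")


def parse_sequencefile_header(stream):
    """Parse the leading fields of a SequenceFile header (version >= 5 assumed)."""
    if stream.read(3) != b"SEQ":
        raise ValueError("not a SequenceFile: missing SEQ magic")
    version = stream.read(1)[0]
    if version < 5:
        raise ValueError("this sketch only handles header version 5 or later")
    key_class = read_text(stream)
    value_class = read_text(stream)
    compressed = stream.read(1) != b"\x00"
    block_compressed = stream.read(1) != b"\x00"
    # The codec class name is only present when compression is enabled.
    codec = read_text(stream) if compressed else None
    return {
        "version": version,
        "key_class": key_class,
        "value_class": value_class,
        "compressed": compressed,
        "block_compressed": block_compressed,
        "codec": codec,
    }


def _encode_text(name):
    """Helper for building the hand-made sample header below."""
    raw = name.encode("utf-8")
    return bytes([len(raw)]) + raw


# Hand-built sample mimicking the header in the post; the version byte and
# the two flag bytes are assumed values, not taken from the real file.
sample = (
    b"SEQ\x06"
    + _encode_text("org.apache.hadoop.io.LongWritable")
    + _encode_text("org.apache.hadoop.io.BytesWritable")
    + b"\x01\x01"
    + _encode_text("org.apache.hadoop.io.compress.SnappyCodec")
)

header = parse_sequencefile_header(io.BytesIO(sample))
for field, value in header.items():
    print(field, "=", value)
```

To try this on a real file, the same function could be pointed at a local copy of the file's first few hundred bytes opened in binary mode. It only reads the header; decoding the Snappy-compressed records themselves still needs Hadoop's own reader.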