Pig >> mail # user >> Sequence File processing


Srini 2012-12-24, 05:24
Cheolsoo Park 2012-12-24, 21:37
Srini 2012-12-25, 06:35
Re: Sequence File processing
+1

Best Regards,
Tariq
+91-9741563634
https://mtariq.jux.com/
On Tue, Dec 25, 2012 at 3:07 AM, Cheolsoo Park <[EMAIL PROTECTED]> wrote:

> Hi Srini,
>
> You can use STRSPLIT to split your "value" chararray and define schema in a
> FOREACH. For example, if the "value" consists of 3 integers (e.g. "1|2|3"):
>
> A = LOAD 'part-m-0000' USING SequenceFileLoader() AS
> (key:long,value:chararray);
> B = FOREACH A GENERATE key, FLATTEN( STRSPLIT(value,'\\|') ) AS (i:int,
> j:int, k:int);
> DESCRIBE B;
> DUMP B;
>
> This will return (the leading k in the tuple stands for the key value):
>
> B: {key: long,i: int,j: int,k: int}
> (k,1,2,3)
>
> Thanks,
> Cheolsoo
>
>
> On Sun, Dec 23, 2012 at 9:24 PM, Srini <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > I have used SequenceFileLoader to load a sequence file:
> >
> > A = load 'part-m-0000' using SequenceFileLoader() as
> > (key:long,value:chararray);
> >
> > "value" is a chararray that consists of 10 fields separated by a
> > delimiter ("|" here). How do I define a schema so that I can do
> > further analysis on these fields (such as filter or group)?
> >
> > Any help is appreciated.
> >
> > Thanks,
> > Srini
> >
>
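For the 10-field case in the original question, the same approach might be extended roughly as follows. This is only a sketch: the field names f1 to f10, their types, and the filter and group conditions are placeholders that are not in the thread, and it assumes SequenceFileLoader is already registered and resolvable as in the statements above.

```pig
A = LOAD 'part-m-0000' USING SequenceFileLoader() AS (key:long, value:chararray);

-- Split on a literal pipe. The third argument caps the result at 10 pieces.
-- Field names and types below are hypothetical; adjust them to the real data.
B = FOREACH A GENERATE key, FLATTEN(STRSPLIT(value, '\\|', 10)) AS
    (f1:chararray, f2:chararray, f3:int, f4:chararray, f5:chararray,
     f6:chararray, f7:chararray, f8:chararray, f9:chararray, f10:chararray);

-- Once the fields are named, they can be filtered and grouped like any other column.
C = FILTER B BY f3 > 0;
D = GROUP C BY f1;
E = FOREACH D GENERATE group AS f1, COUNT(C) AS n;

DESCRIBE E;
DUMP E;
```

Running DESCRIBE B before the FILTER is a quick way to check that the schema came out as intended.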
Kshiva Kps 2012-12-25, 05:42
Dmitriy Ryaboy 2013-01-11, 03:37