-
Converting a tuple to rows
Xavier Stevens 2011-06-02, 18:38
I'm currently trying to write a Pig script to output a feature index. Is there a built-in function for converting a tuple of unknown length into one output row per item in the tuple?
Example code:
raw = LOAD 'hbase://mytable' USING HBaseStorage('data:json') AS json:chararray;
genmap = FOREACH raw GENERATE com.mozilla.pig.eval.json.JsonMap(json) AS json_map:map[];
words = FOREACH genmap GENERATE FLATTEN(com.mozilla.pig.eval.text.Normalize(json_map#'text')) AS word_tuple;
dump words;
(the,quick,brown,fox,jumped,over,the,lazy,dog)
I want to get:
the
quick
brown
fox
jumped
over
lazy
dog
Thanks,
-Xavier
-
Re: Converting a tuple to rows
Thejas M Nair 2011-06-02, 18:52
one_word_per_line = FOREACH words GENERATE FLATTEN(TOBAG(*));
-Thejas
-
Re: Converting a tuple to rows
Xavier Stevens 2011-06-02, 18:57
Awesome! I was trying to FLATTEN(*) without the TOBAG.
Thanks Thejas.
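A note on why the suggested one-liner works, with a hedged sketch. TOBAG wraps each field of the row in its own single-field tuple and collects those tuples into a bag; FLATTEN applied to a bag then emits one row per tuple in it. A bare FLATTEN(*) has no bag to un-nest, so it does not turn fields into rows. The desired output in the question lists "the" only once, so the sketch below also assumes a DISTINCT step is wanted. The alias feature_index is hypothetical and does not appear in the thread.

```pig
-- Assumption: 'words' is the relation from the original script, holding
-- one field per token, e.g. (the,quick,brown,fox,jumped,over,the,lazy,dog).

-- TOBAG(*) turns the fields of each row into a bag of single-field tuples;
-- FLATTEN then produces one output row per tuple in that bag.
one_word_per_line = FOREACH words GENERATE FLATTEN(TOBAG(*));

-- Assumed extra step: drop repeated tokens such as the second 'the'.
-- DISTINCT does not guarantee any particular output order.
feature_index = DISTINCT one_word_per_line;

DUMP feature_index;
```

If repeated tokens should be kept, for example to count occurrences later, the DISTINCT step can simply be dropped.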