Mix Nin 2013-03-07, 00:41
I have a file with the following data:
xxxxx 11,22,33 44,55,66 77,88,99
I wrote the following Pig script:
X = LOAD '/user/lnindrakrishna/tmp/ExpTag.txt' AS (id:chararray, qc:chararray, qt:chararray, qe:chararray);
Y = foreach X generate id, STRSPLIT(qc,',') AS split_qc, STRSPLIT(qt,',') AS split_qt, STRSPLIT(qe,',') AS split_qe;
Z = foreach Y generate id, FLATTEN(TOBAG(split_qc));
I expected output as follows:
xxxxx 11
xxxxx 22
xxxxx 33
But the above script is producing output as follows
(xxxxx,11,22,33)
FLATTEN is not actually flattening the bag of tuples. Any inputs here?
- Thanks
-
Re: FLATTEN is not working
Harsha 2013-03-07, 01:29
Hi Mix,
You are doing a TOBAG on a tuple, which will put it as {((11,22,33))}. Flatten the tuple before doing the TOBAG:
Z = foreach Y GENERATE id, flatten(split_qc);
A = foreach Z generate $0, flatten(TOBAG($1,$2,$3));
--
Harsha
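The difference between the two scripts can be illustrated with a toy Python model of TOBAG and FLATTEN. This is only a sketch: the helper names to_bag and flatten_bag are hypothetical and are not Pig APIs, and the model assumes that TOBAG places a tuple argument into the bag as a single element, an assumption chosen because it reproduces the output reported in the question.

```python
# Toy model of Pig's TOBAG and FLATTEN using plain Python tuples and lists.
# Assumption: TOBAG uses a tuple argument as-is as one bag element and wraps
# any other value in a one-field tuple.

def to_bag(*args):
    """Model TOBAG: each argument becomes one tuple in the bag (a list here)."""
    return [a if isinstance(a, tuple) else (a,) for a in args]

def flatten_bag(prefix, bag):
    """Model FLATTEN on a bag: emit one output row per tuple in the bag."""
    return [prefix + t for t in bag]

row_id = ("xxxxx",)
split_qc = ("11", "22", "33")  # STRSPLIT(qc, ',') yields a single tuple

# Original script: TOBAG receives one tuple, so the bag holds one element
# and FLATTEN can only produce one row.
original = flatten_bag(row_id, to_bag(split_qc))

# Suggested fix: spread the tuple into separate fields first, then TOBAG them,
# so the bag holds three one-field tuples and FLATTEN produces three rows.
fixed = flatten_bag(row_id, to_bag(*split_qc))

print(original)
print(fixed)
```

In this model the bag built from the whole tuple has exactly one element, so flattening it cannot multiply rows; only a bag with one element per value does.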
-
Re: FLATTEN is not working
Mix Nin 2013-03-07, 01:46
Harsha,
Thanks for the reply. Suppose I want to see the output as follows:
xxxxx 11 44 77
xxxxx 22 55 88
xxxxx 33 66 99
How would the script be written?
-
Re: FLATTEN is not working
Harsha 2013-03-07, 02:00
I can think of doing something along these lines, but there might be a better way.
Z = foreach Y generate id, TOTUPLE(split_qc.$0,split_qt.$0,split_qe.$0), TOTUPLE(split_qc.$1,split_qt.$1,split_qe.$1), TOTUPLE(split_qc.$2,split_qt.$2,split_qe.$2);
A = foreach Z generate $0, flatten(TOBAG($1,$2,$3));
--
Harsha
-
Re: FLATTEN is not working
Mix Nin 2013-03-07, 15:03
Hi Harsha,
I am getting the output below with the new script. It is not transposed:
(xxxxx,(11,44,77),(22,55,88),(33,66,99))
Also, there is no guarantee that the input will have only 3 comma-separated values in each field. There can be a variable number of values.
Thanks
-
Re: FLATTEN is not working
Mix Nin 2013-03-07, 20:43
I used the script below and got the desired output. Thanks for the reply.
A = foreach Z generate $0 as id, FLATTEN(TOBAG(*)) as value;
I have another question.
Currently the input is as below:
xxxxx 11,22,33 44,55,66 77,88,99
Suppose the input is as below:
xxxxx 11,22,33 44,55,66 77,88,99
yyyyy 12,23 34,45 56,67
zzzzz 1,2,3,4 5,6,7,8,9 66,77,88,99
And the output needs to be as follows:
xxxx 11 44 77
xxxx 22 55 88
xxxx 33 66 99
yyyy 12 34 56
yyyy 23 45 67
zzzz 1 5 66
zzzz 2 6 77
zzzz 3 7 88
zzzz 4 8 99
So basically, the input can have a variable number of values in each field. How can the script be rewritten to handle this?
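The variable-length transposition asked about here amounts to zipping the comma-separated fields of each record position by position. A minimal sketch of that logic in Python follows; it is not a Pig Latin script. The function name transpose_record is hypothetical, the fields are assumed to be whitespace-separated, and values beyond the shortest field are assumed to be dropped, which matches the expected zzzzz rows in the question. In Pig, logic like this would presumably have to live in a UDF, and that wiring is not shown.

```python
def transpose_record(line):
    """Turn 'id a,b,c d,e,f g,h,i' into rows (id, a, d, g), (id, b, e, h), ..."""
    # Assumption: the id and the value fields are separated by whitespace.
    record_id, *fields = line.split()
    # Each field becomes a list of its comma-separated values.
    columns = [field.split(",") for field in fields]
    # zip pairs values by position and stops at the shortest list,
    # so any extra trailing values in a longer field are dropped.
    return [(record_id, *values) for values in zip(*columns)]

lines = [
    "xxxxx 11,22,33 44,55,66 77,88,99",
    "yyyyy 12,23 34,45 56,67",
    "zzzzz 1,2,3,4 5,6,7,8,9 66,77,88,99",
]

for line in lines:
    for row in transpose_record(line):
        print(" ".join(row))
```

If the extra values should be kept instead of dropped, itertools.zip_longest with a fill value could replace zip; which behavior is wanted is a design decision the question leaves open.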