-
Passing a single Bag to an Eval function
James Newhaven 2012-05-28, 12:12
I am trying to use a Pig eval function (BagSplit, from DataFu) that accepts a bag as a parameter.
The problem I have is that my current relation is a single Bag, so I'm not sure how to pass this relation to the Eval function.
My pig script looks like this:
F = FOREACH E GENERATE $0,$1;
DESCRIBE F;
F: {id: chararray,countd: long}
G = FOREACH F GENERATE BagSplit(??);
Not sure what I can put in ??
Thanks, James
-
Re: Passing a single Bag to an Eval function
Jonathan Coveney 2012-05-28, 17:53
Howdy James. It's important to remember that relations and bags are not the same, though they feel pretty similar. An EvalFunc can never be run directly on a relation; inside a FOREACH it only sees the fields of each row, and one of those fields can be a bag. In this case, you need to group first.
-- F: {id: chararray,countd: long}
G = group F all;
H = FOREACH G GENERATE BagSplit(F);
"group all" is what lets you effectively turn a relation into a bag. Run DESCRIBE G and you'll see that you now have a relation of one row: that row holds the catch-all key "all" (in the field named group) and a bag named "F" that contains the rows of the relation you wanted to run BagSplit on.
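For reference, here is a minimal sketch of the whole flow. It assumes a hypothetical jar path, that the UDF class is datafu.pig.bags.BagSplit, and that BagSplit takes a maximum split size as its first argument ahead of the bag (as far as I know it does, but check the DataFu docs for your version). The value 1000 is purely illustrative.

```pig
-- Hypothetical jar location; adjust to wherever DataFu lives on your system.
REGISTER /path/to/datafu.jar;
-- Assumed fully qualified class name for DataFu's BagSplit.
DEFINE BagSplit datafu.pig.bags.BagSplit();

-- F: {id: chararray,countd: long}
G = GROUP F ALL;
DESCRIBE G;
-- Should print roughly:
-- G: {group: chararray,F: {(id: chararray,countd: long)}}

-- F is now a bag-valued field of G, so it can be handed to an EvalFunc.
-- 1000 is an assumed maximum number of tuples per split bag.
H = FOREACH G GENERATE BagSplit(1000, F);
```

Keep in mind that GROUP ... ALL funnels every row into a single group, so the whole relation ends up on one reducer; that is fine for modest data but can become a bottleneck on large inputs.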
-
Re: Passing a single Bag to an Eval function
James Newhaven 2012-05-28, 22:16
Thanks Jonathan. Great explanation.