-
Conditional execution of 'generate' clauses
Stan Rosenberg 2011-10-02, 16:22
Hi Folks,
I came across a use case where I'd like to do something like this:
FOREACH X { if (!IsEmpty(t))
}
-
Conditional execution of 'generate' clauses
Stan Rosenberg 2011-10-02, 16:28
Hi Folks,
I came across a use case where I'd like to do something like this:
FOREACH X {
    ...
    t = DISTINCT (...)
    if (!IsEmpty(t))
        GENERATE foo, ...
}
Thus, 'generate' is conditionally executed and the control flow depends on the value of some tuple 't'. Can this be done in pig?
Thanks,
stan
P.S. Please ignore my previous email; I accidentally triggered send before I had a chance to finish it.
-
Re: Conditional execution of 'generate' clauses
Dmitriy Ryaboy 2011-10-03, 07:46
Why not this:
Y = foreach X {
    ..
    t = distinct ...
    generate t, foo...
}
Z = filter Y by not IsEmpty(t);
OR: t can't be empty if the thing you are distincting is not empty, so this should work:
Y = filter X by not IsEmpty(thing_you_wanted_to_distinct);
Z = foreach Y {
    -- the thing you are distincting is now guaranteed to have at least 1 value
    t = distinct ..
    generate foo...
}
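To make the second form concrete, here is a small sketch in plain Python (not Pig) that models "filter first, then run the nested block only on the rows that survived". The names `rows`, `items`, and `process` are made up for illustration, and Python lists stand in for Pig bags.

```python
# Plain-Python model of "filter, then foreach"; this is not Pig code.
# Each row is (foo, items); items plays the role of the bag to be distincted.
rows = [
    ("a", [1, 1, 2]),
    ("b", []),          # empty bag: should produce no output row
    ("c", [3]),
]

def process(rows):
    # Y = filter X by not IsEmpty(items)
    non_empty = [(foo, items) for foo, items in rows if items]
    # Z = foreach Y { t = distinct items; generate foo, t; }
    return [(foo, sorted(set(items))) for foo, items in non_empty]

result = process(rows)
print(result)
```

The row with the empty bag never reaches the distinct step, which is the effect the conditional 'generate' was meant to achieve.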
-
Re: Conditional execution of 'generate' clauses
Stan Rosenberg 2011-10-03, 14:16
I am aware of workarounds using filters. However, a filter must be executed unconditionally whereas with conditional control-flow some (generate) clauses may never need to be executed.
Thanks,
stan
-
Re: Conditional execution of 'generate' clauses
Alan Gates 2011-10-03, 15:42
I'm not sure what you mean by t being non empty, since it's a relation and not an expression. But guessing that you mean there is some bag in t you want to check for non-emptiness, isn't the following equivalent?
foreach X {
    ...
    t = distinct ...
    tnotempty = filter t by not IsEmpty(...);
    generate foo, ...
}
Alan.
-
Re: Conditional execution of 'generate' clauses
Stan Rosenberg 2011-10-03, 18:18
Alan,
Let me abstract my previous example to:
foreach X {
    -- do some processing and store results in Y
    if (Y.$0 == 'foo') {
        generate X.$0, ...
    }
}
Does pig support this type of control-flow?
Many thanks,
stan
-
Re: Conditional execution of 'generate' clauses
Alan Gates 2011-10-03, 18:33
No, Pig Latin does data flow only, not control flow. But I'm not sure what your code would mean. Y is a bag (or a relation). Y.$0 is still a bag, not a single-valued entity, as your pseudocode implies. Do you really mean something like:
foreach row in Y:
    if Y[row][0] == 'foo':
        generate X[row][0]
If that's the case you can write a UDF that would do that, and invoke it as:
foreach X {
    ...
    generate yourUDF(X, Y);
}
This would give you the freedom to choose when to and when not to emit records. In the case where there were no records in Y that met your filter condition you would still have to emit a null, so you might still want to filter out nulls afterwards.
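As a rough illustration of that UDF idea, here is a plain-Python model (not a real Pig UDF, and it uses no Pig API). It assumes that rows of X and Y are paired by position, which is one reading of the pseudocode above; `your_udf`, `groups`, `emitted`, and `kept` are hypothetical names.

```python
# Plain-Python model of the UDF idea; this is not a real Pig UDF.
def your_udf(x_bag, y_bag):
    """Return the first field of each X row whose paired Y row starts with 'foo'.

    Returns None when nothing matches, mirroring the null the UDF would
    still have to emit in that case.
    """
    out = [x[0] for x, y in zip(x_bag, y_bag) if y[0] == "foo"]
    return out or None

# Each entry models one input record: an X bag and the Y bag derived from it.
groups = [
    ([("x1",), ("x2",)], [("foo",), ("bar",)]),  # one matching row
    ([("x3",)], [("bar",)]),                     # no match, so None
]

emitted = [your_udf(x, y) for x, y in groups]
# Filter out the nulls afterwards.
kept = [e for e in emitted if e is not None]
print(emitted, kept)
```

The decision about what to emit lives inside the function, while the surrounding pipeline stays a pure data flow with one extra null filter at the end.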
Alan.