Though the grammar doesn't support nested foreach, Pig already has some
limited support for it in the logical plan and runtime. For example,
explaining the following script shows a nested foreach in the plan:
    a = load '1.txt' as (a0, a1, a2);
    b = group a by a0;
    c = foreach b {
        c0 = a.a0;
        generate c0;
    };
    explain c;
So I believe the basic pieces needed to make nested foreach work are
already there. What remains is to:
1. Allow the parser to handle a real nested foreach statement, and define
the limitations of the nested foreach we support
2. Make sure Pig handles the extended scope of nested foreach
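
As a rough illustration of item 1, a real nested foreach statement might
look something like the script below. This is only a sketch of possible
syntax, assuming the parser is extended to accept a foreach inside the
nested block; the grammar does not accept it as of this message, and the
form that is finally supported may differ.

    a = load '1.txt' as (a0, a1, a2);
    b = group a by a0;
    c = foreach b {
        -- hypothetical: a foreach over the bag 'a' inside the nested block
        c0 = foreach a generate a1, a2;
        generate group, c0;
    };

Here the inner foreach would run once per group over the bag 'a', which
is where the scoping question in item 2 comes in: aliases such as a1 and
a2 have to be resolved against the inner bag rather than against b.
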
Daniel
On 05/22/2011 11:52 AM, Aniket Mokashi wrote:
> Hi,
>
> Thank you everyone for all your support. It has been a very enjoyable
> experience to work with the Pig community.
>
> I plan to get involved through the GSoC platform to contribute to the Pig project. I
> will be working on addition of support for nested foreach. I will also try
> to work on jiras related to this support (Please assign related jiras to
> me). My proposal to GSoC can be found at:
>
> http://www.google-melange.com/gsoc/proposal/review/google/gsoc2011/aniket486/1
>
> I worked on a couple of interesting projects at Yahoo last summer to learn
> about internals of pig parser, logical plan build, construction of physical
> and mr plans from the logical plan. While working on support for scalars, I
> learnt about various passes in pig to reconstruct plans to optimize
> execution and limitations on it. In Pig 0.9, a few things have changed with
> parsers and optimizers. Hence, it would be beneficial for me if you can help
> me out with any comments and remarks on my approach.
>
> Here are my current thoughts on support of Nested Foreach
> (https://issues.apache.org/jira/browse/PIG-1631):
>
> Pig currently supports nested_proj which internally streams the bag. This
> support can be extended by assigning innerplan to this streaming with
> nested_foreach. First step is to add parser support for this. But, this
> would need changes further to restrict generic support to the innerplan
> depending upon pig limitations. Currently, I am exploring various
> possibilities to add buildNestedForeachOp to logicalplanbuilder with or
> without using existing "generate_clause". I will upload a patch to jira once I
> get projection support through nested foreach.
> Please let me know your comments on the same.
>
> Thanks,
> Aniket
>
> On Thu, May 19, 2011 at 1:19 PM, Ashutosh Chauhan <[EMAIL PROTECTED]> wrote:
>
>> Congratulations, Aniket!
>> Hoping to see many more contributions in Pig from you.
>>
>> Ashutosh
>> On Thu, May 19, 2011 at 10:08, Alan Gates <[EMAIL PROTECTED]> wrote:
>>> Please join me in welcoming Aniket Mokashi as a new committer on Pig.
>>> Aniket has been contributing to Pig since last summer. He wrote or helped
>>> shepherd several major features in 0.8, including the Python UDF work, the
>>> new mapreduce functionality, and the custom partitioner. We look forward to
>>> more great work from him in the future.
>>>
>>> Alan.
>>>
>
>