Pig, mail # dev - Is this desirable: relation.projection as sugar for foreach relation generate projection


Re: Is this desirable: relation.projection as sugar for foreach relation generate projection
Daniel Dai 2012-03-02, 23:02
I should have said that one operator should do only one thing instead.

Daniel

On Fri, Mar 2, 2012 at 1:48 PM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
> But that's already not the case. The syntax "a = distinct (foreach b
> generate  $1, $2);" is completely legal.
>
> D
>
> On Fri, Feb 24, 2012 at 2:52 PM, Daniel Dai <[EMAIL PROTECTED]> wrote:
>> One of my concerns is that it could complicate GUI mapping for the Pig
>> script in the future. I feel it might be clearer if one statement does
>> only one thing.
>>
>> Daniel
>>
>> On Thu, Feb 23, 2012 at 2:23 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
>>> Adam, thanks for the comments. Below is the cat of the patch (it's short
>>> enough to just paste in line):
>>>
>>> Your comments are welcome, and I'd be curious what others think as well.
>>> The blurring of the line between bags and relations is what I'm worried
>>> about, but at the same time, one of the things people confuse the most is
>>> that distinction.
>>>
>>>
>>> Index: test/org/apache/pig/test/TestEvalPipeline.java
>>> ==================================================================
>>> --- test/org/apache/pig/test/TestEvalPipeline.java    (revision 1244760)
>>> +++ test/org/apache/pig/test/TestEvalPipeline.java    (working copy)
>>> @@ -383,7 +383,7 @@
>>>         pigServer.registerQuery("A = LOAD '"
>>>                 + Util.generateURI(tmpFile.toString(), pigContext) + "';");
>>>         if (eliminateDuplicates){
>>> -            pigServer.registerQuery("B = DISTINCT (FOREACH A GENERATE $0) PARALLEL 10;");
>>> +            pigServer.registerQuery("B = DISTINCT A.$0 PARALLEL 10;");
>>>         }else{
>>>             if(!useUDF) {
>>>                 pigServer.registerQuery("B = ORDER A BY $0 PARALLEL 10;");
>>> Index: test/org/apache/pig/test/TestEvalPipelineLocal.java
>>> ==================================================================
>>> --- test/org/apache/pig/test/TestEvalPipelineLocal.java    (revision 1244760)
>>> +++ test/org/apache/pig/test/TestEvalPipelineLocal.java    (working copy)
>>> @@ -400,7 +400,7 @@
>>>                 + Util.generateURI(tmpFile.toString(), pigServer
>>>                         .getPigContext()) + "';");
>>>         if (eliminateDuplicates){
>>> -            pigServer.registerQuery("B = DISTINCT (FOREACH A GENERATE $0) PARALLEL 10;");
>>> +            pigServer.registerQuery("B = DISTINCT A.$0 PARALLEL 10;");
>>>         }else{
>>>             if(!useUDF) {
>>>                 pigServer.registerQuery("B = ORDER A BY $0 PARALLEL 10;");
>>> Index: src/org/apache/pig/parser/AstPrinter.g
>>> ==================================================================
>>> Index: src/org/apache/pig/parser/QueryParser.g
>>> ==================================================================
>>> --- src/org/apache/pig/parser/QueryParser.g    (revision 1244760)
>>> +++ src/org/apache/pig/parser/QueryParser.g    (working copy)
>>> @@ -506,7 +506,10 @@
>>>           | LEFT_PAREN! col_ref ( ASC | DESC )? RIGHT_PAREN!
>>>  ;
>>>
>>> -distinct_clause : DISTINCT^ rel partition_clause?
>>> +distinct_clause : DISTINCT rel PERIOD ( col_alias_or_index | ( LEFT_PAREN col_alias_or_index ( COMMA col_alias_or_index )* RIGHT_PAREN ) ) partition_clause?
>>> +               -> ^( DISTINCT ^( FOREACH rel ^( FOREACH_PLAN_SIMPLE ^( GENERATE col_alias_or_index+ ) ) ) partition_clause? )
>>> +                | DISTINCT rel partition_clause?
>>> +               -> ^( DISTINCT rel partition_clause? )
>>>  ;
>>>
>>>  partition_clause : PARTITION^ BY! func_name
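
For readers following the thread, the rewrite that the patched distinct_clause rule performs can be imitated at the string level. The sketch below is only an illustration and is not part of the patch: the class name DistinctSugar, the method desugar, and the regular expression are all hypothetical, and the real change works on the ANTLR parse tree rather than on text. It also assumes an upper-case DISTINCT keyword and simple column references ($n or a bare alias).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig: mimics the proposed sugar
//   DISTINCT rel.col        ->  DISTINCT (FOREACH rel GENERATE col)
//   DISTINCT rel.(c1, c2)   ->  DISTINCT (FOREACH rel GENERATE c1, c2)
public class DistinctSugar {
    // group 1: relation alias; group 2: parenthesised column list;
    // group 3: a single positional ($n) or named column.
    private static final Pattern SUGAR = Pattern.compile(
        "DISTINCT\\s+([A-Za-z_]\\w*)\\.(?:\\(([^)]*)\\)|(\\$\\d+|[A-Za-z_]\\w*))");

    public static String desugar(String statement) {
        Matcher m = SUGAR.matcher(statement);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String rel = m.group(1);
            String cols = m.group(2) != null ? m.group(2).trim() : m.group(3);
            String expanded = "DISTINCT (FOREACH " + rel + " GENERATE " + cols + ")";
            // quoteReplacement keeps "$0" from being read as a group reference.
            m.appendReplacement(out, Matcher.quoteReplacement(expanded));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The statement used in the modified tests.
        System.out.println(desugar("B = DISTINCT A.$0 PARALLEL 10;"));
        // The multi-column form allowed by the new grammar rule.
        System.out.println(desugar("B = DISTINCT A.($0, $1);"));
        // A plain DISTINCT has no projection and is left unchanged.
        System.out.println(desugar("B = DISTINCT A;"));
    }
}
```

Applied to the statement from the tests, this yields the original nested form, B = DISTINCT (FOREACH A GENERATE $0) PARALLEL 10;, which is the expansion the tree rewrite in the patch is meant to produce.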