Pig >> mail # dev >> Is this desirable: relation.projection as sugar for foreach relation generate projection


Re: Is this desirable: relation.projection as sugar for foreach relation generate projection
I should have said that one operator should do only one thing instead.

Daniel

On Fri, Mar 2, 2012 at 1:48 PM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
> But that's already not the case. The syntax "a = distinct (foreach b
> generate  $1, $2);" is completely legal.
>
> D
>
> On Fri, Feb 24, 2012 at 2:52 PM, Daniel Dai <[EMAIL PROTECTED]> wrote:
>> One of my concerns is that it could complicate GUI mapping for the Pig
>> script in the future. I feel it might be clearer if one statement does
>> only one thing.
>>
>> Daniel
>>
>> On Thu, Feb 23, 2012 at 2:23 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
>>> Adam, thanks for the comments. Below is the cat of the patch (it's short
>>> enough to just paste in line):
>>>
>>> Your comments are welcome, and I'd be curious what others think as well.
>>> The blurring of the line between bags and relations is what I'm worried
>>> about, but at the same time, one of the things people confuse the most is
>>> that distinction.
>>>
>>>
>>> Index: test/org/apache/pig/test/TestEvalPipeline.java
>>> ==================================================================
>>> --- test/org/apache/pig/test/TestEvalPipeline.java    (revision 1244760)
>>> +++ test/org/apache/pig/test/TestEvalPipeline.java    (working copy)
>>> @@ -383,7 +383,7 @@
>>>         pigServer.registerQuery("A = LOAD '"
>>>                 + Util.generateURI(tmpFile.toString(), pigContext) + "';");
>>>         if (eliminateDuplicates){
>>> -            pigServer.registerQuery("B = DISTINCT (FOREACH A GENERATE $0) PARALLEL 10;");
>>> +            pigServer.registerQuery("B = DISTINCT A.$0 PARALLEL 10;");
>>>         }else{
>>>             if(!useUDF) {
>>>                 pigServer.registerQuery("B = ORDER A BY $0 PARALLEL 10;");
>>> Index: test/org/apache/pig/test/TestEvalPipelineLocal.java
>>> ==================================================================
>>> --- test/org/apache/pig/test/TestEvalPipelineLocal.java    (revision 1244760)
>>> +++ test/org/apache/pig/test/TestEvalPipelineLocal.java    (working copy)
>>> @@ -400,7 +400,7 @@
>>>                 + Util.generateURI(tmpFile.toString(), pigServer
>>>                         .getPigContext()) + "';");
>>>         if (eliminateDuplicates){
>>> -            pigServer.registerQuery("B = DISTINCT (FOREACH A GENERATE $0) PARALLEL 10;");
>>> +            pigServer.registerQuery("B = DISTINCT A.$0 PARALLEL 10;");
>>>         }else{
>>>             if(!useUDF) {
>>>                 pigServer.registerQuery("B = ORDER A BY $0 PARALLEL 10;");
>>> Index: src/org/apache/pig/parser/AstPrinter.g
>>> ==================================================================
>>> Index: src/org/apache/pig/parser/QueryParser.g
>>> ==================================================================
>>> --- src/org/apache/pig/parser/QueryParser.g    (revision 1244760)
>>> +++ src/org/apache/pig/parser/QueryParser.g    (working copy)
>>> @@ -506,7 +506,10 @@
>>>           | LEFT_PAREN! col_ref ( ASC | DESC )? RIGHT_PAREN!
>>>  ;
>>>
>>> -distinct_clause : DISTINCT^ rel partition_clause?
>>> +distinct_clause : DISTINCT rel PERIOD ( col_alias_or_index | ( LEFT_PAREN col_alias_or_index ( COMMA col_alias_or_index )* RIGHT_PAREN ) ) partition_clause?
>>> +               -> ^( DISTINCT ^( FOREACH rel ^( FOREACH_PLAN_SIMPLE ^( GENERATE col_alias_or_index+ ) ) ) partition_clause? )
>>> +                | DISTINCT rel partition_clause?
>>> +               -> ^( DISTINCT rel partition_clause? )
>>>  ;
>>>
>>>  partition_clause : PARTITION^ BY! func_name
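
For reference, the rewrite rule in the QueryParser.g hunk above appears to map the dotted form onto the nested FOREACH form that is already legal. The script below is only a sketch of that equivalence, assuming the patch is applied: the relation.projection statements are the proposed sugar, not syntax confirmed for any released Pig version, and the input path, schema, and relation names are hypothetical.

```pig
-- Hypothetical input; the path and schema are assumptions for illustration.
A = LOAD 'input.txt' AS (name:chararray, score:int);

-- Existing syntax: DISTINCT over a nested FOREACH.
B1 = DISTINCT (FOREACH A GENERATE $0);

-- Proposed sugar (only parses with the patch applied), intended to
-- produce the same plan as B1.
B2 = DISTINCT A.$0;

-- Multi-column form allowed by the new grammar alternative:
-- a parenthesized, comma-separated list of aliases or positions.
C1 = DISTINCT (FOREACH A GENERATE name, score);
C2 = DISTINCT A.(name, score);
```

Because the parser rewrites the dotted form into a DISTINCT over a FOREACH ... GENERATE subtree, the sugar as written in this patch applies only inside DISTINCT, not to other operators.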