Pig >> mail # user >> Losing ordering after using ORDER BY


James Newhaven 2012-05-29, 19:25
Jonathan Coveney 2012-05-29, 19:43
Re: Losing ordering after using ORDER BY
Thanks Jonathan. That worked fine.

James

On 29 May 2012, at 08:43 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:

> If you do a grouping, the ordering changes. What you want to do is:
>
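> -- Count the bag in each row of C.
> D = FOREACH C GENERATE COUNT($1) as countd;
> -- Collect every row of D into a single bag.
> D1 = GROUP D ALL;
> D2 = FOREACH D1 {
>  -- Sort inside the group, so the UDF receives the bag in descending order.
>  ord = ORDER $1 BY $0 desc;
>  GENERATE MyCustomEvalFunc(ord);
> }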
>
> Keep in mind that you'll be ordering all of your data on one reducer, but
> this isn't very different from what you were already doing: GROUP ALL
> generally passes all of your data to one reducer anyway. If you run into
> memory issues, this is why.
>
> 2012/5/29 James Newhaven <[EMAIL PROTECTED]>
>
>> Hi,
>>
>> I've noticed that I seem to be losing the ordering of my relation after
>> passing the result of an ORDER BY to an EVAL function.
>>
>> For example:
>>
>> D = FOREACH C GENERATE COUNT($1) as countd;
>> E = ORDER D BY $0 DESC;
>> D1 = GROUP E ALL;
>> D2 = FOREACH D1 GENERATE MyCustomEvalFunc($1);
>>
>> When inspecting the results in MyCustomEvalFunc I noticed the ordering of
>> my results isn't the same as relation E (which uses ORDER BY DESC).
>>
>> Any help appreciated!
>>
>> Thanks,
>> James
>>