Losing ordering after using ORDER BY
James Newhaven 2012-05-29, 19:25
Hi,
I've noticed that I seem to be losing the ordering of my relation after passing the result of an ORDER BY to an EVAL function.
For example:
D = FOREACH C GENERATE COUNT($1) as countd;
E = ORDER D BY $0 DESC;
D1 = GROUP E ALL;
D2 = FOREACH D1 GENERATE MyCustomEvalFunc($1);
When inspecting the results in MyCustomEvalFunc I noticed the ordering of my results isn't the same as relation E (which uses ORDER BY DESC).
Any help appreciated!
Thanks, James
Re: Losing ordering after using ORDER BY
Jonathan Coveney 2012-05-29, 19:43
If you do a grouping, the ordering changes. What you want to do is:
D = FOREACH C GENERATE COUNT($1) as countd;
D1 = GROUP D ALL;
D2 = FOREACH D1 {
    ord = ORDER $1 BY $0 desc;
    GENERATE MyCustomEvalFunc(ord);
}
Keep in mind that you'll be ordering all of your data on one reducer, but this isn't very different from what you're doing now, where you were passing all of your data to one reducer anyway (which is what GROUP ALL generally does). If you run into memory issues, this is why.
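To make the reasoning concrete, here is a rough Python model of the two pipelines. It is only a sketch and not how Pig executes anything: the helper names `order_desc` and `group_all` are made up for illustration, the sample counts are invented, and the shuffle is an assumption standing in for the fact that a Pig bag is unordered, so the order in which tuples arrive after a GROUP is not defined.

```python
import random


def order_desc(rows):
    """Toy stand-in for ORDER ... BY $0 DESC: sort tuples by first field, descending."""
    return sorted(rows, key=lambda r: r[0], reverse=True)


def group_all(rows, seed=0):
    """Toy stand-in for GROUP ... ALL: every tuple ends up in a single bag.

    A bag carries no ordering guarantee, so the tuples are shuffled here
    (an assumption for illustration) to mimic an arbitrary arrival order.
    """
    bag = list(rows)
    random.Random(seed).shuffle(bag)
    return bag


# Hypothetical sample data playing the role of relation D (one count per tuple).
counts = [(3,), (10,), (7,), (1,), (5,)]

# Original script: ORDER first, then GROUP ALL.
# The earlier sort is not guaranteed to survive the grouping.
bag_after_group = group_all(order_desc(counts))

# Suggested fix: GROUP ALL first, then ORDER the bag inside the nested FOREACH,
# so the UDF receives the tuples already sorted.
ord_bag = order_desc(group_all(counts))

print("ORDER then GROUP ALL:", bag_after_group)
print("GROUP ALL then nested ORDER:", ord_bag)
```

In the second pipeline the sort is the last step before the bag reaches the UDF, which is why it is the one that holds.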
Re: Losing ordering after using ORDER BY
James Newhaven 2012-05-30, 21:10
Thanks Jonathan. That worked fine.
James