Re: Does the pig optimizer keep track of relations that are already sorted when doing a JOIN?
No, Pig currently doesn't auto-detect that data has already been sorted by
earlier steps of the script, so you need to tell it explicitly with USING 'merge'.
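
For reference, here is a minimal sketch of what that looks like. The relation
names, paths, and schema are made up for illustration, and both inputs are
assumed to be already sorted in ascending order on the join key:

```pig
-- Hypothetical inputs: both are assumed to be stored already sorted
-- in ascending order on the join key (id).
users  = LOAD 'users_sorted'  AS (id:long, name:chararray);
visits = LOAD 'visits_sorted' AS (id:long, url:chararray);

-- Pig does not infer the sort order on its own, so the merge join
-- has to be requested explicitly.
joined = JOIN users BY id, visits BY id USING 'merge';

STORE joined INTO 'joined_out';
```

The Pig documentation lists further preconditions for merge joins, so it is
worth checking them against the Pig version in use and comparing the output
with that of a default join.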
Hope it helps,
On Fri, Aug 19, 2011 at 22:51, Kevin Burton <[EMAIL PROTECTED]> wrote:
> I was reading about USING 'merge' with JOIN when relations are already
> sorted.
> I actually was just looking through some code and realized that one of my
> JOINs was on two relations that were *already* sorted due to a DISTINCT and
> GROUP operation.
> I just added USING 'merge' and the initial results look the same.
> I haven't benchmarked it though.
> Does/would the existing optimizer be able to detect this and just use merge
> without manual intervention?