Pig >> mail # user >> Does the pig optimizer keep track of relations that are already sorted when doing a JOIN?


+
Kevin Burton 2011-08-20, 05:51
Re: Does the pig optimizer keep track of relations that are already sorted when doing a JOIN?
Hey Kevin,

No, Pig currently doesn't auto-detect that data has already been sorted by
earlier steps of the script. So you need to tell it explicitly with USING 'merge'.
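
For illustration, here is a minimal sketch of how the hint is written. The
paths, relation names, and schemas are hypothetical, and it assumes both inputs
are already stored sorted in ascending order on the join key. Merge join has
further preconditions that vary by Pig version, so check the documentation for
your release before relying on it.

```pig
-- Hypothetical paths and schemas, for illustration only.
-- Assumption: both inputs are already sorted ascending on user_id.
users  = LOAD 'sorted/users'  AS (user_id:long, name:chararray);
clicks = LOAD 'sorted/clicks' AS (user_id:long, url:chararray);

-- Without a hint, Pig plans its default join, even if the inputs are sorted.
joined_default = JOIN users BY user_id, clicks BY user_id;

-- With the hint, Pig is asked to use a sort-merge join instead.
joined_merge = JOIN users BY user_id, clicks BY user_id USING 'merge';

STORE joined_merge INTO 'output/joined_merge';
```

Both joins should return the same rows when the sortedness assumption holds.
The hint only changes how the join is executed, so it is worth benchmarking
the two variants on your own data.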

Hope it helps,
Ashutosh

On Fri, Aug 19, 2011 at 22:51, Kevin Burton <[EMAIL PROTECTED]> wrote:

> I was reading about USING 'merge' with JOIN when relations are already
> sorted.
>
> I actually was just looking through some code and realized that one of my
> JOINs was on two relations that were *already* sorted due to a DISTINCT and
> GROUP operation.
>
> I just added USING 'merge' and the initial results look the same.
>
> I haven't benchmarked it though.
>
> Does/would the existing optimizer be able to detect this and just use merge
> without manual intervention?
>
> --
>
> Founder/CEO Spinn3r.com
>
> Location: *San Francisco, CA*
> Skype: *burtonator*
>
> Skype-in: *(415) 871-0687*
>
+
Kevin Burton 2011-08-20, 07:09
+
Andrew Clegg 2011-08-21, 11:27
+
Ashutosh Chauhan 2011-08-21, 16:59