Re: Problem while using merge join
Okay, I think I have found the problem. The merge join documentation at
http://pig.apache.org/docs/r0.11.1/perf.html#merge-joins says:

There may be filter statements and foreach statements between the sorted
data source and the join statement. The foreach statement should meet the
following conditions:

   - There should be no UDFs in the foreach statement.
   - The foreach statement should not change the position of the join keys.
   - There should be no transformation on the join keys which will change
   the sort order.

I have to use a UDF to transform the map into a bag. Any ideas for a workaround?
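
One possible workaround, as an untested sketch rather than something confirmed in this thread: run the UDF in a separate script, store the result, and load it again, so that the merge join only sees a Load as its predecessor. The jar name, the UDF class com.example.MapToBag, the paths, and the schemas below are all hypothetical placeholders.

```pig
-- Script 1: apply the UDF and materialize the result, sorted on the join key.
-- myudfs.jar and com.example.MapToBag are hypothetical names.
REGISTER myudfs.jar;
raw      = LOAD 'input_a' AS (key:chararray, m:map[]);
with_bag = FOREACH raw GENERATE key, com.example.MapToBag(m) AS b;
-- ORDER BY makes the ascending sort on the join key explicit before storing.
sorted_a = ORDER with_bag BY key ASC;
STORE sorted_a INTO 'a_sorted';

-- Script 2: the merge join now has only Load statements as predecessors.
-- 'b_sorted' is assumed to be already sorted ascending on key.
a = LOAD 'a_sorted' AS (key:chararray, b:bag{t:tuple(k:chararray, v:chararray)});
b = LOAD 'b_sorted' AS (key:chararray, val:chararray);
joined = JOIN a BY key, b BY key USING 'merge';
STORE joined INTO 'joined_out';
```

The trade-off is an extra job and an extra write to HDFS, so this is only worthwhile if the merge join saves more than the intermediate store costs.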

2013/9/13 John <[EMAIL PROTECTED]>

> Hi,
> I am trying to use a merge join on two bags. Here is my Pig code:
> http://pastebin.com/Y9b2UtNk .
> But I get this error:
> Caused by:
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogicalToPhysicalTranslatorException:
> ERROR 1103: Merge join/Cogroup only supports Filter, Foreach, Ascending
> Sort, or Load as its predecessors. Found
> I think the reason is that there is no explicit sort step or something
> like that before the join. But the bags are definitely sorted. How can I
> do the merge join?
> Thanks