Pig, mail # user - Problem while using merge join


Re: Problem while using merge join
John 2013-09-13, 17:04
Okay, I think I have found the problem. The documentation at
http://pig.apache.org/docs/r0.11.1/perf.html#merge-joins says:

There may be filter statements and foreach statements between the sorted
data source and the join statement. The foreach statement should meet the
following conditions:

   - There should be no UDFs in the foreach statement.
   - The foreach statement should not change the position of the join keys.
   - There should be no transformation on the join keys which will change
   the sort order.

I have to use a UDF to transform the map into a bag. Any ideas for a workaround?
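
One possible direction, sketched here under assumptions rather than tested: materialize the UDF output first, so that the relation feeding the merge join is a plain load of already-sorted data. The paths 'input_a', 'sorted_a', and 'sorted_b', the field names, and the UDF name MapToBag are all hypothetical placeholders. The sketch also assumes the second input is already stored sorted on the join key.

```pig
-- Step 1 (first script): apply the UDF, sort, and store the result.
-- MapToBag is a hypothetical UDF; REGISTER/DEFINE it as required.
raw_a    = LOAD 'input_a' AS (key:chararray, m:map[]);
bagged_a = FOREACH raw_a GENERATE key, MapToBag(m) AS items;
sorted_a = ORDER bagged_a BY key ASC;
STORE sorted_a INTO 'sorted_a';

-- Step 2 (second script): both join inputs are now plain loads,
-- so no UDF sits between the sorted data and the join.
lhs = LOAD 'sorted_a' AS (key:chararray, items:bag{});
rhs = LOAD 'sorted_b' AS (key:chararray, val:chararray);
joined = JOIN lhs BY key, rhs BY key USING 'merge';
```

The cost is an extra pass over the data to sort and store it, so this only pays off if the merge join saves more than that pass costs.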

thanks
2013/9/13 John <[EMAIL PROTECTED]>

> Hi,
>
> I am trying to use a merge join on two bags. Here is my Pig code:
> http://pastebin.com/Y9b2UtNk .
>
> But I got this error:
>
> Caused by:
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogicalToPhysicalTranslatorException:
> ERROR 1103: Merge join/Cogroup only supports Filter, Foreach, Ascending
> Sort, or Load as its predecessors. Found
>
> I think the reason is that there is no explicit sort step or something
> like that. But the bags are definitely sorted. How can I do the merge join?
>
> thanks
>