Pig >> mail # user >> Problem while using merge join


+
John 2013-09-13, 16:37
+
John 2013-09-13, 17:04
Re: Problem while using merge join
Since your join key is not in the Bag, can you do your join first and then
execute your UDF?
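
A rough sketch of that reordering, in case it helps. The original script is only linked, not shown, so the paths, schemas, and the UDF name `myudfs.MapToBag` below are all assumptions for illustration, and the jar holding the UDF would still need to be registered. It also assumes both inputs are stored sorted ascending on the join key and that the loader satisfies the merge-join requirements in the Pig documentation.

```pig
-- Hypothetical inputs: both assumed to be stored sorted ascending on `id`.
a = LOAD 'left_sorted'  AS (id:chararray, props:map[]);
b = LOAD 'right_sorted' AS (id:chararray, val:chararray);

-- Join first, so that only Load statements precede the merge join.
joined = JOIN a BY id, b BY id USING 'merge';

-- Run the Map-to-Bag UDF after the join (myudfs.MapToBag is a placeholder name).
result = FOREACH joined GENERATE
    a::id AS id,
    myudfs.MapToBag(a::props) AS items,
    b::val AS val;
```

Because the UDF only touches the map column and runs after the join, the join keys reach the merge join unchanged and in their original sort order.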
On Fri, Sep 13, 2013 at 10:04 AM, John <[EMAIL PROTECTED]> wrote:

> Okay, I think I have found the problem here:
> http://pig.apache.org/docs/r0.11.1/perf.html#merge-joins. It says:
>
> There may be filter statements and foreach statements between the sorted
> data source and the join statement. The foreach statement should meet the
> following conditions:
>
>    - There should be no UDFs in the foreach statement.
>    - The foreach statement should not change the position of the join keys.
>    - There should be no transformation on the join keys which will change
>    the sort order.
>
>
> I have to use a UDF to transform the Map into a Bag. Any idea for a
> workaround?
>
> thanks
>
>
> 2013/9/13 John <[EMAIL PROTECTED]>
>
> > Hi,
> >
> > I am trying to use a merge join on two bags. Here is my Pig code:
> > http://pastebin.com/Y9b2UtNk
> >
> > But I got this error:
> >
> > Caused by:
> > org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogicalToPhysicalTranslatorException:
> > ERROR 1103: Merge join/Cogroup only supports Filter, Foreach, Ascending
> > Sort, or Load as its predecessors. Found
> >
> > I think the reason is that there is no explicit sort step or something
> > similar before the join. But the bags are definitely sorted. How can I do
> > the merge join?
> >
> > thanks
> >
>
+
John 2013-09-13, 17:58
+
John 2013-09-13, 18:34
+
Shahab Yunus 2013-09-13, 19:00
+
John 2013-09-13, 19:06
+
Pradeep Gollakota 2013-09-13, 20:16
+
John 2013-09-13, 20:51