Pig >> mail # user >> Problem while using merge join


Re: Problem while using merge join
Okay, I think I have found the problem. The documentation at
http://pig.apache.org/docs/r0.11.1/perf.html#merge-joins says:

There may be filter statements and foreach statements between the sorted
data source and the join statement. The foreach statement should meet the
following conditions:

   - There should be no UDFs in the foreach statement.
   - The foreach statement should not change the position of the join keys.
   - There should be no transformation on the join keys which will change
   the sort order.
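
To illustrate why the documentation is so strict about anything that could disturb the key order, here is a minimal Python sketch of a generic sort-merge join. It is not Pig's implementation; the function name `merge_join` and the sample data are made up for illustration. The point is only that the algorithm walks both inputs once and trusts that the keys arrive in ascending order.

```python
def merge_join(left, right):
    """Inner-join two lists of (key, value) pairs.

    Both inputs are ASSUMED to be sorted ascending by key; the function
    never checks this, which is exactly why unsorted input goes wrong.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1          # left key too small, advance left
        elif lk > rk:
            j += 1          # right key too small, advance right
        else:
            # Find the run of equal keys on the right side.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][0] == lk:
                for _, rv in right[j:j_end]:
                    out.append((lk, left[i][1], rv))
                i += 1
            j = j_end
    return out


right = [("a", "x"), ("c", "y")]

# Sorted left input: both matches are found.
sorted_result = merge_join([("a", 1), ("b", 2), ("c", 3)], right)

# Same rows, but the order was disturbed: the match on "a" is silently lost.
unsorted_result = merge_join([("c", 3), ("a", 1), ("b", 2)], right)
```

With the sorted input the join returns the rows for both "a" and "c"; with the reordered input it returns only the row for "c" and raises no error. A step between the sorted source and the join whose effect on the key order cannot be verified carries that risk, which is presumably why such steps are rejected up front.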

I have to use a UDF to transform the Map into a Bag. Any ideas for a workaround?

thanks
2013/9/13 John <[EMAIL PROTECTED]>

> Hi,
>
> I am trying to use a merge join on 2 bags. Here is my Pig code:
> http://pastebin.com/Y9b2UtNk .
>
> But I got this error:
>
> Caused by:
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogicalToPhysicalTranslatorException:
> ERROR 1103: Merge join/Cogroup only supports Filter, Foreach, Ascending
> Sort, or Load as its predecessors. Found
>
> I think the reason is that there is no sort statement or something
> similar. But the bags are definitely sorted. How can I do the merge join?
>
> thanks
>