MapReduce >> mail # user >> Algorithm for cross product

Algorithm for cross product
Assume I have two data sources, A and B, and an input format that can
generate key/value pairs for both.
I want an algorithm that, for each key K, generates the cross product of
all values in A having the key K and all values in B having the key K.
Currently I use a mapper to generate key/value pairs for A, and have the
reducer get all values in B with key K and hold them in memory.
It works, but it might not scale, since every value in B for a given key
has to fit in the reducer's memory.
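
To make the setup concrete, here is a minimal sketch of the approach
described above, written in plain Python rather than against the Hadoop
API. The function names (group_by_key, reduce_key, cross_product_by_key)
are hypothetical, the grouping step only stands in for the shuffle, and
the dictionary lookup stands in for however the reducer actually fetches
the values of B.

```python
from collections import defaultdict


def group_by_key(records):
    """Group (key, value) pairs by key; stands in for the shuffle phase."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    return groups


def reduce_key(key, a_values, b_values_for_key):
    """Emit every (key, a, b) pair for one key.

    All B values for the key are buffered in a list first. This buffer is
    the part that may not scale when a key has many values in B.
    """
    b_buffer = list(b_values_for_key)
    for a in a_values:
        for b in b_buffer:
            yield key, a, b


def cross_product_by_key(a_records, b_records):
    """Yield the per-key cross product of the values in A and B."""
    a_groups = group_by_key(a_records)  # assumed: mapper output for A, grouped
    b_groups = group_by_key(b_records)  # assumed: how the reducer finds B by key
    for key in sorted(a_groups):
        # Keys that appear in only one source produce no output.
        yield from reduce_key(key, a_groups[key], b_groups.get(key, ()))
```

For example, with A = [("k1", "a1"), ("k1", "a2")] and
B = [("k1", "b1"), ("k1", "b2")], the sketch yields the four pairs
(a1, b1), (a1, b2), (a2, b1), (a2, b2) for key k1. Memory use per key
grows with the number of values in B for that key.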

Any bright ideas?

Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com