Re: Partitioned Datasets Map/Reduce
Hemanth Yamijala 2010-07-06, 04:40
Hi,
> I have written my custom partitioner for partitioning datasets. I want to
> partition two datasets using the same partitioner and then in the next
> mapreduce job, I want each mapper to handle the same partition from the two
> sources and perform some function such as joining etc. How can I ensure that
> one mapper gets the split that corresponds to the same partition from both the
> sources?
>

Not really an answer to your specific question, but have you taken a look at Pig (http://hadoop.apache.org/pig), which is suitable for operations like joining data sets?
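
For anyone reading along, here is a rough plain-Java sketch of the idea in the quoted question: if both datasets are split with the same partitioner and the same number of partitions, then partition i of one dataset only ever needs to be joined with partition i of the other. This does not use any Hadoop or Pig classes and does not show how to make Hadoop hand matching splits to one mapper. The class name `CoPartitionedJoin`, the hash-based `partitionFor`, and the sample records are all made up for illustration; the original poster's partitioner is not shown in the thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: simulates co-partitioned datasets in memory, no Hadoop APIs.
class CoPartitionedJoin {

    // Hypothetical stand-in for a custom partitioner: equal keys always
    // map to the same partition index, whichever dataset they come from.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Split one dataset (key -> value records) into numPartitions buckets.
    static List<Map<String, String>> partition(Map<String, String> dataset, int numPartitions) {
        List<Map<String, String>> parts = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            parts.add(new HashMap<>());
        }
        for (Map.Entry<String, String> e : dataset.entrySet()) {
            parts.get(partitionFor(e.getKey(), numPartitions)).put(e.getKey(), e.getValue());
        }
        return parts;
    }

    // The work one "mapper" would do: inner-join a single pair of matching partitions.
    static Map<String, String> joinPartition(Map<String, String> left, Map<String, String> right) {
        Map<String, String> joined = new HashMap<>();
        for (Map.Entry<String, String> e : left.entrySet()) {
            String other = right.get(e.getKey());
            if (other != null) {
                joined.put(e.getKey(), e.getValue() + "," + other);
            }
        }
        return joined;
    }

    // Join two datasets by pairing partition i of one with partition i of the other.
    static Map<String, String> join(Map<String, String> a, Map<String, String> b, int numPartitions) {
        List<Map<String, String>> partsA = partition(a, numPartitions);
        List<Map<String, String>> partsB = partition(b, numPartitions);
        Map<String, String> result = new TreeMap<>();
        for (int i = 0; i < numPartitions; i++) {
            result.putAll(joinPartition(partsA.get(i), partsB.get(i)));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> users = new HashMap<>();
        users.put("u1", "alice");
        users.put("u2", "bob");
        users.put("u3", "carol");

        Map<String, String> orders = new HashMap<>();
        orders.put("u1", "book");
        orders.put("u3", "pen");
        orders.put("u4", "lamp");

        // Only keys present in both datasets (u1 and u3) survive the inner join.
        System.out.println(join(users, orders, 4));
    }
}
```

The point of the sketch is only that no cross-partition lookups are needed once both sides share a partitioner and partition count. In a real job the two inputs would also have to be laid out so that matching partitions reach the same task, which is the part the question is actually asking about.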