Re: Re: How To Distribute One Map Data To All Reduce Tasks?
Karthik Kambatla 2012-07-05, 08:10
One way to achieve this would be to:
1. Emit the same value multiple times, each time with a different key.
2. Use these different keys, in conjunction with the partitioner, to achieve
   the desired distribution.

Hope that helps!
Karthik

On Thu, Jul 5, 2012 at 12:19 AM, 静行 <[EMAIL PROTECTED]> wrote:

> I join two tables on different key values, but only a few key values
> have a large amount of data to join, and those take most of the time,
> so I want to distribute these key values to every reduce task for the join.
>
> *From:* Devaraj k [mailto:[EMAIL PROTECTED]]
> *Sent:* July 5, 2012 14:06
> *To:* [EMAIL PROTECTED]
> *Subject:* RE: How To Distribute One Map Data To All Reduce Tasks?
>
> Can you explain your use case in more detail?
>
> Thanks
> Devaraj
> ------------------------------
>
> *From:* 静行 [[EMAIL PROTECTED]]
> *Sent:* Thursday, July 05, 2012 9:53 AM
> *To:* [EMAIL PROTECTED]
> *Subject:* Re: How To Distribute One Map Data To All Reduce Tasks?
>
> Thanks!
>
> But what I really want to know is how I can distribute one map's data to
> every reduce task, not to just one of the reduce tasks.
>
> Do you have any ideas?
>
> *From:* Devaraj k [mailto:[EMAIL PROTECTED]]
> *Sent:* July 5, 2012 12:12
> *To:* [EMAIL PROTECTED]
> *Subject:* RE: How To Distribute One Map Data To All Reduce Tasks?
>
> You can distribute the map data to the reduce tasks using a Partitioner.
> By default, Job uses the HashPartitioner. You can use a custom Partitioner
> according to your needs.
>
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Partitioner.html
>
> Thanks
> Devaraj
> ------------------------------
>
> *From:* 静行 [[EMAIL PROTECTED]]
> *Sent:* Thursday, July 05, 2012 9:00 AM
> *To:* [EMAIL PROTECTED]
> *Subject:* How To Distribute One Map Data To All Reduce Tasks?
>
> Hi all:
>
> How can I distribute one map's data to all reduce tasks?
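
The two steps Karthik lists at the top of this reply (emit the same value under several keys, then let the partitioner route by key) might look roughly like the sketch below. It is plain Java with no Hadoop classes, so it only illustrates the routing logic. The class name, the `#` separator, and the `key#reducerIndex` key format are assumptions made for illustration and are not from the thread. In a real job, `replicate` would correspond to what the mapper writes, and `partition` to the body of `getPartition` in a custom `org.apache.hadoop.mapreduce.Partitioner`.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java sketch of "replicate, then partition". No Hadoop classes are used. */
final class ReplicateToAllReducers {
    // Hypothetical separator between the original key and the target reducer index.
    static final char SEP = '#';

    /** Step 1: the keys a mapper would emit for one record, one per reduce task. */
    static List<String> replicate(String key, int numReducers) {
        List<String> keys = new ArrayList<>();
        for (int r = 0; r < numReducers; r++) {
            keys.add(key + SEP + r);
        }
        return keys;
    }

    /** Step 2: route a tagged key to the reducer named in its suffix. */
    static int partition(String key, int numReducers) {
        int i = key.lastIndexOf(SEP);
        if (i >= 0) {
            try {
                int target = Integer.parseInt(key.substring(i + 1));
                if (target >= 0 && target < numReducers) {
                    return target;
                }
            } catch (NumberFormatException e) {
                // Suffix is not a reducer index; fall through to hashing.
            }
        }
        // Untagged keys: same formula as Hadoop's default HashPartitioner.
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        int numReducers = 4;
        // Each replicated key should land on a different reducer.
        for (String k : replicate("hotKey", numReducers)) {
            System.out.println(k + " -> reducer " + partition(k, numReducers));
        }
    }
}
```

Because `hotKey#0`, `hotKey#1`, and so on are distinct keys, each reducer would need to strip the suffix before joining. Keys without a suffix fall back to hash partitioning, so only the few heavy keys mentioned in the thread need to be replicated.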