MapReduce, mail # user - How To Distribute One Map Data To All Reduce Tasks?


静行 2012-07-05, 03:30
Devaraj k 2012-07-05, 04:11
静行 2012-07-05, 04:23
Devaraj k 2012-07-05, 06:06
静行 2012-07-05, 07:19
Re: Reply: How To Distribute One Map Data To All Reduce Tasks?
Karthik Kambatla 2012-07-05, 08:10
One way to achieve this would be to:

   1. Emit the same value multiple times, each time with a different key.
   2. Use these different keys, in conjunction with the partitioner, to
   achieve the desired distribution.
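
The two steps above can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not code from this thread: the class `FanOutSketch`, its method names, and the `#` separator are all hypothetical, and the sketch assumes `#` never occurs in a real join key. In an actual job, `fanOutKeys` would correspond to the keys written from `map()`, and `getPartition` to the body of a custom `Partitioner.getPartition(key, value, numPartitions)`. Untagged keys fall back to the same arithmetic HashPartitioner uses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Hadoop): a plain-Java model of the two steps.
final class FanOutSketch {
    // Assumed separator between the original key and the target reducer index.
    static final char SEP = '#';

    // Step 1: the keys a map() call would emit for a record that must reach
    // every reducer: one tagged copy of the key per reduce task.
    static List<String> fanOutKeys(String joinKey, int numReducers) {
        List<String> keys = new ArrayList<>(numReducers);
        for (int r = 0; r < numReducers; r++) {
            keys.add(joinKey + SEP + r);
        }
        return keys;
    }

    // Step 2: the logic a custom partitioner would apply to each key.
    static int getPartition(String key, int numPartitions) {
        int sep = key.lastIndexOf(SEP);
        if (sep >= 0) {
            // Tagged key: route it to the reducer named by its index.
            return Integer.parseInt(key.substring(sep + 1)) % numPartitions;
        }
        // Untagged key: hash partitioning, as HashPartitioner does.
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Reducer side: strip the tag to recover the original join key.
    static String originalKey(String key) {
        int sep = key.lastIndexOf(SEP);
        return sep >= 0 ? key.substring(0, sep) : key;
    }
}
```

With 4 reduce tasks, `fanOutKeys("hot", 4)` yields `hot#0` through `hot#3`, and `getPartition` sends each copy to a different reducer. For the join described below, the records from the other table that share a heavy key would presumably each carry a single tag so that they meet one of these copies; that part is left out of the sketch.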

Hope that helps!

Karthik

On Thu, Jul 5, 2012 at 12:19 AM, 静行 <[EMAIL PROTECTED]> wrote:

> I am joining two tables on many different key values, but only a few of
> those keys have a large amount of data to join, and they take most of the
> time. So I want to distribute the data for these keys to every reduce task
> for the join.
>
> *From:* Devaraj k [mailto:[EMAIL PROTECTED]]
> *Sent:* July 5, 2012 14:06
> *To:* [EMAIL PROTECTED]
> *Subject:* RE: How To Distribute One Map Data To All Reduce Tasks?
>
> Can you explain your use case in more detail?
>
> Thanks
>
> Devaraj
>  ------------------------------
>
> *From:* 静行 [[EMAIL PROTECTED]]
> *Sent:* Thursday, July 05, 2012 9:53 AM
> *To:* [EMAIL PROTECTED]
> *Subject:* Reply: How To Distribute One Map Data To All Reduce Tasks?
>
> Thanks!
>
> But what I really want to know is how I can distribute one map's data to
> every reduce task, not to just one of the reduce tasks.
>
> Do you have any ideas?
>
> *From:* Devaraj k [mailto:[EMAIL PROTECTED]]
> *Sent:* July 5, 2012 12:12
> *To:* [EMAIL PROTECTED]
> *Subject:* RE: How To Distribute One Map Data To All Reduce Tasks?
>
> You can distribute the map data to the reduce tasks using a Partitioner. By
> default, a Job uses the HashPartitioner. You can supply a custom Partitioner
> according to your needs.
>
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Partitioner.html
>
> Thanks
>
> Devaraj
>  ------------------------------
>
> *From:* 静行 [[EMAIL PROTECTED]]
> *Sent:* Thursday, July 05, 2012 9:00 AM
> *To:* [EMAIL PROTECTED]
> *Subject:* How To Distribute One Map Data To All Reduce Tasks?
>
> Hi all:
>
> How can I distribute one map's data to all reduce tasks?
>  ------------------------------
>
>
> This email (including any attachments) is confidential and may be legally
> privileged. If you received this email in error, please delete it
> immediately and do not copy it or use it for any purpose or disclose its
> contents to any other person. Thank you.
>