MapReduce >> mail # user >> How to balance reduce job


rauljin 2013-04-17, 04:53
bejoy.hadoop@... 2013-04-17, 05:09
Mohammad Tariq 2013-04-17, 05:16
bejoy.hadoop@... 2013-04-17, 05:38
rauljin 2013-04-17, 06:47
Re: Re: How to balance reduce job
You can use an input sampler, and you have to plug in a custom partitioner
that ensures all reducers get a near-equal number of key/value pairs to
process. The input sampler reads a sample of the input before the job starts
executing. I had some doubts about this myself, but got no response.

Thanks,
Rahul
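
A minimal sketch of the sampling idea described above, written in plain Python rather than against the Hadoop API (Hadoop itself ships `InputSampler` and `TotalOrderPartitioner` classes for this). The function names and the skewed test data are hypothetical, and the sketch assumes keys are comparable and that no single key dominates the input:

```python
import bisect
import random


def sample_split_points(keys, num_reducers, sample_size=1000, seed=0):
    """Pick num_reducers - 1 split points from a random sample of the keys."""
    rng = random.Random(seed)
    sample = sorted(rng.sample(keys, min(sample_size, len(keys))))
    # Evenly spaced quantiles of the sorted sample become range boundaries.
    return [sample[(i * len(sample)) // num_reducers]
            for i in range(1, num_reducers)]


def partition(key, split_points):
    """Return the reducer index whose key range contains this key."""
    return bisect.bisect_right(split_points, key)


def reducer_loads(keys, num_reducers, split_points):
    """Count how many keys each reducer would receive."""
    loads = [0] * num_reducers
    for key in keys:
        loads[partition(key, split_points)] += 1
    return loads


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical skewed key distribution: most keys are small.
    keys = [rng.expovariate(1.0) for _ in range(20000)]
    splits = sample_split_points(keys, num_reducers=8)
    print(reducer_loads(keys, 8, splits))
```

Because the boundaries come from quantiles of the sample instead of fixed-width ranges, each reducer's range holds roughly the same share of keys even when the key distribution is skewed. A single very frequent key still lands on one reducer, so this does not help with hot keys.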
On Wed, Apr 17, 2013 at 12:17 PM, rauljin <[EMAIL PROTECTED]> wrote:

>      <property>
>         <name>mapred.tasktracker.map.tasks.maximum</name>
>         <value>4</value>
>     </property>
>
>     <property>
>         <name>mapred.tasktracker.reduce.tasks.maximum</name>
>         <value>4</value>
>     </property>
>
>    I am not clear about the number of reduce slots in each TaskTracker.
> Is it defined in the configuration?
>
>
>
>
>
> ------------------------------
> rauljin
>
>  *From:* bejoy.hadoop <[EMAIL PROTECTED]>
> *Date:* 2013-04-17 13:09
> *To:* user <[EMAIL PROTECTED]>; liujin666jin <[EMAIL PROTECTED]>
> *Subject:* Re: How to balance reduce job
>  Hi Rauljin
>
> A few things to check here.
> What is the number of reduce slots in each TaskTracker? What is the
> number of reduce tasks for your job?
> Reduce tasks are scheduled on TaskTrackers (TTs) based on slot availability.
>
> You can do the following:
> - Set the number of reduce tasks to 8 or more.
> - Adjust the number of slots (tweaking this at the job level is not very
>   advisable).
>
> The reducers are scheduled purely based on slot availability, so it
> won't be easy to ensure that all TTs are evenly loaded with the same
> number of reducers.
> Regards
> Bejoy KS
>
> Sent from remote device, Please excuse typos
> ------------------------------
> *From: *rauljin <[EMAIL PROTECTED]>
> *Date: *Wed, 17 Apr 2013 12:53:37 +0800
> *To: *[EMAIL PROTECTED]<[EMAIL PROTECTED]>
> *ReplyTo: *[EMAIL PROTECTED]
> *Subject: *How to balance reduce job
>
> I have 8 datanodes in my Hadoop cluster, but when running a reduce job,
> only 2 datanodes run it.
>
> I want to use all 8 datanodes to run the reduce job, so that I can balance
> the I/O pressure.
>
> Any ideas?
>
> Thanks.
>
> ------------------------------
> rauljin
>
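
The slot arithmetic behind the quoted advice can be sketched as follows. This is only a capacity calculation under the thread's numbers (8 TaskTrackers, 4 reduce slots each), not a model of how the scheduler actually places tasks; the function name is hypothetical:

```python
import math


def nodes_used_bounds(num_reduce_tasks, num_trackers, reduce_slots_per_tracker):
    """Return (fewest, most) TaskTrackers that can be running reducers at once."""
    total_slots = num_trackers * reduce_slots_per_tracker
    # Only as many reduce tasks as there are slots can run concurrently.
    running = min(num_reduce_tasks, total_slots)
    # Fewest nodes: tasks pack into full trackers. Most: one task per tracker.
    fewest = math.ceil(running / reduce_slots_per_tracker)
    most = min(running, num_trackers)
    return fewest, most


if __name__ == "__main__":
    print(nodes_used_bounds(8, 8, 4))  # 8 reduce tasks, 4 slots per tracker
    print(nodes_used_bounds(8, 8, 1))  # 8 reduce tasks, 1 slot per tracker
```

With 4 reduce slots per tracker, 8 reduce tasks can still fit on as few as 2 nodes, which is consistent with the remark that even loading is not guaranteed. With 1 reduce slot per tracker, 8 concurrent reduce tasks must occupy all 8 nodes.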
Tony Burton 2013-05-07, 15:13