Hadoop, mail # user - How to lower the total number of map tasks


Re: How to lower the total number of map tasks
Shing Hing Man 2012-10-02, 17:33


I set the block size using
  Configuration.setInt("dfs.block.size", 134217728);
I have also set it in mapred-site.xml.

Shing

________________________________
 From: Chris Nauroth <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]; Shing Hing Man <[EMAIL PROTECTED]>
Sent: Tuesday, October 2, 2012 6:00 PM
Subject: Re: How to lower the total number of map tasks
 

Those numbers make sense, considering 1 map task per block.  16 GB file / 64 MB block size = ~242 map tasks.

When you doubled dfs.block.size, how did you accomplish that?  Typically, the block size is selected at file write time, with a default value from system configuration used if not specified.  Did you "hadoop fs -put" the file with the new block size, or was it something else?

Thank you,
--Chris
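
The arithmetic above can be checked with a minimal sketch, assuming the split-size rule of the old-API FileInputFormat in Hadoop 1.x: max(minSize, min(goalSize, blockSize)), where blockSize is the block size recorded for the file when it was written, not the current dfs.block.size setting. The class and method names, the file size of exactly 242 full blocks, and the map-task hint of 2 are illustrative assumptions, and the ceiling division ignores the small slack Hadoop allows on the last split.

```java
public class SplitEstimate {

    // Split-size rule assumed from the old-API FileInputFormat:
    // max(minSize, min(goalSize, blockSize)).
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Rough number of splits for one file (ceiling division).
    // Real Hadoop lets the last split run slightly over, so this is an estimate.
    static long estimateSplits(long fileSize, long splitSize) {
        return (fileSize + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileBlockSize = 64 * mb;         // block size the file was written with
        long fileSize = 242L * fileBlockSize; // assumed: exactly 242 full blocks
        long goalSize = fileSize / 2;         // assumed map-task hint of 2

        // Case 1: mapred.min.split.size = 0 (Hadoop still enforces a floor of 1 byte).
        long split1 = computeSplitSize(goalSize, 1, fileBlockSize);
        System.out.println("min split 1 B    -> " + estimateSplits(fileSize, split1) + " splits");

        // Case 2: the file keeps its 64 MB blocks, but the minimum split size is raised to 128 MB.
        long split2 = computeSplitSize(goalSize, 128 * mb, fileBlockSize);
        System.out.println("min split 128 MB -> " + estimateSplits(fileSize, split2) + " splits");
    }
}
```

Under these assumptions, changing dfs.block.size in the job configuration alone leaves the first case untouched, because the existing file still reports 64 MB blocks. Rewriting the file with larger blocks or raising mapred.min.split.size changes the split size, and with it the number of map tasks.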
On Tue, Oct 2, 2012 at 9:34 AM, Shing Hing Man <[EMAIL PROTECTED]> wrote:
>
>
>I am running Hadoop 1.0.3 in pseudo-distributed mode.
>When I submit a map/reduce job to process a file of about 16 GB, job.xml contains the following:
>
>
>mapred.map.tasks = 242
>mapred.min.split.size = 0
>dfs.block.size = 67108864
>
>
>I would like to reduce mapred.map.tasks to see if it improves performance.
>I have tried doubling dfs.block.size, but mapred.map.tasks remains unchanged.
>Is there a way to reduce mapred.map.tasks?
>
>
>Thanks in advance for any assistance!
>Shing
>
>