-
Setting number of mappers in Teragen
anil gupta 2012-12-26, 18:19
Hi All,
I have 5 worker nodes with 4 map slots per node, so there are 20 map slots in my cluster. But when I start my Teragen job, it spawns only 2 mappers for the entire job. I have even tried the option -Dmapred.map.tasks = 20 . Can anyone tell me how to force Teragen to use 20 mappers for generating the data? I am using CDH 4.1.2 with MapReduce v1 (Hadoop 0.20.2).
-- Thanks & Regards, Anil Gupta
-
Re: Setting number of mappers in Teragen
Harsh J 2012-12-26, 18:33
In MR1, the number of teragen mappers depends on the total number of rows and the requested number of maps.
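To illustrate the relationship between row count and map count, here is a rough sketch that assumes the rows are divided evenly among the requested maps, with the last map picking up any remainder. The helper name `rows_per_map` and the sample numbers are hypothetical, and the real teragen split logic may differ in detail.

```shell
# Hypothetical helper: print how many rows each map would generate,
# ASSUMING an even split where the last map takes the remainder.
rows_per_map() {
  total_rows=$1
  num_maps=$2
  step=$(( total_rows / num_maps ))
  i=1
  while [ "$i" -lt "$num_maps" ]; do
    echo "map $i: $step rows"
    i=$(( i + 1 ))
  done
  # The last map covers whatever is left over.
  echo "map $num_maps: $(( total_rows - step * (num_maps - 1) )) rows"
}

rows_per_map 1000 3
```

If the requested map count never reaches the job (for example because the -D option was not parsed), the job falls back to whatever mapred.map.tasks is configured to, which would be consistent with seeing only a couple of mappers.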
How exactly are you passing -Dmapred.map.tasks=20 (no spaces)? All generic options must come before any other arguments, so it should appear right after the word "teragen" in your command.
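As a minimal sketch of that ordering, the helper below only prints a teragen command line with the -D option placed directly after "teragen" and before the row count and directory. The jar name `hadoop-examples.jar`, the helper name `teragen_cmd`, the row count, and the directory are all assumptions for illustration; substitute the values from your own installation.

```shell
# Hypothetical helper: build (but do not run) a teragen command line.
# The jar name below is an assumption; use the examples jar shipped
# with your distribution.
teragen_cmd() {
  num_maps=$1
  num_rows=$2
  out_dir=$3
  # Generic options (-D...) go right after "teragen", with no spaces
  # around "=", and before the positional arguments.
  echo "hadoop jar hadoop-examples.jar teragen -Dmapred.map.tasks=${num_maps} ${num_rows} ${out_dir}"
}

teragen_cmd 20 10000000 /tmp/teragen-out
```

Placing the same -D option after the positional arguments would leave it as an ordinary program argument rather than a parsed configuration setting, which matches the behavior described in this thread.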
-- Harsh J
-
Re: Setting number of mappers in Teragen
anil gupta 2012-12-26, 18:41
Hi Harsh,
Fixed it. I was putting -Dmapred.map.tasks=20 after specifying the input directory. I completely forgot about this trick of Hadoop's GenericOptionsParser. Thanks a lot. :)
-- Thanks & Regards, Anil Gupta