MapReduce, mail # user - Re: config for high memory jobs does not work, please help.
Shaojun Zhao 2013-01-18, 22:48
I do have this in my command line, and it did not work:
-Dmapred.tasktracker.map.tasks.maximum=2

I also tried changing mapred-site.xml and restarting the tasktracker, but it
did not work either. I am sure it will work if I restart everything,
but I really do not want to lose my data on HDFS, so I have not tried
restarting everything.
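[Editor's note: in Hadoop 0.20.x, mapred.tasktracker.map.tasks.maximum is a TaskTracker daemon setting, read once at startup from mapred-site.xml, which is why passing it per job with -D has no effect. A minimal mapred-site.xml sketch, with illustrative values:

```xml
<!-- mapred-site.xml on each TaskTracker node (values are illustrative) -->
<configuration>
  <property>
    <!-- Cap on concurrent map tasks per TaskTracker; a daemon-level
         setting read at startup, not overridable per job with -D -->
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>1</value>
  </property>
  <property>
    <!-- The analogous cap for reduce tasks -->
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>1</value>
  </property>
</configuration>
```

Restarting only the TaskTrackers (and the JobTracker if needed) is enough to pick this up; HDFS data is held by the NameNode and DataNodes, which do not need to be restarted.]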

Best regards,
-Shaojun
On Fri, Jan 18, 2013 at 12:23 PM, Jeffrey Buell <[EMAIL PROTECTED]> wrote:
> Try:
>
> -Dmapred.tasktracker.map.tasks.maximum=1
>
> Although I usually put this parameter in mapred-site.xml.
>
> Jeff
>
>
> Dear all,
>
> I know it is best to use a small amount of memory in the mapper and
> reducer. However, sometimes it is hard to do so. For example, in machine
> learning algorithms, it is common to load the model into memory in the
> mapper step. When the model is big, I have to allocate a lot of memory
> for the mapper.
>
> Here is my question: how can I config hadoop so that it does not fork
> too many mappers and run out of physical memory?
>
> My machines have 24G each, and I have 100 of them. Each time, Hadoop
> forks 6 mappers on each machine, no matter what config I use. I really
> want to reduce it to whatever number I want, for example, just one
> mapper per machine.
>
> Here are the configs I tried. (I use streaming, and I pass the config
> on the command line.)
>
> -Dmapred.child.java.opts=-Xmx8000m  <-- did not bring down the number of mappers
>
> -Dmapred.cluster.map.memory.mb=32000 <-- did not bring down the number
> of mappers
>
> Am I missing something here?
> I use Hadoop 0.20.205
>
> Thanks a lot in advance!
> -Shaojun
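[Editor's note: on Hadoop 0.20.205, mapred.cluster.map.memory.mb is a cluster-side slot-size setting read by the JobTracker and TaskTrackers from their own config files, so passing it with -D per job is ignored; the per-job knob is mapred.job.map.memory.mb, and it only influences scheduling when the cluster-side memory settings (and typically the CapacityScheduler) are configured. A hedged streaming sketch, where the jar path, scripts, HDFS paths, and values are placeholders:

```shell
# Illustrative streaming invocation; mapper.py/reducer.py and the HDFS
# paths are placeholders. Requesting 24000 MB per map task (against an
# assumed cluster slot size of mapred.cluster.map.memory.mb=8000) makes
# each map task occupy multiple slots, so fewer mappers fit on each
# 24G node. -Xmx only caps the JVM heap; by itself it never reduces
# the number of mappers launched.
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-0.20.205.0.jar \
  -D mapred.job.map.memory.mb=24000 \
  -D mapred.child.java.opts=-Xmx8000m \
  -input /user/shaojun/input \
  -output /user/shaojun/output \
  -mapper mapper.py \
  -reducer reducer.py \
  -file mapper.py -file reducer.py
```
]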