HDFS, mail # user - config for high memory jobs does not work, please help.


Shaojun Zhao 2013-01-18, 20:05
Dear all,

I know it is best to use a small amount of memory in the mapper and
reducer. However, sometimes that is hard to do. For example, in machine
learning algorithms it is common to load the model into memory in the
mapper step. When the model is big, I have to allocate a lot of memory
for each mapper.

Here is my question: how can I configure Hadoop so that it does not fork
too many mappers and run out of physical memory?

I have 100 machines, each with 24G of memory. Every time, Hadoop forks
6 mappers on each machine, no matter what config I use. I really want
to reduce that to whatever number I choose, for example just 1 mapper
per machine.

Here are the settings I tried. (I use streaming, and I pass them on the
command line.)

-Dmapred.child.java.opts=-Xmx8000m  <-- did not bring down the number of mappers

-Dmapred.cluster.map.memory.mb=32000  <-- did not bring down the number of mappers

Am I missing something here?
I am using Hadoop 0.20.205.

Thanks a lot in advance!
-Shaojun
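
To make the numbers in the post concrete, here is a rough sketch of the arithmetic, written in Python purely for illustration. The 24G per machine, the 6 mappers per machine, and the 8000m heap come from the message above; the function and constant names are hypothetical, and the calculation counts only the JVM heap limit, ignoring non-heap JVM overhead, the streaming child process, and the OS, so the real headroom is smaller than shown.

```python
# Figures from the post: 24G per machine, -Xmx8000m per mapper, 6 mappers forked.
# Names below are hypothetical and exist only for this illustration.
NODE_RAM_MB = 24 * 1024       # physical memory per machine, in MB
HEAP_PER_MAPPER_MB = 8000     # from -Dmapred.child.java.opts=-Xmx8000m


def max_heap_demand_mb(mappers_per_node, heap_mb=HEAP_PER_MAPPER_MB):
    """Worst-case total heap if every mapper on a node fills its heap."""
    return mappers_per_node * heap_mb


def mappers_that_fit(node_ram_mb=NODE_RAM_MB, heap_mb=HEAP_PER_MAPPER_MB):
    """Upper bound on concurrent mappers, counting heap only (optimistic)."""
    return node_ram_mb // heap_mb


if __name__ == "__main__":
    for mappers in (6, 3, 1):
        demand = max_heap_demand_mb(mappers)
        verdict = "fits" if demand <= NODE_RAM_MB else "exceeds physical memory"
        print(f"{mappers} mapper(s): {demand} MB of heap, {verdict}")
```

With 6 mappers the worst-case heap demand is 48000 MB against 24576 MB of physical memory, which is why raising -Xmx alone makes the problem worse rather than better: the heap limit changes how much each mapper may use, not how many mappers are started on a node.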