Re: memory management of capacity scheduling
Hemanth Yamijala 2010-06-26, 18:09
Shashank,
> Hi,
>
> Setup Info:
> I have 2 node hadoop (20.2) cluster on Linux boxes.
> HW info: 16 CPU (Hyperthreaded)
> RAM: 32 GB
>
> I am trying to configure capacity scheduling. I want to use memory
> management provided by capacity scheduler. But I am facing few issues.
> I have added hadoop-0.20.2-capacity-scheduler.jar in lib. Also added
> ‘mapred.jobtracker.taskScheduler’ in hadoop-site.xml

First things first: the memory management implementation in the capacity scheduler has seen significant improvements in Hadoop 0.21. Specifically, the implementation in Hadoop 0.20 could cause a high degree of cluster underutilization, which was fixed in MAPREDUCE-516 and subsequent JIRAs in Hadoop 0.21.

> I have added below in capacity-scheduler.xml file, but I get error:
> <property>
>   <name>mapred.tasktracker.vmem.reserved</name>
>   <value>26624m</value>
>   <description>A number, in bytes, that represents an offset. The total VMEM
>   on the machine, minus this offset, is the VMEM node-limit for all
>   tasks, and their descendants, spawned by the TT.
>   </description>
> </property>
> <property>
>   <name>mapred.task.default.maxvmem</name>
>   <value>512k</value>
>   <description>A number, in bytes, that represents the default VMEM
>   task-limit associated with a task. Unless overridden by a job's
>   setting, this number defines the VMEM task-limit.
>   </description>
> </property>
> <property>
>   <name>mapred.task.limit.maxvmem</name>
>   <value>4096m</value>
>   <description>A number, in bytes, that represents the upper VMEM task-limit
>   associated with a task. Users, when specifying a VMEM task-limit for
>   their tasks, should not specify a limit which exceeds this amount.
>   </description>
> </property>
> <property>
>   <name>mapred.tasktracker.pmem.reserved</name>
>   <value>26624m</value>
>   <description>Physical Memory
>   </description>
> </property>

IIRC, these parameters were removed and certain new parameters were introduced.
Trunk's documentation is now updated with the exact list of these parameters, their descriptions, and usage, but I fear the parameter names may have changed between Hadoop 0.20 and trunk. Your best bet could be to try out the parameters listed in http://bit.ly/97SDz2.

>
> Error:
> 2010-06-25 08:02:06,026 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.io.IOException: Call to node1.hadoopcluster.com/192.168.1.241:9001 failed on local exception: java.io.IOException: Connection reset by peer
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
>     at org.apache.hadoop.ipc.Client.call(Client.java:743)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>     at org.apache.hadoop.mapred.$Proxy1.getProtocolVersion(Unknown Source)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:346)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:383)
>     at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:314)
>     at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:291)
>     at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:514)
>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:934)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2833)
> Caused by: java.io.IOException: Connection reset by peer
>     at sun.nio.ch.FileDispatcher.read0(Native Method)
>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:33)
>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:234)
>     at sun.nio.ch.IOUtil.read(IOUtil.java:207)
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
>     at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>     at