You can use the property below (mapred.child.java.opts) to pass debug parameters to the child JVM. You should also make sure that only one task runs at a time, by sizing the input appropriately.
<description>Java opts for the task tracker child processes.
The following symbol, if present, will be interpolated: @taskid@ is replaced
by current TaskID. Any other occurrences of '@' will go unchanged.
For example, to enable verbose gc logging to a file named for the taskid in
/tmp and to set the heap maximum to be a gigabyte, pass a 'value' of:
-Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc
The configuration variable mapred.child.ulimit can be used to control the
maximum virtual memory of the child processes.
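As a rough sketch of how this could look for remote debugging, the fragment below sets the child JVM options to start the standard JDWP agent and limits each TaskTracker to one map and one reduce slot. It assumes Hadoop 1.x property names and a mapred-site.xml on the TaskTracker nodes; port 8000 and the 1024m heap are arbitrary choices for illustration, not values from this thread.

```xml
<!-- Sketch only: Hadoop 1.x property names assumed; port 8000 is an arbitrary choice -->
<property>
  <name>mapred.child.java.opts</name>
  <!-- suspend=y makes each child JVM wait until a debugger attaches -->
  <value>-Xmx1024m -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000</value>
</property>
<property>
  <!-- one map slot per TaskTracker, so two map tasks never compete for the debug port -->
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>1</value>
</property>
<property>
  <!-- one reduce slot per TaskTracker, for the same reason -->
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>1</value>
</property>
<property>
  <!-- 0 disables the task timeout, so a task paused at a breakpoint is not killed -->
  <name>mapred.task.timeout</name>
  <value>0</value>
</property>
```

With something like this in place you would attach a remote debugger to port 8000 on the node running the task. The slot limits are read when the TaskTracker starts, so they likely need a restart to take effect. Every child JVM launched with these options will wait for a debugger, which may include job setup and cleanup tasks, and a map and a reduce running on the same node at the same time would still collide on the port.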
From: Ravi Prakash [[EMAIL PROTECTED]]
Sent: Thursday, March 29, 2012 9:36 PM
To: [EMAIL PROTECTED]
Subject: Re: Debug MR tasks impossible.
I know it's sub-optimal, but you should be able to put in as many System.out.println / log messages as you want and see them in the stdout and syslog files. Which version of Hadoop are you using?
On Thu, Mar 29, 2012 at 10:33 AM, Pedro Costa <[EMAIL PROTECTED]> wrote:
I've been trying to debug map and reduce tasks for quite a long time, and it seems to be impossible. MR tasks are launched in a new process and there's no way to debug them. Even with the IsolationRunner class it's impossible. This isn't good, because I really need to debug the class to understand some changes that I made to the code.
How do MapReduce programmers debug the code that they implement?