Okay Harsh: your hint was enough to get me back on track! I found the
Linux container logs and they are wonderful :)... I guess at the end of
each container run, logs get propagated into the distributed file system's
In any case, once I dug in there, I found the cryptic failure was because
my done_intermediate permissions were bad.
Anyway, thanks for the hint, Harsh! After monitoring the local
/var/log/hadoop-yarn/container/ directory, I was able to see that the
stdout/stderr files were being deleted, and then after some googling I
found a post about how YARN aggregates logs into the DFS.
Anyway, problem solved. For those curious: if you are debugging
YARN Linux containers that are dying (as shown in the [local]
/var/log/hadoop-yarn/ nodemanager logs), you can dig further after the task
dies by going into
hadoop fs -cat
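
For anyone landing here later, a rough sketch of how one might locate the
aggregated logs. It assumes log aggregation is enabled
(yarn.log-aggregation-enable=true) and that
yarn.nodemanager.remote-app-log-dir and its suffix are left at their
defaults (/tmp/logs and "logs"); your cluster may differ. The application
ID below is a made-up placeholder, not one from my job.

```shell
#!/bin/sh
# Hypothetical placeholder; take the real ID from the ResourceManager UI
# or from the nodemanager logs.
APP_ID="application_0000000000000_0001"

# Assumed defaults for yarn.nodemanager.remote-app-log-dir and
# yarn.nodemanager.remote-app-log-dir-suffix.
REMOTE_LOG_DIR="/tmp/logs"
LOG_SUFFIX="logs"

# User that submitted the job (assumed here to be the current user).
LOG_USER="${USER:-$(id -un)}"

# Aggregated logs are laid out as <remote dir>/<user>/<suffix>/<app id>.
AGG_DIR="${REMOTE_LOG_DIR}/${LOG_USER}/${LOG_SUFFIX}/${APP_ID}"
echo "Aggregated logs expected under: ${AGG_DIR}"

# Only touch the cluster if the client tools are actually installed.
if command -v hadoop >/dev/null 2>&1; then
    # One file per nodemanager that ran containers for this application.
    hadoop fs -ls "${AGG_DIR}"
fi
if command -v yarn >/dev/null 2>&1; then
    # Prints the aggregated stdout/stderr/syslog in readable form.
    yarn logs -applicationId "${APP_ID}"
fi
```

The aggregated files are stored in a container format, so yarn logs tends
to be easier to read than a raw hadoop fs -cat on the same files.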
On Fri, Feb 14, 2014 at 9:17 AM, German Florez-Larrahondo <
[EMAIL PROTECTED]> wrote: