We are seeing a "Filesystem closed" exception when a map task is done and
is finishing execution. By map task, I literally mean the MapTask class in
the MapReduce code. While debugging, we found that the mapper obtains a
handle to the FileSystem object and calls close() on it itself. Because
FileSystem objects are cached and shared, I believe the exception is the
expected behaviour.
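To illustrate why a close() in user code can surface later as an exception elsewhere, here is a toy model of a shared cache. It is only a sketch: ToyFileSystem, FsCacheSketch and the "hdfs://nn" key are hypothetical names, not Hadoop classes, and the assumption that the framework holds the same cached instance as the mapper is mine.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class FsCacheSketch {
    /** Hypothetical stand-in for a filesystem client; not a Hadoop class. */
    static class ToyFileSystem {
        private boolean open = true;

        void checkOpen() throws IOException {
            if (!open) {
                throw new IOException("Filesystem closed");
            }
        }

        void close() {
            open = false;
        }
    }

    // One shared instance per URI, mimicking a process-wide cache.
    private static final Map<String, ToyFileSystem> CACHE = new HashMap<>();

    static ToyFileSystem get(String uri, boolean disableCache) {
        if (disableCache) {
            return new ToyFileSystem(); // fresh, unshared instance
        }
        return CACHE.computeIfAbsent(uri, k -> new ToyFileSystem());
    }

    /** Returns true if the framework's handle is still usable after user code closes its own. */
    static boolean frameworkSurvivesUserClose(boolean disableCache) {
        CACHE.clear(); // start from an empty cache for each scenario
        ToyFileSystem frameworkFs = get("hdfs://nn", false); // handle held by the framework
        ToyFileSystem userFs = get("hdfs://nn", disableCache); // handle obtained in the mapper
        userFs.close();
        try {
            frameworkFs.checkOpen();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("shared cache: " + frameworkSurvivesUserClose(false));
        System.out.println("cache disabled: " + frameworkSurvivesUserClose(true));
    }
}
```

With the shared cache, both handles are the same object, so the user's close() also closes the framework's handle. With the cache bypassed, the user closes only a private instance.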
I just wanted to confirm that, if we do need to use a FileSystem object in
a mapper or reducer, we should either:
- not close it ourselves, or
- (seems better to me) ask for a new FileSystem instance by setting the
  fs.hdfs.impl.disable.cache property to true in the job configuration.
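For reference, the second option expressed as a Hadoop-style configuration property might look like the fragment below. Where it belongs (the job configuration, or the configuration block of the Oozie action) depends on the setup, so treat the placement as an assumption.

```xml
<property>
  <name>fs.hdfs.impl.disable.cache</name>
  <value>true</value>
</property>
```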
Also, does anyone know whether this behaviour was any different in Hadoop 0.20?
For some context, this behaviour is actually seen in Oozie, which runs a
launcher mapper for a simple Java action. Hence, the Java action could very
well interact with a file system. I know this is probably better addressed
in the Oozie context, but I wanted to get the MapReduce view of things.
Harsh J 2013-01-25, 06:21
Alejandro Abdelnur 2013-01-30, 18:49
Hemanth Yamijala 2013-01-31, 04:28