Re: MultipleOutputs Files remain in temporary folder
On 05/30/2011 11:02 AM, Panayotis Antonopoulos wrote:
> Hello,
> I just noticed that the files that are created using MultipleOutputs
> remain in the temporary folder, inside the attempt sub-folders, when
> there is no normal output (written using context.write(...)).
>
> Has anyone else noticed that?
> Is there any way to change that and make the files appear in the
> output directory?
>
> Thank you in advance!
> Panagiotis.
        |mapred.local.dir|

This lets the MapReduce servers know where to store intermediate files.
This may be a comma-separated list of directories to spread the load.
Make sure there's enough space here for all your intermediate files. We
share the same disks for MapReduce and HDFS.
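
As a rough illustration of the space check, here is a minimal standalone
Java sketch. It does not use any Hadoop API; it only splits a
comma-separated value the way the property is described above and flags
directories that are missing or low on free space. The class name, the
example paths and the 10 GiB threshold are all assumptions made up for
this example, so substitute the values from your own mapred-site.xml.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop.
final class LocalDirCheck {

    // Split a comma-separated mapred.local.dir value into individual paths,
    // trimming whitespace and skipping empty entries.
    static List<String> parseDirs(String value) {
        List<String> dirs = new ArrayList<>();
        if (value == null) {
            return dirs;
        }
        for (String part : value.split(",")) {
            String dir = part.trim();
            if (!dir.isEmpty()) {
                dirs.add(dir);
            }
        }
        return dirs;
    }

    // Return the entries that are not existing directories or that have
    // less usable space than minFreeBytes.
    static List<String> findProblemDirs(List<String> dirs, long minFreeBytes) {
        List<String> problems = new ArrayList<>();
        for (String dir : dirs) {
            File f = new File(dir);
            if (!f.isDirectory() || f.getUsableSpace() < minFreeBytes) {
                problems.add(dir);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Assumed example value; pass your real setting as the first argument.
        String value = args.length > 0
                ? args[0]
                : "/data/1/mapred/local,/data/2/mapred/local";
        long minFree = 10L * 1024 * 1024 * 1024; // 10 GiB, an arbitrary threshold
        for (String dir : findProblemDirs(parseDirs(value), minFree)) {
            System.out.println("Check " + dir + ": missing or low on space");
        }
    }
}
```

Running it on each node with the configured value as the argument prints
one line per directory that needs attention and nothing otherwise.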
        |mapred.system.dir|

This is a folder in the |defaultFS| where MapReduce stores some control
files. In our case that would be a directory in HDFS. If you have
|dfs.permissions| enabled (which it is by default), make sure that
this directory exists and is owned by mapred:hadoop.
        |mapred.temp.dir|

This is a folder to store temporary files in. It is hardly used, if at
all. If I understand the description correctly, this is supposed to be
in HDFS, but I'm not entirely sure from reading the source code. So we set
this to a directory that exists on the local filesystem as well as in HDFS.

--
Marcos Luis Ortiz Valmaseda
  Software Engineer (Distributed Systems)
  http://uncubanitolinuxero.blogspot.com