Re: MultipleOutputs Files remain in temporary folder
Marcos Ortiz 2011-05-30, 19:51
On 05/30/2011 11:02 AM, Panayotis Antonopoulos wrote:
> I just noticed that the files that are created using MultipleOutputs
> remain in the temporary folder into attempt sub-folders when there is
> no normal output (using context.write(...)).
> Has anyone else noticed that?
> Is there any way to change that and make the files appear in the
> output directory?
> Thank you in advance!
This lets the MapReduce servers know where to store intermediate files.
This may be a comma-separated list of directories to spread the load.
Make sure there's enough space here for all your intermediate files. We
share the same disks for MapReduce and HDFS.
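
A rough way to check the space on each directory in such a comma-separated list is sketched below. This uses only the plain Java standard library, not any Hadoop API. The class name LocalDirCheck, the helper parseDirs, and the /tmp fallback are hypothetical names chosen for illustration. The property this list would come from is not named above; mapred.local.dir is my assumption.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: reports usable space for each
// directory in a comma-separated list (for example, the value one might
// configure for a local-directory property such as mapred.local.dir,
// which is an assumption here).
class LocalDirCheck {

    // Split the list on commas, trim whitespace, and drop empty entries.
    static List<String> parseDirs(String value) {
        List<String> dirs = new ArrayList<>();
        for (String part : value.split(",")) {
            String dir = part.trim();
            if (!dir.isEmpty()) {
                dirs.add(dir);
            }
        }
        return dirs;
    }

    public static void main(String[] args) {
        // Pass the list as the first argument; fall back to /tmp for a quick try.
        String value = args.length > 0 ? args[0] : "/tmp";
        for (String dir : parseDirs(value)) {
            File f = new File(dir);
            // getUsableSpace() returns 0 if the path does not exist.
            System.out.println(dir
                    + " isDirectory=" + f.isDirectory()
                    + " usableBytes=" + f.getUsableSpace());
        }
    }
}
```

Running it with the same string you put in the configuration shows at a glance whether any of the listed disks is missing or nearly full.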
This is a folder in the defaultFS where MapReduce stores some control
files. In our case that would be a directory in HDFS. If you have
dfs.permissions enabled (which it is by default), make sure that
this directory exists and is owned by mapred:hadoop.
This is a folder to store temporary files in. It is hardly used, if at
all. If I understand the description correctly, this is supposed to be
in HDFS, but I'm not entirely sure from reading the source code. So we set
this to a directory that exists on the local filesystem as well as in HDFS.
Marcos Luis Ortiz Valmaseda
Software Engineer (Distributed Systems)