Re: MultipleOutputs Files remain in temporary folder
Marcos Ortiz 2011-05-30, 19:51
On 05/30/2011 11:02 AM, Panayotis Antonopoulos wrote:
> Hello,
> I just noticed that the files that are created using MultipleOutputs
> remain in the temporary folder in attempt sub-folders when there is
> no normal output (using context.write(...)).
>
> Has anyone else noticed that?
> Is there any way to change that and make the files appear in the
> output directory?
>
> Thank you in advance!
> Panagiotis.

mapred.local.dir
This lets the MapReduce servers know where to store intermediate files. This may be a comma-separated list of directories to spread the load. Make sure there's enough space here for all your intermediate files. We share the same disks for MapReduce and HDFS.

mapred.system.dir
This is a folder in the defaultFS where MapReduce stores some control files. In our case that would be a directory in HDFS. If you have dfs.permissions enabled (which it is by default), make sure that this directory exists and is owned by mapred:hadoop.

mapred.temp.dir
This is a folder to store temporary files in. It is hardly used, if at all. If I understand the description correctly, this is supposed to be in HDFS, but I'm not entirely sure from reading the source code. So we set this to a directory that exists on the local filesystem as well as in HDFS.

-- 
Marcos Luis Ortiz Valmaseda
Software Engineer (Distributed Systems)
http://uncubanitolinuxero.blogspot.com
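
P.S. For reference, the three properties described in this reply might look roughly like this in mapred-site.xml. This is only a sketch: every path below is a hypothetical placeholder, not a value taken from this thread, and should be adapted to your own disks and HDFS layout.

```xml
<configuration>
  <!-- Local disks for intermediate files; comma-separated to spread the load.
       Both paths are placeholders. -->
  <property>
    <name>mapred.local.dir</name>
    <value>/data/1/mapred/local,/data/2/mapred/local</value>
  </property>

  <!-- Directory in the default filesystem (HDFS in the setup described here)
       for MapReduce control files. With dfs.permissions enabled it must exist
       and be owned by mapred:hadoop. Placeholder path. -->
  <property>
    <name>mapred.system.dir</name>
    <value>/mapred/system</value>
  </property>

  <!-- Rarely used temporary directory. Placeholder path, assumed to exist
       both on the local filesystem and in HDFS, as described in this reply. -->
  <property>
    <name>mapred.temp.dir</name>
    <value>/mapred/temp</value>
  </property>
</configuration>
```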