HDFS, mail # user - Can hadoop.tmp.dir be multivalued?


anil gupta 2012-12-18, 18:45
Re: Can hadoop.tmp.dir be multivalued?
Harsh J 2012-12-18, 19:13
The purpose of hadoop.tmp.dir is as its name says: it is for actual,
temporary data. To give a smoother out-of-box (OOB) experience, so that
users have little trouble configuring a cluster to get started, we also
use it as a base property for several properties that are actually
required. This is of course not suitable for production; it is done
only for the OOB experience.

If you wish to grant your TaskTracker or NodeManager several disks to
parallelize IO upon, use/override their respective local directory
configurations - and quit leveraging the out-of-box hadoop.tmp.dir
default.
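
As a rough sketch of such an override (not a recommended production
layout): the mount points /data/1, /data/2 and /data/3 below are
hypothetical, and you would list one directory per physical disk. The
property names are the 2.x ones; in 1.x the MapReduce equivalent is
named mapred.local.dir.

```xml
<!-- mapred-site.xml: MapReduce local (intermediate data) directories.
     /data/1, /data/2, /data/3 are assumed mount points, one per disk. -->
<property>
  <name>mapreduce.cluster.local.dir</name>
  <value>/data/1/mapred/local,/data/2/mapred/local,/data/3/mapred/local</value>
</property>

<!-- yarn-site.xml (2.x with YARN): NodeManager local directories,
     again using the same assumed mount points. -->
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/data/1/yarn/local,/data/2/yarn/local,/data/3/yarn/local</value>
</property>
```

With explicit values like these, the local directories no longer
derive from hadoop.tmp.dir, so that property can stay single-valued.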

Also, which version of Hadoop are you asking about? The
property mapreduce.cluster.temp.dir does not exist/is not available in
1.x and is irrelevant in 2.x. It seems to be a legacy property that is
no longer utilized.

On Wed, Dec 19, 2012 at 12:15 AM, anil gupta <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> On my worker nodes I have 10 drives. To balance disk I/O, I want to
> distribute the disk read/write load evenly across them. "hadoop.tmp.dir"
> is used for a lot of things in MR:
>
> mapreduce.cluster.local.dir
>   Default: ${hadoop.tmp.dir}/mapred/local
>   The local directory where MapReduce stores intermediate data files.
>   May be a comma-separated list of directories on different devices in
>   order to spread disk i/o. Directories that do not exist are ignored.
>
> mapreduce.jobtracker.system.dir
>   Default: ${hadoop.tmp.dir}/mapred/system
>   The directory where MapReduce stores control files.
>
> mapreduce.jobtracker.staging.root.dir
>   Default: ${hadoop.tmp.dir}/mapred/staging
>   The root of the staging area for users' job files. In practice, this
>   should be the directory where users' home directories are located
>   (usually /user).
>
> mapreduce.cluster.temp.dir
>   Default: ${hadoop.tmp.dir}/mapred/temp
>   A shared directory for temporary files.
>
> I am aware that mapreduce.cluster.local.dir can be multivalued and that I
> can set this property explicitly, but I was wondering whether it would be
> even better to set multiple values in the hadoop.tmp.dir property. Also, is
> the mapreduce.cluster.temp.dir property multivalued or single-valued?
>
> --
> Thanks & Regards,
> Anil Gupta

--
Harsh J