HDFS >> mail # user >> Can hadoop.tmp.dir be multivalued?


anil gupta 2012-12-18, 18:45
Re: Can hadoop.tmp.dir be multivalued?
The purpose of hadoop.tmp.dir is what its name says: it holds actual
temporary data. To give a smoother out-of-box (OOB) experience, so that
users have little to configure when getting started, we also use it as
the base for the defaults of several required properties. That is of
course not suitable for production and is done only for the OOB experience.
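
As a rough illustration of what "base property" means here: the defaults
quoted below refer to ${hadoop.tmp.dir}, and that reference is filled in by
plain text substitution. The sketch below is a simplified stand-in for that
expansion, not Hadoop's actual code; the function name and the sample paths
are made up for the example.

```python
import re

# Matches ${some.property.name} references inside a property value.
_VAR = re.compile(r"\$\{([^}$\s]+)\}")


def expand(value, props, max_depth=20):
    """Substitute ${name} references in value using the props dict.

    Unknown names are left untouched. Substitution repeats until the
    value stops changing, or max_depth passes have run.
    """
    for _ in range(max_depth):
        expanded = _VAR.sub(lambda m: props.get(m.group(1), m.group(0)), value)
        if expanded == value:
            break
        value = expanded
    return value


default_local_dir = "${hadoop.tmp.dir}/mapred/local"

# Single-valued hadoop.tmp.dir: the derived default is one directory.
print(expand(default_local_dir, {"hadoop.tmp.dir": "/tmp/hadoop-anil"}))

# A comma-separated hadoop.tmp.dir is pasted in as-is, so only the last
# entry receives the /mapred/local suffix.
print(expand(default_local_dir, {"hadoop.tmp.dir": "/data/1/tmp,/data/2/tmp"}))
```

Because the substitution is purely textual, a list in hadoop.tmp.dir would
not turn into a clean per-disk list in the derived properties, which is one
more reason to set the local directory properties directly.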

If you wish to grant your TaskTracker or NodeManager several disks to
parallelize IO upon, override their respective local directory
configurations, and stop relying on the out-of-box hadoop.tmp.dir
default.
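
A minimal sketch of such an override, assuming ten drives mounted at the
hypothetical paths /data/1 through /data/10. mapred.local.dir is the 1.x
TaskTracker property and yarn.nodemanager.local-dirs is the 2.x NodeManager
one; the helper names and the subdirectories are invented for illustration.
The script only prints <property> elements that could be pasted into
mapred-site.xml or yarn-site.xml.

```python
from xml.sax.saxutils import escape


def local_dirs(mount_points, subdir):
    """Build a comma-separated list with one directory per mount point."""
    return ",".join(f"{m.rstrip('/')}/{subdir}" for m in mount_points)


def property_xml(name, value):
    """Render one Hadoop-style <property> element as text."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(value)}</value>\n"
        "</property>"
    )


# Hypothetical mount points for the ten drives on a worker node.
mounts = [f"/data/{i}" for i in range(1, 11)]

# Hadoop 1.x TaskTracker (mapred-site.xml).
print(property_xml("mapred.local.dir", local_dirs(mounts, "mapred/local")))

# Hadoop 2.x NodeManager (yarn-site.xml).
print(property_xml("yarn.nodemanager.local-dirs", local_dirs(mounts, "yarn/local")))
```

Only the property that matches your Hadoop version is needed; hadoop.tmp.dir
itself stays single-valued.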

Also, which version of Hadoop are you asking about? The property
mapreduce.cluster.temp.dir does not exist in 1.x and is irrelevant in
2.x. It seems to be a legacy property that is no longer used.

On Wed, Dec 19, 2012 at 12:15 AM, anil gupta <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> On my worker nodes I have 10 drives. To balance disk I/O, I want to
> distribute the disk read/write load evenly across them. "hadoop.tmp.dir"
> is used for a lot of things in MR:
>
> mapreduce.cluster.local.dir
>   Default: ${hadoop.tmp.dir}/mapred/local
>   The local directory where MapReduce stores intermediate data files.
>   May be a comma-separated list of directories on different devices in
>   order to spread disk i/o. Directories that do not exist are ignored.
>
> mapreduce.jobtracker.system.dir
>   Default: ${hadoop.tmp.dir}/mapred/system
>   The directory where MapReduce stores control files.
>
> mapreduce.jobtracker.staging.root.dir
>   Default: ${hadoop.tmp.dir}/mapred/staging
>   The root of the staging area for users' job files. In practice, this
>   should be the directory where users' home directories are located
>   (usually /user).
>
> mapreduce.cluster.temp.dir
>   Default: ${hadoop.tmp.dir}/mapred/temp
>   A shared directory for temporary files.
>
> I am aware that mapreduce.cluster.local.dir can be multivalued and that I
> can set it explicitly, but I was wondering whether it would be even better
> to set multiple values in the hadoop.tmp.dir property. Also, is the
> mapreduce.cluster.temp.dir property multivalued or single-valued?
>
> --
> Thanks & Regards,
> Anil Gupta

--
Harsh J