anil gupta 2012-12-18, 19:41
Re: Can hadoop.tmp.dir be multivalued?
Hi Anil,

Answering on [EMAIL PROTECTED]
[https://groups.google.com/a/cloudera.org/forum/?fromgroups=#!forum/cdh-user]
because the answer is CDH-specific.

The MR1 property listing is documented on the MR1 Apache Hadoop docs
site, under http://archive.cloudera.com/cdh4/cdh/4/mr1/, at
http://archive.cloudera.com/cdh4/cdh/4/mr1/mapred-default.html

On Wed, Dec 19, 2012 at 1:11 AM, anil gupta <[EMAIL PROTECTED]> wrote:
> Hi Harsh,
>
> Sorry, I forgot to mention that I am using CDH 4.1 with MRv1. I got the
> mapreduce.cluster.temp.dir property from
> http://hadoop.apache.org/docs/mapreduce/current/mapred-default.html. Is that
> an incorrect source?
> Thanks for the prompt reply.
>
> ~Anil
>
>
> On Tue, Dec 18, 2012 at 11:13 AM, Harsh J <[EMAIL PROTECTED]> wrote:
>>
>> The purpose of hadoop.tmp.dir is as its name says: it is for actual
>> temporary data. To give a better out-of-box (OOB) experience, so that
>> users have little trouble configuring things to get started, we also use
>> it as a base property for several properties that are actually required.
>> This is of course not suitable for production and is done only for the
>> OOB experience.
>>
>> If you wish to give your TaskTracker or NodeManager several disks to
>> parallelize I/O across, override their respective local directory
>> configurations and stop relying on the out-of-box hadoop.tmp.dir
>> default.
>>
>> Also, which version of Hadoop is your question about? The
>> property mapreduce.cluster.temp.dir does not exist in 1.x and is
>> irrelevant in 2.x. It seems to be a legacy property that is no
>> longer used.
>>
>> On Wed, Dec 19, 2012 at 12:15 AM, anil gupta <[EMAIL PROTECTED]>
>> wrote:
>> > Hi All,
>> >
>> > On my worker nodes I have 10 drives. To balance disk I/O, I want to
>> > distribute the disk read/write load evenly across them. "hadoop.tmp.dir"
>> > is used for a lot of things in MR:
>> >
>> > Property:    mapreduce.cluster.local.dir
>> > Default:     ${hadoop.tmp.dir}/mapred/local
>> > Description: The local directory where MapReduce stores intermediate
>> >              data files. May be a comma-separated list of directories
>> >              on different devices in order to spread disk i/o.
>> >              Directories that do not exist are ignored.
>> >
>> > Property:    mapreduce.jobtracker.system.dir
>> > Default:     ${hadoop.tmp.dir}/mapred/system
>> > Description: The directory where MapReduce stores control files.
>> >
>> > Property:    mapreduce.jobtracker.staging.root.dir
>> > Default:     ${hadoop.tmp.dir}/mapred/staging
>> > Description: The root of the staging area for users' job files. In
>> >              practice, this should be the directory where users' home
>> >              directories are located (usually /user).
>> >
>> > Property:    mapreduce.cluster.temp.dir
>> > Default:     ${hadoop.tmp.dir}/mapred/temp
>> > Description: A shared directory for temporary files.
>> >
>> > I am aware that mapreduce.cluster.local.dir can be multivalued and that I
>> > can set it explicitly, but I was wondering whether it would be even
>> > better to set multiple values in the hadoop.tmp.dir property itself.
>> > Also, is the mapreduce.cluster.temp.dir property multivalued or
>> > single-valued?
>> >
>> > --
>> > Thanks & Regards,
>> > Anil Gupta
>>
>>
>>
>> --
>> Harsh J
>
>
>
>
> --
> Thanks & Regards,
> Anil Gupta

--
Harsh J
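
The advice in this thread can be illustrated with a minimal sketch. It is not from either poster. The first part mimics, in simplified form, the plain-text ${...} substitution applied to defaults such as ${hadoop.tmp.dir}/mapred/local, to show that a comma-separated hadoop.tmp.dir would attach the suffix only to the last entry. The second part builds an explicit comma-separated local directory list instead. The mount points /data/1 through /data/10, the helper names, and the property name mapred.local.dir are assumptions for illustration; check the mapred-default.html for your version for the exact property name.

```python
import re
import xml.etree.ElementTree as ET


def expand(value, props):
    """Simplified stand-in for ${name} substitution: plain text replacement.

    Unknown names are left untouched.
    """
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda m: props.get(m.group(1), m.group(0)),
        value,
    )


def local_dirs(mount_points, subdir="mapred/local"):
    """Join one local directory per mount point into a comma-separated list."""
    return ",".join(f"{m.rstrip('/')}/{subdir}" for m in mount_points)


def property_xml(name, value):
    """Render a single <property> element for a *-site.xml file."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")


# A multivalued hadoop.tmp.dir does not fan out: the suffix lands only on
# the last entry, so the first entry points at the bare tmp directory.
multivalued = expand(
    "${hadoop.tmp.dir}/mapred/local",
    {"hadoop.tmp.dir": "/data/1/tmp,/data/2/tmp"},
)
print(multivalued)

# Explicit override instead: one directory per drive (hypothetical mounts).
mounts = [f"/data/{i}" for i in range(1, 11)]
value = local_dirs(mounts)
# "mapred.local.dir" is an assumed property name; verify it for your release.
print(property_xml("mapred.local.dir", value))
```

The printed <property> element is meant to be pasted into the site configuration file that overrides the defaults, leaving hadoop.tmp.dir itself single-valued.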