Re: How to set "hadoop.tmp.dir" if I have multiple disks per node?
Yes, hadoop.tmp.dir is both local and HDFS.
2013/12/17 Raviteja Chirala <[EMAIL PROTECTED]>

> If I am not wrong, hadoop.tmp.dir applies to both the local filesystem and
> HDFS. Whatever the mount directory is, create the same path in HDFS.
>
>
> On Mon, Dec 16, 2013 at 5:05 PM, Tao Xiao <[EMAIL PROTECTED]> wrote:
>
>> Thanks very much, I suppose I know what I should do with
>>
>>
>> On Mon, Dec 16, 2013 at 5:27 PM, Vinayakumar B <[EMAIL PROTECTED]> wrote:
>>
>>>  Hi,
>>>
>>>
>>>
>>> *hadoop.tmp.dir* is not the configuration you are looking for if you
>>> want to spread disk I/O.
>>>
>>>
>>>
>>> It is the default base directory (a single directory, not multiple)
>>> used when you have not configured your own directories for processes such
>>> as the NameNode, DataNode and NodeManager.
>>>
>>>
>>>
>>> The properties that take comma-separated lists of directories are as
>>> follows.
>>>
>>> 1. *dfs.namenode.name.dir* for the NameNode, in *hdfs-site.xml*
>>>
>>> 2. *dfs.datanode.data.dir* for the DataNode, in *hdfs-site.xml*
>>>
>>> 3. *yarn.nodemanager.local-dirs* for the NodeManager, in *yarn-site.xml*
>>>
>>>
>>>
>>> Please note that all of the above configurations are for Hadoop 2.x.
>>>
>>>
>>>
>>> Configure different subdirectories if you are using the same disk for
>>> multiple processes, for example:
>>>
>>>     /hadoop/data1/dfs/data
>>>
>>> and
>>>
>>>     /hadoop/data1/yarn/nm-local-dir
>>>
>>>
>>>
>>>
>>>
>>> Cheers,
>>>
>>> Vinayakumar B
>>>
>>> *From:* Tao Xiao [mailto:[EMAIL PROTECTED]]
>>> *Sent:* 16 December 2013 14:42
>>> *To:* [EMAIL PROTECTED]
>>> *Subject:* Re: How to set "hadoop.tmp.dir" if I have multiple disks per
>>> node?
>>>
>>>
>>>
>>> Thanks.
>>>
>>> In order to spread I/O among multiple disks, should I assign a
>>> comma-separated list of directories which are located on different disks to
>>> "hadoop.tmp.dir"?
>>>
>>> For example:
>>>
>>> <property>
>>>   <name>hadoop.tmp.dir</name>
>>>   <value>/mnt/disk1/hadoop_tmp_dir,/mnt/disk2/hadoop_tmp_dir,/mnt/disk3/hadoop_tmp_dir</value>
>>> </property>
>>>
>>>
>>>
>>> 2013/12/16 Shekhar Sharma <[EMAIL PROTECTED]>
>>>
>>> hadoop.tmp.dir is a directory created on the local file system.
>>> For example, suppose you have set the hadoop.tmp.dir property to
>>> /home/training/hadoop.
>>>
>>> This directory will be created when you format the NameNode by running
>>> the command
>>> hadoop namenode -format
>>>
>>> When you open this folder, you will see two subfolders, dfs and mapred.
>>>
>>> The /home/training/hadoop/mapred folder will also exist on HDFS.
>>>
>>> Hope this clears it up.
>>> Regards,
>>> Som Shekhar Sharma
>>> +91-8197243810
>>>
>>>
>>>
>>> On Mon, Dec 16, 2013 at 1:42 PM, Dieter De Witte <[EMAIL PROTECTED]>
>>> wrote:
>>> > Hi,
>>> >
>>> > Make sure to also set mapred.local.dir to the same set of output
>>> > directories; this is where the intermediate key-value pairs are stored!
>>> >
>>> > Regards, Dieter
>>> >
>>> >
>>> > 2013/12/16 Tao Xiao <[EMAIL PROTECTED]>
>>> >>
>>> >> I have ten disks per node, and I don't know what value I should set for
>>> >> "hadoop.tmp.dir". Some said this property refers to a location on the
>>> >> local disk, while others said it refers to a directory in HDFS. I'm
>>> >> confused. Who can explain it?
>>> >>
>>> >> I want to spread I/O since I have ten disks per node, so should I set
>>> >> "hadoop.tmp.dir" to a comma-separated list of directories (which are on
>>> >> different disks)?
>>> >
>>> >
>>>
>>>
>>>
>>
>>
>
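
For anyone landing on this thread later, here is a minimal sketch of the Hadoop 2.x settings Vinayakumar describes above. The mount points /mnt/disk1 through /mnt/disk3 are borrowed from Tao's example, and the subdirectory names follow the /hadoop/data1/dfs/data pattern; both are assumptions for illustration, so substitute your own mounts (all ten of them, in Tao's case). hadoop.tmp.dir itself stays a single directory.

```xml
<!-- hdfs-site.xml: illustrative paths, one entry per physical disk -->
<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/mnt/disk1/dfs/data,/mnt/disk2/dfs/data,/mnt/disk3/dfs/data</value>
  </property>
</configuration>
```

```xml
<!-- yarn-site.xml: a separate subdirectory on the same (assumed) disks -->
<configuration>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/mnt/disk1/yarn/nm-local-dir,/mnt/disk2/yarn/nm-local-dir,/mnt/disk3/yarn/nm-local-dir</value>
  </property>
</configuration>
```

One caveat on the third property in the list: giving dfs.namenode.name.dir several directories makes the NameNode write a full copy of its metadata to each of them, so there it buys redundancy rather than spreading I/O. The DataNode and NodeManager lists are the ones that distribute load across disks.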