Re: Need your help with Hadoop
You can mount the disks you need on Linux as proper paths, mkdir some
directories and make their owner the same user as the one who runs the
DataNode process, and then add the paths as a comma-separated list in
hdfs-site.xml's "dfs.data.dir" (or "dfs.datanode.data.dir" if using 2.x) as:

<property>
  <name>dfs.data.dir</name>
  <value>/path/to/disk/1/mount/subdir,/path/to/disk/2/mount/subdir</value>
</property>
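
For example, the preparation could look like this (just a sketch; the
device names, mount points, and the "hdfs" user below are placeholders
for whatever your machines actually use):

# Mount each disk at a stable path (devices here are assumptions)
sudo mount /dev/sdb1 /path/to/disk/1/mount
sudo mount /dev/sdc1 /path/to/disk/2/mount
# Create a subdir per disk and give it to the user running the DataNode
sudo mkdir -p /path/to/disk/1/mount/subdir /path/to/disk/2/mount/subdir
sudo chown -R hdfs:hadoop /path/to/disk/1/mount/subdir /path/to/disk/2/mount/subdir
# Restart the DataNode afterwards so it picks up the new list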
On Fri, Mar 22, 2013 at 9:03 AM, 姚吉龙 <[EMAIL PROTECTED]> wrote:

> Hi
>
> I need to mount two disk volumes to my data path, e.g. /usr/hadoop/tem/data
> and /sda.
> How can I set this?
>
>
> BRs
> Geelong
>
> 2013/3/20 Harsh J <[EMAIL PROTECTED]>
>
>> Hi,
>>
>> The property is "dfs.data.dir" (or "dfs.datanode.data.dir") and it's
>> present in the hdfs-site.xml file at each DataNode, usually under the
>> $HADOOP_HOME/conf/ directory. Look for asymmetrical configs among the
>> various DNs for that property.
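>>
>> For example, to compare quickly across nodes (just a sketch; the
>> hostnames and the conf path are placeholders for your own):
>>
>> # Print the dfs.data.dir value from each DN's hdfs-site.xml
>> for h in node01 node02 node16; do
>>   echo "== $h =="
>>   ssh "$h" 'grep -A1 "dfs.data.dir" /usr/hadoop/conf/hdfs-site.xml'
>> done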
>>
>>
>> On Tue, Mar 19, 2013 at 9:09 PM, 姚吉龙 <[EMAIL PROTECTED]> wrote:
>>
>>> Thanks for your reply.
>>> I am wondering which parameter defines the capacity of a datanode, or
>>> how the capacity is calculated. I had considered your answer before,
>>> but I do not know how to modify the settings.
>>> Besides, from my point of view, the capacity should be related to the
>>> disk volume, which means it would be determined by the disk mounted at
>>> the Hadoop user's temp directory. However, I can't find detailed
>>> instructions about this.
>>> Why is the capacity of the other nodes about 50G?
>>> This bothers me a lot.
>>>
>>> BRs
>>> Geelong
>>>
>>> 2013/3/19 Harsh J <[EMAIL PROTECTED]>
>>>
>>>> You'd probably want to recheck your configuration of dfs.data.dir on
>>>> node16 (perhaps it's overriding the usual default), to see if it is
>>>> perhaps including more dirs than normal (and they may all be on the
>>>> same disks as well; the DN counts space via du/df on each directory,
>>>> so the number can grow that way).
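>>>>
>>>> For instance (a sketch; /data/1 and /data/2 stand in for whatever
>>>> node16's dfs.data.dir actually lists):
>>>>
>>>> # From any node: per-DN Configured Capacity as the NameNode reports it
>>>> hadoop dfsadmin -report
>>>> # On node16: see which filesystem each configured dir lives on --
>>>> # two dirs on the same disk make that disk's space count twice
>>>> df -h /data/1/dfs/data /data/2/dfs/data
>>>> du -sh /data/1/dfs/data /data/2/dfs/data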
>>>>
>>>> Also, please direct usage questions to the [EMAIL PROTECTED] community,
>>>> which I've included in my response :)
>>>>
>>>>
>>>> On Tue, Mar 19, 2013 at 5:40 PM, 姚吉龙 <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> I am new to the Hadoop platform, and I really need your help.
>>>>> We have 32 datanodes available, but we find that the Configured
>>>>> Capacity differs among these datanodes even though the hardware is
>>>>> the same.
>>>>> I wonder why node16's capacity is much bigger than the others', and
>>>>> which factor or directory mainly determines the capacity for each
>>>>> datanode.
>>>>>
>>>>>
>>>>> I will appreciate your kind help; this problem has puzzled me for a
>>>>> long time.
>>>>>
>>>>> BRs
>>>>> Geelong
>>>>>
>>>>> --
>>>>> From Good To Great
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>>
>>>
>>>
>>>
>>> --
>>> From Good To Great
>>>
>>
>>
>>
>> --
>> Harsh J
>>
>
>
>
> --
> From Good To Great
>

--
Harsh J