-
Re: Need your help with Hadoop
Harsh J 2013-03-19, 14:57
You'd probably want to recheck the dfs.data.dir configuration on node16 (perhaps it's overriding the usual default) to see whether it includes more directories than normal. They may all be on the same disk as well: the DN counts space via du/df on each directory, so the number can grow that way.
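To illustrate the point about per-directory accounting, here is a minimal Python sketch, not DataNode code. It assumes capacity is obtained by one df-style lookup per configured directory, as described above, and compares that sum with a count that treats each filesystem once. The function names and the temporary directories are hypothetical.

```python
import os
import shutil
import tempfile


def configured_capacity(data_dirs):
    """Sum the filesystem size reported for each directory, one lookup per entry."""
    return sum(shutil.disk_usage(d).total for d in data_dirs)


def physical_capacity(data_dirs):
    """Count each underlying filesystem once, keyed by its device id."""
    sizes = {}
    for d in data_dirs:
        sizes[os.stat(d).st_dev] = shutil.disk_usage(d).total
    return sum(sizes.values())


def demo():
    # Two hypothetical data directories that live on the same filesystem.
    with tempfile.TemporaryDirectory() as root:
        dirs = [os.path.join(root, "data1"), os.path.join(root, "data2")]
        for d in dirs:
            os.mkdir(d)
        return configured_capacity(dirs), physical_capacity(dirs)


if __name__ == "__main__":
    per_dir, per_device = demo()
    print("summed per directory:", per_dir)
    print("counted per device:  ", per_device)
```

With two directories on one disk, the per-directory sum is twice the per-device figure, which is the kind of inflation that could make one node look much larger than its peers.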
Also, please direct usage questions to [EMAIL PROTECTED] community, which I've included in my response :)

On Tue, Mar 19, 2013 at 5:40 PM, 姚吉龙 <[EMAIL PROTECTED]> wrote:
> Hi
>
> I am new to the Hadoop platform and really need your help.
> We have 32 datanodes available, and we find that the Configured
> Capacity differs among these datanodes even though the hardware is the same.
> I wonder why the capacity of node16 is much bigger than the others, and
> which factor or directory mainly determines the capacity of each
> datanode.
>
> I would appreciate your kind help; this problem has puzzled me for a
> long time.
>
> BRs
> Geelong
>
> --
> From Good To Great
-- Harsh J
-
Re: Need your help with Hadoop
Harsh J 2013-03-22, 05:04
You can mount the disks you need on Linux as proper paths, mkdir some directories and make their owner the same user as the one who runs the DataNode process, and then add the paths as a comma-separated list in hdfs-site.xml's "dfs.data.dir" (or "dfs.datanode.data.dir" if using 2.x) as:
<property>
  <name>dfs.data.dir</name>
  <value>/path/to/disk/1/mount/subdir,/path/to/disk/2/mount/subdir</value>
</property>

On Fri, Mar 22, 2013 at 9:03 AM, 姚吉龙 <[EMAIL PROTECTED]> wrote:
> Hi
>
> I need to mount two disk volumes on my data path, e.g. /usr/hadoop/tem/data
> and /sda.
> How can I set this?
>
> BRs
> Geelong
>
> 2013/3/20 Harsh J <[EMAIL PROTECTED]>
>
>> Hi,
>>
>> The settings property is "dfs.data.dir" (or "dfs.datanode.data.dir") and
>> it's present in the hdfs-site.xml file at each DataNode, usually under the
>> $HADOOP_HOME/conf/ directory. Look for asymmetrical configs among
>> various DNs for that property.
>>
>> On Tue, Mar 19, 2013 at 9:09 PM, 姚吉龙 <[EMAIL PROTECTED]> wrote:
>>
>>> Thanks for your reply.
>>> I am wondering which parameter defines the capacity of a datanode, or
>>> how the capacity is calculated. I had considered your answer before, but
>>> I do not know how to modify the settings.
>>> Besides, from my point of view, the capacity is related to the disk
>>> volume, which means that the capacity is defined by the disk mounted
>>> on the file system of the Hadoop user's temp directory. However, I can't
>>> find detailed instructions about this.
>>> Why is the capacity of the other nodes about 50G?
>>> This bothers me a lot.
>>>
>>> BRs
>>> Geelong
>>>
>>> --
>>> From Good To Great
>>
>> --
>> Harsh J
>
> --
> From Good To Great
-- Harsh J
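The steps in this reply (list the directories in hdfs-site.xml, make sure each exists and belongs to the DataNode user) can be checked with a small script. The following is a rough Python sketch, assuming a POSIX system; the helper names `data_dirs` and `problems` and the sample XML are illustrative and not part of Hadoop.

```python
import os
import xml.etree.ElementTree as ET

# Property names taken from the thread: 1.x/0.20.x and 2.x respectively.
DATA_DIR_KEYS = ("dfs.data.dir", "dfs.datanode.data.dir")


def data_dirs(hdfs_site_xml):
    """Return the comma-separated data directories listed in hdfs-site.xml text."""
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() in DATA_DIR_KEYS:
            value = prop.findtext("value", "")
            return [p.strip() for p in value.split(",") if p.strip()]
    return []


def problems(path, datanode_uid):
    """List reasons a directory would be unusable by the DataNode user."""
    if not os.path.isdir(path):
        return ["missing: " + path]
    if os.stat(path).st_uid != datanode_uid:
        return ["wrong owner: " + path]
    return []


SAMPLE = """
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/path/to/disk/1/mount/subdir,/path/to/disk/2/mount/subdir</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    # Assumes the script runs as the same user as the DataNode process.
    for d in data_dirs(SAMPLE):
        print(d, problems(d, os.getuid()))
```

Running it as the DataNode user on each node gives a quick view of which configured paths are missing or owned by someone else.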
-
Re: Need your help with Hadoop
Harsh J 2013-03-22, 06:55
Use dfs.data.dir if you are on a 1.x or 0.20.x based release. Use dfs.datanode.data.dir if you are on a 2.x release. As its name suggests, the property applies to the DataNode, and hence has to be present in the hdfs-site.xml of ALL the DataNodes.
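Since the property has to be set on every DataNode, one way to spot a node that was missed or configured differently is to compare the values collected from each node. This is only a sketch in Python: the node names and paths below are made up, and how the values are gathered from each hdfs-site.xml is left out.

```python
from collections import Counter


def odd_nodes(dirs_by_node):
    """Return nodes whose data-directory list differs from the most common one."""
    normalized = {node: tuple(sorted(dirs)) for node, dirs in dirs_by_node.items()}
    if not normalized:
        return {}
    common, _ = Counter(normalized.values()).most_common(1)[0]
    return {node: list(dirs) for node, dirs in normalized.items() if dirs != common}


# Hypothetical settings collected from each DataNode's hdfs-site.xml.
CLUSTER = {
    "node15": ["/data/1/dfs"],
    "node16": ["/data/1/dfs", "/data/1/dfs-extra", "/tmp/dfs"],
    "node17": ["/data/1/dfs"],
}

if __name__ == "__main__":
    print(odd_nodes(CLUSTER))
```

A node where the property is absent would show up with an empty list, so the same check covers both a missing and a mismatched setting.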
Please also include the user list in your response, thanks! :)

On Fri, Mar 22, 2013 at 12:15 PM, 姚吉龙 <[EMAIL PROTECTED]> wrote:
> Hi
>
> Just as you said, we have changed the parameter dfs.datanode.data.dir
> in hdfs-site.xml, but it did not work. How can I make this setting work?
> Or do we need to change every hdfs.xml, including those on the namenode
> and the other datanodes?
>
> Are there any specific requirements for the paths we add to hdfs.xml?
> You can see more in the details.
>
> BRs
> Geelong

-- Harsh J