

balance blocks between small and bigger disks in the same datanode.
Hi All,

I looked through the FAQ, but I still have questions.
The datanodes in my production cluster are running low on space in one of the dfs.data.dir directories:
Filesystem   Size  Used  Avail  Use%  Mounted on
/dev/sda5    355G  322G    33G   91%  /hadoop1  <----
/dev/sdb1    484G  324G   161G   67%  /hadoop2
/dev/sdc1    484G  318G   167G   66%  /hadoop3
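
A minimal sketch of how the nearly full volume could be picked out automatically, assuming `df -P`-style output and mount points without spaces. The helper name `flag_full_volumes` and the 90% threshold are illustrative assumptions, not part of Hadoop.

```shell
# flag_full_volumes THRESHOLD
# Reads df -P style lines on stdin (filesystem, size, used, avail, use%, mount)
# and prints "mount use%" for every volume at or above THRESHOLD percent.
# Hypothetical helper for illustration only.
flag_full_volumes() {
    threshold=$1
    awk -v t="$threshold" '
        $5 ~ /^[0-9]+%$/ {            # skips the header row, whose 5th field is not a number
            pct = $5
            sub(/%/, "", pct)
            if (pct + 0 >= t) print $6, $5
        }'
}

# Example invocation against the three data directories from the listing:
# df -P /hadoop1 /hadoop2 /hadoop3 | flag_full_volumes 90
```

The function only reports; it does not move anything.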

/hadoop1 has had less space from the very beginning because its drive
is shared with the operating system.
I found this FAQ entry on the wiki page:
"3.12. On an individual data node, how do you balance the blocks on the disk?

Hadoop currently does not have a method by which to do this
automatically. To do this manually:

1. Take down the HDFS
2. Use the UNIX mv command to move the individual blocks and meta
   pairs from one directory to another on each host
3. Restart the HDFS"

Question on step 1, "Take down the HDFS":
does that mean the whole cluster, or just the datanode process on a
datanode/tasktracker host?

Questions on step 2:

2.1 "moving blk and meta pair."

Do the blk and meta pairs refer to files like these?

$ cd /hadoop1/data/current
$ ls -al *8816473533602921489*
-rw-rw-r-- 1 apps apps 1734467 Aug 27 21:03 blk_-8816473533602921489
-rw-rw-r-- 1 apps apps      63 Aug 27 21:03 blk_-8816473533602921489_78445781.meta

2.2 "from one directory to another on each host"

Does a blk (and meta) file from "current" have to land in the
"current" directory of another dfs.data.dir, like this?

mv /hadoop1/data/current/*8816473533602921489* /hadoop2/data/current/

Or can the destination be a directory with a different name?
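
A minimal sketch of what moving one pair could look like if done a little more carefully than a bare `mv` with a wildcard. It assumes the file naming shown above (`blk_<id>` plus `blk_<id>_<genstamp>.meta`) and that HDFS has been taken down first, as the FAQ's step 1 requires. `move_block_pair` is a hypothetical helper, and the sketch does not settle whether the destination has to be a "current" directory.

```shell
# move_block_pair SRC_DIR DEST_DIR BLOCK_ID
# Moves blk_<id> and its blk_<id>_*.meta file together and refuses to
# overwrite anything that already exists in DEST_DIR.
# Hypothetical helper; assumes the naming seen in the ls output above.
move_block_pair() {
    src=$1; dest=$2; id=$3
    blk="$src/blk_$id"
    [ -f "$blk" ] || { echo "missing block file: $blk" >&2; return 1; }
    [ -d "$dest" ] || { echo "missing destination: $dest" >&2; return 1; }
    # First pass: verify everything, so that a pair is never split.
    for f in "$blk" "$src/blk_${id}_"*.meta; do
        [ -e "$f" ] || { echo "missing meta file for $blk" >&2; return 1; }
        name=${f##*/}
        [ ! -e "$dest/$name" ] || { echo "collision: $dest/$name" >&2; return 1; }
    done
    # Second pass: move the block file and its meta file.
    for f in "$blk" "$src/blk_${id}_"*.meta; do
        mv -- "$f" "$dest/" || return 1
    done
}

# Example with the block from this post:
# move_block_pair /hadoop1/data/current /hadoop2/data/current -8816473533602921489
```

Checking both files before moving either one means a failure leaves the pair where it was.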
2.3 What about subdirXX?

Under /hadoop1/data/current/ I have:
....
....
55G subdir36
49G subdir37
.....
.....

It is very tempting to move subdir36 and subdir37 because they are huge.
Should the command look like this?

mv /hadoop1/data/current/subdir36/*  /hadoop2/data/current/subdir36/

However, /hadoop2/data/current/subdir36/ already contains
a bunch of blk (and meta) files and a bunch of subdirectories of its own,
which means that if I do the move, some names might collide?
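
A minimal sketch of a pre-check for that concern: list every name that exists in both directories before any `mv` is attempted. `list_collisions` is a hypothetical helper. It compares top-level entries only, so a nested subdirectory present on both sides is reported by name and would need the same check run inside it.

```shell
# list_collisions SRC_DIR DEST_DIR
# Prints the name of every entry (file or directory) that exists in both
# directories; prints nothing when the top-level names are disjoint.
# Hypothetical helper for illustration only.
list_collisions() {
    src=$1; dest=$2
    for f in "$src"/*; do
        [ -e "$f" ] || continue        # glob did not expand: SRC_DIR is empty
        name=${f##*/}
        if [ -e "$dest/$name" ]; then
            echo "$name"
        fi
    done
    return 0
}

# Example with the directories from this post:
# list_collisions /hadoop1/data/current/subdir36 /hadoop2/data/current/subdir36
```

An empty result means a plain `mv` of the top-level entries would not overwrite anything in the destination.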
Thanks in advance.
-P