The block to host mapping isn't persisted in the metadata. This is
also the reason why the steps include a restart, which re-triggers
a block report (and avoids gotchas) that updates the NN with the new
block listing at each DN. That's also what makes this method "crude" -
you're leveraging a behavior that's not guaranteed to stay unchanged
in the future.
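If you want to see the block-to-DN mapping the NN currently holds,
fsck can print it. A quick sketch (1.x-era command form; the HDFS
path is a placeholder - substitute one of your own):

```shell
# Ask the NameNode for the block list of a file, plus the
# DataNodes each block currently lives on.
hadoop fsck /path/to/file -files -blocks -locations
```

Running it before and after the restart is an easy way to confirm
the NN picked up the new locations.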
The balancer is the right way to go about it.
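For reference, a typical invocation looks like this (flags as in the
1.x-era tooling; the threshold value is just an example - tune it for
your cluster):

```shell
# Move blocks between DNs until every node's disk utilization is
# within 5 percentage points of the cluster average. Run from any
# node that has the cluster configuration on it.
hadoop balancer -threshold 5
```

There's also start-balancer.sh, which wraps the same tool and runs it
in the background.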
On Mon, Jul 8, 2013 at 6:53 PM, Eitan Rosenfeld <[EMAIL PROTECTED]> wrote:
> Hi Azurry, I'd also like to be able to manually move blocks.
> One piece that is missing in your current approach is updating any
> block mappings that the cluster relies on.
> The namenode has a mapping of blocks to datanodes, and the datanode
> has, as the comments say, a "block -> stream of bytes" mapping.
> As I understand it, the namenode's mappings have to be updated to
> reflect the new block locations.
> The datanode might not need intervention, I'm not sure.
> Can anyone else chime in on those areas?
> The balancer that Allan suggested likely demonstrates all of the ins
> and outs of successfully completing a block transfer.
> Thus, the balancer is where I'll begin my efforts to learn how to
> manually move blocks.
> Any other pointers would be helpful.
> Thank you,
> On Mon, Jul 8, 2013 at 2:15 PM, Allan <[EMAIL PROTECTED]> wrote:
>> If the imbalance is across data nodes then you need to run the balancer.
>> Sent from my iPad
>> On Jul 8, 2013, at 1:15 AM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
>>> Hi Dear all,
>>> There are some unbalanced data nodes in my cluster, some nodes reached more
>>> than 95% disk usage.
>>> So can I move some block data from one node to another node directly?
>>> such as: from n1 to n2:
>>> 1) scp /data/xxxx/blk_* n2:/data/subdir11/
>>> 2) rm -rf /data/xxxx/blk_*
>>> 3) hadoop-daemon.sh stop datanode (on n1)
>>> 4) hadoop-daemon.sh start datanode (on n1)
>>> 5) hadoop-daemon.sh stop datanode (on n2)
>>> 6) hadoop-daemon.sh start datanode (on n2)
>>> Am I right? Thanks for any inputs.