HDFS >> mail # dev >> Re: Can I move block data directly?


Re: Can I move block data directly?
Eitan,

The block-to-host mapping isn't persisted in the metadata. This is
also why the steps include a restart, which re-triggers a block
report (and avoids gotchas) that updates the NN with the new
listing at each DN. That's also what makes this method "crude" -
you're leveraging a behavior that's not guaranteed to remain
unchanged in the future.

The balancer is the right way to go about it.
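For reference, a typical balancer invocation looks something like the
sketch below. The 10% threshold and 10 MB/s bandwidth cap are
illustrative values, not recommendations from this thread; on newer
releases the entry point is `hdfs balancer` rather than `hadoop
balancer`.

```shell
# Rebalance until each DataNode's disk usage is within 10 percentage
# points of the cluster-wide average utilization.
hadoop balancer -threshold 10

# Optionally cap the bandwidth each DataNode may use for balancing
# (value in bytes/second; 10485760 = 10 MB/s).
hadoop dfsadmin -setBalancerBandwidth 10485760
```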

On Mon, Jul 8, 2013 at 6:53 PM, Eitan Rosenfeld <[EMAIL PROTECTED]> wrote:
> Hi Azurry, I'd also like to be able to manually move blocks.
>
> One piece that is missing in your current approach is updating any
> block mappings that the cluster relies on.
> The namenode has a mapping of blocks to datanodes, and the datanode
> has, as the comments say, a "block -> stream of bytes" mapping.
>
> As I understand it, the namenode's mappings have to be updated to
> reflect the new block locations.
> The datanode might not need intervention, I'm not sure.
>
> Can anyone else chime in on those areas?
>
> The balancer that Allan suggested likely demonstrates all of the ins
> and outs of successfully completing a block transfer.
> Thus, the balancer is where I'll begin my efforts to learn how to
> manually move blocks.
>
> Any other pointers would be helpful.
>
> Thank you,
> Eitan
>
> On Mon, Jul 8, 2013 at 2:15 PM, Allan <[EMAIL PROTECTED]> wrote:
>> If the imbalance is across data nodes then you need to run the balancer.
>>
>> Sent from my iPad
>>
>> On Jul 8, 2013, at 1:15 AM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
>>
>>> Hi Dear all,
>>>
>>> There are some unbalanced data nodes in my cluster, some nodes reached more
>>> than 95% disk usage.
>>>
>>> So, can I move some block data from one node to another node directly?
>>>
>>> such as: from n1 to n2:
>>>
>>> 1) scp /data/xxxx/blk_*   n2:/data/subdir11/
>>> 2) rm -rf /data/xxxx/blk_*
>>> 3) hadoop-daemon.sh stop datanode (on n1)
>>> 4) hadoop-daemon.sh start datanode (on n1)
>>> 5) hadoop-daemon.sh stop datanode (on n2)
>>> 6) hadoop-daemon.sh start datanode (on n2)
>>>
>>> Am I right? Thanks for any inputs.

--
Harsh J
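For completeness, the manual procedure quoted above (which the thread
advises against in favor of the balancer) can be sketched as a script.
This is an assumption-laden sketch: the script name is the corrected
`hadoop-daemon.sh`, the paths are the placeholders from the original
mail, and it assumes SSH access to both nodes.

```shell
# CRUDE manual block move from n1 to n2 -- relies on behavior that is
# not guaranteed to stay unchanged; prefer the balancer instead.

# 1) Copy the block files to n2 (the blk_* glob also catches the
#    accompanying blk_*_*.meta checksum files).
scp /data/xxxx/blk_* n2:/data/subdir11/

# 2) Remove the originals from n1.
ssh n1 'rm -rf /data/xxxx/blk_*'

# 3-6) Restart both DataNodes so each sends a fresh block report,
#      letting the NameNode learn the new block locations.
ssh n1 'hadoop-daemon.sh stop datanode && hadoop-daemon.sh start datanode'
ssh n2 'hadoop-daemon.sh stop datanode && hadoop-daemon.sh start datanode'
```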