MapReduce, mail # user - Re: disk used percentage is not symmetric on datanodes (balancer)


Re: disk used percentage is not symmetric on datanodes (balancer)
Jamal B 2013-03-25, 01:06
dfs.datanode.du.reserved

You could tweak that param on the smaller nodes to "force" the flow of
blocks to other nodes.  A short-term hack at best, but it should help the
situation a bit.
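For reference, a sketch of what that might look like in hdfs-site.xml on the
smaller nodes only (the property reserves the given number of bytes per volume
for non-DFS use; the 100 GB figure here is just an illustration):

```xml
<!-- hdfs-site.xml on the smaller datanodes only -->
<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- bytes per volume reserved for non-DFS use; 100 GB shown as an example -->
  <value>107374182400</value>
</property>
```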
On Mar 24, 2013 7:09 PM, "Tapas Sarangi" <[EMAIL PROTECTED]> wrote:

>
> On Mar 24, 2013, at 4:34 PM, Jamal B <[EMAIL PROTECTED]> wrote:
>
> It shouldn't cause further problems since most of your small nodes are
> already at their capacity.  You could set or increase the dfs reserved
> property on your smaller nodes to force the flow of blocks onto the larger
> nodes.
>
>
> Thanks.  Can you please specify which dfs properties we can set or modify
> to force the flow of blocks towards the larger nodes rather than the
> smaller nodes ?
>
> -----
>
>
>
>
>
>
> On Mar 24, 2013 4:45 PM, "Tapas Sarangi" <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> Thanks for the idea, I will give this a try and report back.
>>
>> My worry is if we decommission a small node (one at a time), will it move
>> the data to larger nodes or choke other smaller nodes ? In principle it
>> should distribute the blocks; the point is that it is not distributing the
>> way we expect it to, so do you think this may cause further problems ?
>>
>> ---------
>>
>> On Mar 24, 2013, at 3:37 PM, Jamal B <[EMAIL PROTECTED]> wrote:
>>
>> Then I think the only way around this would be to decommission the smaller
>> nodes one at a time, and ensure that the blocks are moved to the larger
>> nodes.
>>
>> And once complete, bring back in the smaller nodes, but maybe only after
>> you tweak the rack topology to match your disk layout more than network
>> layout to compensate for the unbalanced nodes.
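The decommission-one-at-a-time approach above can be sketched roughly as
follows (the hostname and excludes path are assumptions; dfs.hosts.exclude in
hdfs-site.xml must already point at the excludes file):

```shell
# Add one small node to the excludes file read by the namenode.
# (mktemp stands in here for the real path, e.g. /etc/hadoop/conf/dfs.exclude)
EXCLUDES=$(mktemp)
echo "small-node-01.example.com" >> "$EXCLUDES"
cat "$EXCLUDES"

# Then, on the namenode, re-read the exclude list and start draining the node:
#   hadoop dfsadmin -refreshNodes
# Poll until the node shows "Decommissioned" before doing the next one:
#   hadoop dfsadmin -report
```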
>>
>>
>> Just my 2 cents
>>
>>
>> On Sun, Mar 24, 2013 at 4:31 PM, Tapas Sarangi <[EMAIL PROTECTED]> wrote:
>>
>>> Thanks. We have a 1-1 configuration of drives and folder in all the
>>> datanodes.
>>>
>>> -Tapas
>>>
>>> On Mar 24, 2013, at 3:29 PM, Jamal B <[EMAIL PROTECTED]> wrote:
>>>
>>> On both types of nodes, what is your dfs.data.dir set to? Does it
>>> specify multiple folders on the same set of drives or is it 1-1 between
>>> folder and drive?  If it's set to multiple folders on the same drives, it
>>> is probably multiplying the amount of "available capacity" incorrectly in
>>> that it assumes a 1-1 relationship between folder and total capacity of the
>>> drive.
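If it helps, a sketch of the 1-1 layout being described, with one dfs.data.dir
entry per physical drive (the mount paths are illustrative, not from the
thread):

```xml
<!-- hdfs-site.xml: one data directory per physical drive (paths illustrative) -->
<property>
  <name>dfs.data.dir</name>
  <value>/data/disk1/dfs,/data/disk2/dfs,/data/disk3/dfs</value>
</property>
```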
>>>
>>>
>>> On Sun, Mar 24, 2013 at 3:01 PM, Tapas Sarangi <[EMAIL PROTECTED]> wrote:
>>>
>>>> Yes, thanks for pointing that out, but I already know that it completes
>>>> the balancing before exiting; otherwise it shouldn't exit.
>>>> Your answer doesn't solve the problem I mentioned earlier in my
>>>> message: 'hdfs' is stalling and hadoop is not writing unless space is
>>>> cleared up from the cluster, even though "df" shows the cluster has about
>>>> 500 TB of free space.
>>>>
>>>> -------
>>>>
>>>>
>>>> On Mar 24, 2013, at 1:54 PM, Balaji Narayanan (பாலாஜி நாராயணன்) <
>>>> [EMAIL PROTECTED]> wrote:
>>>>
>>>>  -setBalancerBandwidth <bandwidth in bytes per second>
>>>>
>>>> So the value is in bytes per second. If it is running and exiting, it
>>>> means it has completed the balancing.
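Since the unit is bytes per second, a quick sanity check on the numbers in
this thread (assuming a typical 1 Gbit/s NIC; the 100 MB/s ceiling below is
only an example, not a recommendation from the thread):

```shell
# dfs.balance.bandwidthPerSec is in bytes per second, so 2x10^9 really is
# about 2 GB/s per datanode -- well above what a 1 Gbit/s NIC (~125 MB/s)
# can actually move. A more conservative ceiling, e.g. 100 MB/s:
BANDWIDTH=$((100 * 1024 * 1024))
echo "$BANDWIDTH"   # 104857600

# The dfsadmin command quoted above applies it without restarting datanodes:
#   hadoop dfsadmin -setBalancerBandwidth "$BANDWIDTH"
```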
>>>>
>>>>
>>>> On 24 March 2013 11:32, Tapas Sarangi <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Yes, we are running the balancer, though a balancer process runs for
>>>>> almost a day or more before exiting and starting over.
>>>>> The current dfs.balance.bandwidthPerSec value is set to 2x10^9. I assume
>>>>> that's bytes, so about 2 gigabytes/sec. Shouldn't that be reasonable ? If
>>>>> it is in bits then we have a problem.
>>>>> What's the unit for "dfs.balance.bandwidthPerSec" ?
>>>>>
>>>>> -----
>>>>>
>>>>> On Mar 24, 2013, at 1:23 PM, Balaji Narayanan (பாலாஜி நாராயணன்) <
>>>>> [EMAIL PROTECTED]> wrote:
>>>>>
>>>>> Are you running balancer? If balancer is running and if it is slow,
>>>>> try increasing the balancer bandwidth
>>>>>
>>>>>
>>>>> On 24 March 2013 09:21, Tapas Sarangi <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Thanks for the follow up. I don't know whether attachment will pass