MapReduce >> mail # user >> Re: disk used percentage is not symmetric on datanodes (balancer)


Re: disk used percentage is not symmetric on datanodes (balancer)
Thanks. Does this need a restart of Hadoop on the nodes where this modification is made?

-----

On Mar 24, 2013, at 8:06 PM, Jamal B <[EMAIL PROTECTED]> wrote:

> dfs.datanode.du.reserved
>
> You could tweak that param on the smaller nodes to "force" the flow of blocks to other nodes. A short-term hack at best, but it should help the situation a bit.
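
For reference, this is a per-volume setting in hdfs-site.xml on each datanode. A sketch of what the tweak might look like, with a purely illustrative value of 50 GB (not a recommendation):

```xml
<!-- hdfs-site.xml on a smaller datanode -->
<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- Non-DFS space to reserve per volume, in bytes (here ~50 GB, illustrative) -->
  <value>53687091200</value>
</property>
```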
>
> On Mar 24, 2013 7:09 PM, "Tapas Sarangi" <[EMAIL PROTECTED]> wrote:
>
> On Mar 24, 2013, at 4:34 PM, Jamal B <[EMAIL PROTECTED]> wrote:
>
>> It shouldn't cause further problems, since most of your small nodes are already at their capacity.  You could set or increase the dfs reserved property on your smaller nodes to force the flow of blocks onto the larger nodes.
>>
>>
>
> Thanks.  Can you please specify which dfs properties we can set or modify to direct the flow of blocks toward the larger nodes rather than the smaller ones?
>
> -----
>
>
>
>>
>
>
>> On Mar 24, 2013 4:45 PM, "Tapas Sarangi" <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>> Thanks for the idea, I will give this a try and report back.
>>
>> My worry is: if we decommission a small node (one at a time), will it move the data to larger nodes or choke other smaller nodes? In principle it should distribute the blocks; the point is that it is not distributing them the way we expect, so do you think this may cause further problems?
>>
>> ---------
>>
>> On Mar 24, 2013, at 3:37 PM, Jamal B <[EMAIL PROTECTED]> wrote:
>>
>>> Then I think the only way around this would be to decommission the smaller nodes, one at a time, and ensure that their blocks are moved to the larger nodes.
>>> Once complete, bring the smaller nodes back in, but maybe only after you tweak the rack topology to match your disk layout more than your network layout, to compensate for the unbalanced nodes.
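
A sketch of the decommission cycle described above, assuming a setup where dfs.hosts.exclude already points at an exclude file (the path and hostname are placeholders):

```shell
# 1. Add one small node to the namenode's exclude file (placeholder hostname)
echo "small-node-01.example.com" >> /etc/hadoop/conf/dfs.exclude

# 2. Ask the namenode to re-read its include/exclude lists
hadoop dfsadmin -refreshNodes

# 3. Watch the node's state until it reports "Decommissioned"
hadoop dfsadmin -report | grep -A 2 "small-node-01"

# 4. Once its blocks are re-replicated, remove the entry from the exclude
#    file and run -refreshNodes again to bring the node back in
```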
>>>
>>> Just my 2 cents
>>>
>>>
>>> On Sun, Mar 24, 2013 at 4:31 PM, Tapas Sarangi <[EMAIL PROTECTED]> wrote:
>>> Thanks. We have a 1-1 configuration of drives and folder in all the datanodes.
>>>
>>> -Tapas
>>>
>>> On Mar 24, 2013, at 3:29 PM, Jamal B <[EMAIL PROTECTED]> wrote:
>>>
>>>> On both types of nodes, what is your dfs.data.dir set to? Does it specify multiple folders on the same sets of drives, or is it 1-1 between folder and drive?  If it's set to multiple folders on the same drives, it is probably multiplying the amount of "available capacity" incorrectly, in that it assumes a 1-1 relationship between folder and the total capacity of the drive.
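
To illustrate the distinction, a 1-1 layout lists exactly one data directory per physical drive in hdfs-site.xml (the mount points below are hypothetical):

```xml
<property>
  <name>dfs.data.dir</name>
  <!-- One directory per physical drive (1-1). Listing several directories
       that sit on the same drive can cause that drive's capacity to be
       counted more than once. -->
  <value>/data/disk1/dfs,/data/disk2/dfs,/data/disk3/dfs</value>
</property>
```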
>>>>
>>>>
>>>> On Sun, Mar 24, 2013 at 3:01 PM, Tapas Sarangi <[EMAIL PROTECTED]> wrote:
>>>> Yes, thanks for pointing that out, but I already know that it completes the balancing before exiting; otherwise it shouldn't exit.
>>>> Your answer doesn't solve the problem I mentioned earlier in my message: 'hdfs' is stalling and Hadoop is not writing unless space is cleared up from the cluster, even though "df" shows the cluster has about 500 TB of free space.
>>>>
>>>> -------
>>>>  
>>>>
>>>> On Mar 24, 2013, at 1:54 PM, Balaji Narayanan (பாலாஜி நாராயணன்) <[EMAIL PROTECTED]> wrote:
>>>>
>>>>>  -setBalancerBandwidth <bandwidth in bytes per second>
>>>>>
>>>>> So the value is bytes per second. If it is running and exiting, it means it has completed the balancing.
>>>>>
>>>>>
>>>>> On 24 March 2013 11:32, Tapas Sarangi <[EMAIL PROTECTED]> wrote:
>>>>> Yes, we are running the balancer, though a balancer process runs for almost a day or more before exiting and starting over.
>>>>> The current dfs.balance.bandwidthPerSec value is set to 2x10^9. I assume that's bytes, so about 2 gigabytes per second. Shouldn't that be reasonable? If it is in bits, then we have a problem.
>>>>> What's the unit for "dfs.balance.bandwidthPerSec" ?
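
The property is interpreted as bytes per second (as the reply above confirms), so a quick sanity check of the arithmetic for the figure quoted in the thread:

```shell
# dfs.balance.bandwidthPerSec is in bytes per second (not bits)
bps=2000000000                     # value from the thread, 2 x 10^9
echo "$((bps / 1000000000)) GB/s"  # prints "2 GB/s" -- a generous cap per datanode
```

The same cap can also be changed at runtime with `hadoop dfsadmin -setBalancerBandwidth <bytes per second>`, as mentioned below, without editing the config file.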
>>>>>
>>>>> -----
>>>>>
>>>>> On Mar 24, 2013, at 1:23 PM, Balaji Narayanan (பாலாஜி நாராயணன்) <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Are you running the balancer? If the balancer is running and it is slow, try increasing the balancer bandwidth.