Hadoop >> mail # user >> High load on datanode startup


Darrell Taylor 2012-05-09, 16:52
Raj Vishwanathan 2012-05-09, 21:23
Darrell Taylor 2012-05-09, 21:40
Raj Vishwanathan 2012-05-09, 21:52
Darrell Taylor 2012-05-10, 06:57
Todd Lipcon 2012-05-10, 08:33
Darrell Taylor 2012-05-10, 10:57
Raj Vishwanathan 2012-05-10, 16:58
Darrell Taylor 2012-05-11, 09:29
Todd Lipcon 2012-05-11, 09:32
Harsh J 2012-05-11, 10:36
Serge Blazhiyevskyy 2012-05-09, 16:56
Darrell Taylor 2012-05-09, 16:58
Serge Blazhiyevskyy 2012-05-09, 17:04
Darrell Taylor 2012-05-09, 19:23
Serge Blazhiyevskyy 2012-05-09, 21:00
Darrell Taylor 2012-05-09, 21:27
Re: High load on datanode startup
I would wait for that number to go down to 0.

That could be a reason for your CPU utilization.
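
If you want to automate the wait, here is a minimal sketch. It assumes Python 3.7+ and that the `hadoop` command is on the PATH of the user running it. The function names and the 5-minute poll interval are only illustrative; they are not part of Hadoop. It re-runs `hadoop fsck /` and reads the "Under-replicated blocks" line of the summary until the count reaches 0.

```python
import re
import subprocess
import time

# Matches a summary line such as:
#   " Under-replicated blocks:       353 (0.86001074 %)"
_UNDER_REPLICATED = re.compile(r"Under-replicated blocks:\s+(\d+)")


def under_replicated_count(fsck_output):
    """Return the under-replicated block count found in fsck output, or None."""
    match = _UNDER_REPLICATED.search(fsck_output)
    return int(match.group(1)) if match else None


def wait_for_replication(poll_seconds=300):
    """Re-run `hadoop fsck /` until it reports 0 under-replicated blocks."""
    while True:
        # Assumption: `hadoop` is on PATH and this user may run fsck.
        result = subprocess.run(
            ["hadoop", "fsck", "/"], capture_output=True, text=True
        )
        count = under_replicated_count(result.stdout)
        if count is None:
            # No summary line found, e.g. fsck failed to reach the namenode.
            raise RuntimeError("could not find under-replicated count in fsck output")
        print("under-replicated blocks:", count)
        if count == 0:
            return
        time.sleep(poll_seconds)
```

Calling `wait_for_replication()` blocks until the count is 0. Note that fsck itself puts some load on the namenode, so a long poll interval is probably the safer choice.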

Regards,
Serge
On 5/9/12 2:27 PM, "Darrell Taylor" <[EMAIL PROTECTED]> wrote:

>On Wed, May 9, 2012 at 10:00 PM, Serge Blazhiyevskyy <
>[EMAIL PROTECTED]> wrote:
>
>> Looks like you have some under-replicated blocks. Does that number
>> decrease if you run fsck multiple times?
>>
>
>Yes, since my last post it's now down to 353....
>
>Status: HEALTHY
> Total size:    246983628437 B (Total open files size: 372 B)
> Total dirs:    15172
> Total files:   39637 (Files currently being written: 7)
> Total blocks (validated):      41046 (avg. block size 6017239 B) (Total open file blocks (not validated): 6)
> Minimally replicated blocks:   41046 (100.0 %)
> Over-replicated blocks:        0 (0.0 %)
> Under-replicated blocks:       353 (0.86001074 %)
> Mis-replicated blocks:         0 (0.0 %)
> Default replication factor:    3
> Average block replication:     3.016981
> Corrupt blocks:                0
> Missing replicas:              1774 (1.4325514 %)
> Number of data-nodes:          5
> Number of racks:               1
>FSCK ended at Wed May 09 21:26:40 UTC 2012 in 904 milliseconds
>
>
>
>
>>
>>
>> Regards,
>> Serge
>>
>> On 5/9/12 12:23 PM, "Darrell Taylor" <[EMAIL PROTECTED]> wrote:
>>
>> >On Wed, May 9, 2012 at 6:04 PM, Serge Blazhiyevskyy <
>> >[EMAIL PROTECTED]> wrote:
>> >
>> >>
>> >> What does the response from fsck look like?
>> >>
>> >>
>> >[snip lots of stuff about under replicated blocks]
>> >
>> >......Status: HEALTHY
>> > Total size:    246858876262 B (Total open files size: 372 B)
>> > Total dirs:    14914
>> > Total files:   39248 (Files currently being written: 4)
>> > Total blocks (validated):      40657 (avg. block size 6071743 B) (Total open file blocks (not validated): 4)
>> > Minimally replicated blocks:   40657 (100.0 %)
>> > Over-replicated blocks:        0 (0.0 %)
>> > Under-replicated blocks:       1410 (3.4680374 %)
>> > Mis-replicated blocks:         0 (0.0 %)
>> > Default replication factor:    3
>> > Average block replication:     2.9911454
>> > Corrupt blocks:                0
>> > Missing replicas:              2831 (2.3279145 %)
>> > Number of data-nodes:          5
>> > Number of racks:               1
>> >FSCK ended at Wed May 09 19:19:11 UTC 2012 in 980 milliseconds
>> >
>> >
>> >Further information to add to this: it appears to be affecting 2 nodes in
>> >the cluster, one more than the other.  In the last couple of hours one of
>> >the nodes has also experienced high load; this has now dropped, but both of
>> >these nodes are now considered dead by the namenode.  The load on the first
>> >box is still increasing, currently 234! I think I might have to reboot it
>> >via IPMI.
>> >
>> >
>> >>
>> >> hadoop fsck /
>> >>
>> >>
>> >> It might be the case that some of the blocks are misreplicated
>> >>
>> >>
>> >> Serge
>> >>
>> >> Hadoopway.blogspot.com
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On 5/9/12 9:58 AM, "Darrell Taylor" <[EMAIL PROTECTED]> wrote:
>> >>
>> >> >On Wed, May 9, 2012 at 5:56 PM, Serge Blazhiyevskyy <
>> >> >[EMAIL PROTECTED]> wrote:
>> >> >
>> >> >> Take a look at your data distribution for that cluster. Maybe it is
>> >> >> unbalanced.
>> >> >>
>> >> >>
>> >> >> Run balancer, if it is...
>> >> >>
>> >> >
>> >> >The cluster is balanced, I ran balancer yesterday.  Oddly enough the
>> >> >problem started after I had run the balancer.
>> >> >
>> >> >I'm running CDH3 btw.
>> >> >
>> >> >
>> >> >
>> >> >>
>> >> >> Regards,
>> >> >> Serge
>> >> >>
>> >> >> hadoopway.blogspot.com
>> >> >>
>> >> >>
>> >> >>
>> >> >> On 5/9/12 9:52 AM, "Darrell Taylor" <[EMAIL PROTECTED]> wrote:
>> >> >>
>> >> >> >Hi,
>> >> >> >
>> >> >> >I wonder if someone could give some pointers with a problem I'm having?
>> >> >> >
>> >> >> >I have a 7 machine cluster set up for testing and we have been pouring data
>> >> >> >into it for a week without issue, have learnt several things along