Hadoop >> mail # user >> High load on datanode startup


Re: High load on datanode startup
I would wait for that number to go down to 0

That could be a reason for your CPU utilization
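
The advice above (re-run fsck and wait for the under-replicated count to reach 0) can be sketched as a small polling script. This is only an illustration, not something posted in the thread: the helper names, the 60-second poll interval, and the poll limit are assumptions, and the parser relies solely on the "Under-replicated blocks:" summary line format shown in the fsck output quoted below.

```python
import re
import subprocess
import time

# Matches the fsck summary line, e.g.
# " Under-replicated blocks:       353 (0.86001074 %)"
_UNDER_REPLICATED = re.compile(r"Under-replicated blocks:\s+(\d+)")


def under_replicated_count(fsck_output: str) -> int:
    """Extract the under-replicated block count from fsck summary text."""
    match = _UNDER_REPLICATED.search(fsck_output)
    if match is None:
        raise ValueError("no 'Under-replicated blocks' line in fsck output")
    return int(match.group(1))


def run_fsck(path: str = "/") -> str:
    """Run the command used in this thread (hadoop fsck /), return stdout."""
    result = subprocess.run(
        ["hadoop", "fsck", path], capture_output=True, text=True, check=False
    )
    return result.stdout


def wait_for_replication(fetch=run_fsck, poll_seconds=60.0, max_polls=120,
                         sleep=time.sleep) -> bool:
    """Poll fsck until the count reaches 0; return False if polls run out.

    poll_seconds and max_polls are arbitrary illustrative defaults.
    """
    for attempt in range(max_polls):
        remaining = under_replicated_count(fetch())
        print(f"under-replicated blocks: {remaining}")
        if remaining == 0:
            return True
        if attempt + 1 < max_polls:
            sleep(poll_seconds)
    return False
```

Calling `wait_for_replication()` on a machine with the `hadoop` client on its PATH would print the count on each poll. The `fetch` and `sleep` parameters exist so the loop can be exercised against canned output without a cluster.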

Regards,
Serge
On 5/9/12 2:27 PM, "Darrell Taylor" <[EMAIL PROTECTED]> wrote:

>On Wed, May 9, 2012 at 10:00 PM, Serge Blazhiyevskyy <
>[EMAIL PROTECTED]> wrote:
>
>> Looks like you have some under-replicated blocks. Does that number
>> decrease if you run fsck multiple times?
>>
>
>Yes, since my last post it's now down to 353....
>
>Status: HEALTHY
> Total size:    246983628437 B (Total open files size: 372 B)
> Total dirs:    15172
> Total files:   39637 (Files currently being written: 7)
> Total blocks (validated):      41046 (avg. block size 6017239 B) (Total open file blocks (not validated): 6)
> Minimally replicated blocks:   41046 (100.0 %)
> Over-replicated blocks:        0 (0.0 %)
> Under-replicated blocks:       353 (0.86001074 %)
> Mis-replicated blocks:         0 (0.0 %)
> Default replication factor:    3
> Average block replication:     3.016981
> Corrupt blocks:                0
> Missing replicas:              1774 (1.4325514 %)
> Number of data-nodes:          5
> Number of racks:               1
>FSCK ended at Wed May 09 21:26:40 UTC 2012 in 904 milliseconds
>
>
>
>
>>
>>
>> Regards,
>> Serge
>>
>> On 5/9/12 12:23 PM, "Darrell Taylor" <[EMAIL PROTECTED]> wrote:
>>
>> >On Wed, May 9, 2012 at 6:04 PM, Serge Blazhiyevskyy <
>> >[EMAIL PROTECTED]> wrote:
>> >
>> >>
>> >> What does the response from fsck look like?
>> >>
>> >>
>> >[snip lots of stuff about under replicated blocks]
>> >
>> >......Status: HEALTHY
>> > Total size:    246858876262 B (Total open files size: 372 B)
>> > Total dirs:    14914
>> > Total files:   39248 (Files currently being written: 4)
>> > Total blocks (validated):      40657 (avg. block size 6071743 B) (Total open file blocks (not validated): 4)
>> > Minimally replicated blocks:   40657 (100.0 %)
>> > Over-replicated blocks:        0 (0.0 %)
>> > Under-replicated blocks:       1410 (3.4680374 %)
>> > Mis-replicated blocks:         0 (0.0 %)
>> > Default replication factor:    3
>> > Average block replication:     2.9911454
>> > Corrupt blocks:                0
>> > Missing replicas:              2831 (2.3279145 %)
>> > Number of data-nodes:          5
>> > Number of racks:               1
>> >FSCK ended at Wed May 09 19:19:11 UTC 2012 in 980 milliseconds
>> >
>> >
>> >Further information to add to this: it appears to be affecting 2 nodes in
>> >the cluster, one more than the other though.  In the last couple of hours
>> >one of the nodes has also experienced high load; this has now dropped, but
>> >both of these nodes are now considered dead by the namenode.  The load on
>> >the first box is still increasing, currently 234! I think I might have to
>> >reboot it via IPMI.
>> >
>> >
>> >>
>> >> hadoop fsck /
>> >>
>> >>
>> >> It might be the case that some of the blocks are misreplicated
>> >>
>> >>
>> >> Serge
>> >>
>> >> Hadoopway.blogspot.com
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On 5/9/12 9:58 AM, "Darrell Taylor" <[EMAIL PROTECTED]> wrote:
>> >>
>> >> >On Wed, May 9, 2012 at 5:56 PM, Serge Blazhiyevskyy <
>> >> >[EMAIL PROTECTED]> wrote:
>> >> >
>> >> >> Take a look at your data distribution for that cluster. Maybe it is
>> >> >> unbalanced.
>> >> >>
>> >> >>
>> >> >> Run balancer, if it is...
>> >> >>
>> >> >
>> >> >The cluster is balanced, I ran balancer yesterday.  Oddly enough the
>> >> >problem started after I had run the balancer.
>> >> >
>> >> >I'm running CDH3 btw.
>> >> >
>> >> >
>> >> >
>> >> >>
>> >> >> Regards,
>> >> >> Serge
>> >> >>
>> >> >> hadoopway.blogspot.com
>> >> >>
>> >> >>
>> >> >>
>> >> >> On 5/9/12 9:52 AM, "Darrell Taylor" <[EMAIL PROTECTED]> wrote:
>> >> >>
>> >> >> >Hi,
>> >> >> >
>> >> >> >I wonder if someone could give some pointers with a problem I'm having?
>> >> >> >
>> >> >> >I have a 7 machine cluster setup for testing and we have been pouring data
>> >> >> >into it for a week without issue, have learnt several things along