HBase >> mail # user >> one of our datanodes stops working after few hours


Re: one of our datanodes stops working after few hours
I will try, thanks.  I have not run NFS since 1998 :).

-Jack
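
For readers following along, the clienttrace suggestion quoted below might look roughly like this in the DataNode's log4j.properties. This is a sketch, not a confirmed fix: it assumes the logger is named after the DataNode class with a ".clienttrace" suffix (as in the 0.20-era source), and that raising its level to WARN is enough to silence the per-request INFO lines.

```properties
# Assumption: clienttrace messages are logged at INFO under this logger name.
# Raising the level to WARN should suppress them without touching other DN logging.
log4j.logger.org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace=WARN
```

The DataNode normally needs a restart to pick up the change.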

On Mon, May 2, 2011 at 10:10 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
> Hi Jack,
>
> Try turning off your clienttrace logs in the DN log4j.properties, perhaps?
>
> By any chance do you log to NFS?
>
> Your blocked threads all seem to be waiting on appends to log4j.
>
> -Todd
>
> On Mon, May 2, 2011 at 7:29 PM, Jack Levin <[EMAIL PROTECTED]> wrote:
>
>> As requested:
>>
>> http://pastebin.com/aySaTADp
>>
>> Note the blocked threads.
>>
>> -Jack
>>
>> On Mon, May 2, 2011 at 2:39 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]>
>> wrote:
>> > I think Todd was asking to have a jstack without yourkit, so it
>> > shouldn't be an issue for you :)
>> >
>> > J-D
>> >
>> > On Mon, May 2, 2011 at 1:56 PM, Jack Levin <[EMAIL PROTECTED]> wrote:
>> >> my yourkit version expired :)... but here is the jstack when it
>> >> happens: http://pastebin.com/5v6mHg3t
>> >>
>> >> On Mon, May 2, 2011 at 1:00 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
>> >>> On Mon, May 2, 2011 at 12:56 PM, Jack Levin <[EMAIL PROTECTED]> wrote:
>> >>>
>> >>>> Tried removing YourKit and running on the Sun JVM, same thing.  We have
>> >>>> some threads blocked; does anyone know what they block on?
>> >>>>
>> >>>
>> >>> Which threads are blocked? Can you get some jstacks without yourkit?
>> >>>
>> >>> -Todd
>> >>>
>> >>>
>> >>>>
>> >>>> -Jack
>> >>>>
>> >>>> On Mon, May 2, 2011 at 7:53 AM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
>> >>>> > Hi Jack,
>> >>>> >
>> >>>> > Does this happen even if you aren't running Yourkit on the DN?
>> >>>> >
>> >>>> > Can you try using a Sun JDK instead of OpenJDK?
>> >>>> >
>> >>>> > -Todd
>> >>>> >
>> >>>> > On Sun, May 1, 2011 at 7:34 PM, Jack Levin <[EMAIL PROTECTED]> wrote:
>> >>>> >
>> >>>> >> Version:         0.20.2+320 hdfs
>> >>>> >> .89 HBASE
>> >>>> >>
>> >>>> >> ulimit is 32k
>> >>>> >> xcievers is 5k
>> >>>> >>
>> >>>> >> Note from the jstack, I am not exceeding xcievers.
>> >>>> >>
>> >>>> >> -Jack
>> >>>> >>
>> >>>> >> On Sun, May 1, 2011 at 6:19 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
>> >>>> >> >
>> >>>> >> >
>> >>>> >> > What's your xceivers set to?
>> >>>> >> > What's the ulimit -n set for the hdfs/hadoop user... (You didn't say
>> >>>> >> > which release/version you were using.)
>> >>>> >> >
>> >>>> >> >> Date: Sun, 1 May 2011 17:47:18 -0700
>> >>>> >> >> Subject: one of our datanodes stops working after few hours
>> >>>> >> >> From: [EMAIL PROTECTED]
>> >>>> >> >> To: [EMAIL PROTECTED]
>> >>>> >> >>
>> >>>> >> >> I took a jstack (http://pastebin.com/5v6mHg3t).  After a few hours,
>> >>>> >> >> it literally staggers to a halt and gets very, very slow... Any ideas
>> >>>> >> >> what it's blocking on?
>> >>>> >> >> (The main issue is that fsreads for RS get really slow when that
>> >>>> >> >> happens.)
>> >>>> >> >>
>> >>>> >> >> -Jack
>> >>>> >> >
>> >>>> >>
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> > --
>> >>>> > Todd Lipcon
>> >>>> > Software Engineer, Cloudera
>> >>>> >
>> >>>>
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Todd Lipcon
>> >>> Software Engineer, Cloudera
>> >>>
>> >>
>> >
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
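
The thread's diagnosis comes from reading a jstack dump and noticing that the blocked threads share the same top frame. A minimal sketch of that check in Python follows. It assumes the usual HotSpot jstack text layout (a quoted thread name, a `java.lang.Thread.State:` line, then `at ...` frames). The function name and the sample dump are made up for illustration and are not taken from the pastebin links above.

```python
import re
from collections import Counter

# Assumed HotSpot jstack layout: a header line starting with the quoted
# thread name, a "java.lang.Thread.State: X" line, then "at ..." frames.
THREAD_HEADER = re.compile(r'^"[^"]+"')
STATE_LINE = re.compile(r'^\s*java\.lang\.Thread\.State:\s+(?P<state>\w+)')
FRAME_LINE = re.compile(r'^\s*at\s+(?P<frame>\S+)\(')


def blocked_top_frames(dump):
    """Count BLOCKED threads, keyed by the method at the top of their stack."""
    counts = Counter()
    need_frame = False  # True once we have seen BLOCKED and await the first frame
    for line in dump.splitlines():
        if THREAD_HEADER.match(line):
            need_frame = False  # a new thread starts; reset
            continue
        state = STATE_LINE.match(line)
        if state:
            need_frame = state.group("state") == "BLOCKED"
            continue
        frame = FRAME_LINE.match(line)
        if frame and need_frame:
            counts[frame.group("frame")] += 1
            need_frame = False  # only the top frame is counted
    return counts


# Hypothetical dump, invented for this example (ids and line numbers are fake).
SAMPLE_DUMP = '''\
"DataXceiver-1" daemon prio=10 tid=0x01 nid=0x11 waiting for monitor entry [0x01]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.log4j.Category.callAppenders(Category.java:204)
    - waiting to lock <0x0001> (a org.apache.log4j.spi.RootLogger)
    at org.apache.log4j.Category.forcedLog(Category.java:391)

"DataXceiver-2" daemon prio=10 tid=0x02 nid=0x12 waiting for monitor entry [0x02]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.log4j.Category.callAppenders(Category.java:204)

"main" prio=10 tid=0x03 nid=0x13 runnable [0x03]
   java.lang.Thread.State: RUNNABLE
    at java.net.PlainSocketImpl.socketAccept(Native Method)
'''

if __name__ == "__main__":
    for frame, n in blocked_top_frames(SAMPLE_DUMP).most_common():
        print(n, frame)
```

Run against a saved dump (for example the output of `jstack <pid>` redirected to a file), a single frame that dominates the count points at the shared lock, which in this thread was the log4j appender.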