Re: HMaster and HRegionServer going down
Can you reproduce the problem? If yes, add the following to your hbase-env.sh:

export HBASE_MASTER_OPTS="-verbose:gc -XX:+PrintGCDateStamps
-XX:+PrintGCDetails -Xloggc:$HBASE_LOG_DIR/hmaster_gc.log
$HBASE_MASTER_OPTS"

export HBASE_REGIONSERVER_OPTS="-verbose:gc -XX:+PrintGCDateStamps
-XX:+PrintGCDetails -Xloggc:$HBASE_LOG_DIR/hregionserver_gc.log
$HBASE_REGIONSERVER_OPTS"

Then you will get a GC log. My guess is that this problem was caused by GC.
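
Once a GC log exists, one way to check the guess is to look for collections with a long wall-clock time. The following is only a rough sketch: it assumes the HotSpot -XX:+PrintGCDetails format, where each event ends with something like "[Times: user=0.05 sys=0.00, real=0.01 secs]". The function name long_gc_pauses and the 1-second default threshold are made up for illustration and are not part of HBase.

```shell
#!/bin/sh
# Print GC log lines whose wall-clock ("real") time exceeds a threshold.
# Assumption: the log was written with -XX:+PrintGCDetails, so each event
# carries a "real=<seconds>" field in its [Times: ...] section.
long_gc_pauses() {
    log_file=$1
    threshold_secs=${2:-1.0}   # hypothetical default, tune to your setup
    awk -v limit="$threshold_secs" '
        match($0, /real=[0-9.]+/) {
            # Skip the 5 characters of "real=" to get the numeric part.
            pause = substr($0, RSTART + 5, RLENGTH - 5)
            if (pause + 0 > limit + 0) print
        }
    ' "$log_file"
}
```

For example, long_gc_pauses "$HBASE_LOG_DIR/hmaster_gc.log" 5 would list events that paused the master for more than 5 seconds. A pause longer than the ZooKeeper session timeout would be consistent with the connection timeouts described further down the thread.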

On Thu, Jun 6, 2013 at 10:53 AM, Vimal Jain <[EMAIL PROTECTED]> wrote:

> Hi Azuryy/Ted,
> Can you please help here...
> On Jun 5, 2013 7:23 PM, "Kevin O'dell" <[EMAIL PROTECTED]> wrote:
>
> > No!
> >
> > Just kidding, you can unsubscribe by going to the Apache site:
> >
> > http://hbase.apache.org/mail-lists.html
> >
> >
> > On Wed, Jun 5, 2013 at 9:34 AM, Joseph Coleman <[EMAIL PROTECTED]> wrote:
> >
> > > Please remove me from this list
> > >
> > >
> > > On 6/5/13 8:32 AM, "Vimal Jain" <[EMAIL PROTECTED]> wrote:
> > >
> > > >Ok.
> > > > >I don't have any batch reads or writes to HBase.
> > > >
> > > >
> > > >On Wed, Jun 5, 2013 at 6:08 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
> > > >
> > > >> GC logs are not produced by default; they need some configuration.
> > > >> Do you have any batch reads or writes to HBase?
> > > >>
> > > >> --Send from my Sony mobile.
> > > >> On Jun 5, 2013 8:25 PM, "Vimal Jain" <[EMAIL PROTECTED]> wrote:
> > > >>
> > > >> > I don't have GC logs. Do you get them by default, or do they have to
> > > >> > be configured?
> > > >> > After I came to know about the crash, I checked which processes were
> > > >> > running using "jps".
> > > >> > It displayed 4 processes: "namenode", "datanode", "secondarynamenode"
> > > >> > and "HQuorumpeer".
> > > >> > So I stopped DFS by running $HADOOP_HOME/bin/stop-dfs.sh and then I
> > > >> > stopped HBase by running $HBASE_HOME/bin/stop-hbase.sh
> > > >> >
> > > >> >
> > > >> > On Wed, Jun 5, 2013 at 5:49 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
> > > >> >
> > > >> > > Do you have a GC log? What were you doing during the crash? And
> > > >> > > what are your GC options?
> > > >> > >
> > > >> > > As for the DN error, that's generally a network issue, because the
> > > >> > > DN received an incomplete packet.
> > > >> > >
> > > >> > > --Send from my Sony mobile.
> > > >> > > On Jun 5, 2013 8:10 PM, "Vimal Jain" <[EMAIL PROTECTED]> wrote:
> > > >> > >
> > > >> > > > Yes.
> > > >> > > > That's true.
> > > >> > > > There are some errors in all 3 logs (data, master and region)
> > > >> > > > during the same period.
> > > >> > > > But I am unable to deduce the exact cause of the error.
> > > >> > > > Can you please help in detecting the problem?
> > > >> > > >
> > > >> > > > So far I suspect the following:
> > > >> > > > I have a 1GB heap (the default) allocated for all 3 processes,
> > > >> > > > i.e. Master, Region, ZooKeeper.
> > > >> > > > Both Master and Region took more time for GC (as inferred from
> > > >> > > > lines in the logs like "slept more time than configured one" etc).
> > > >> > > > Due to this there was a ZooKeeper connection timeout for both
> > > >> > > > Master and Region, and hence both went down.
> > > >> > > >
> > > >> > > > I am a newbie to HBase, and hence my findings may not be correct.
> > > >> > > > I want to be 100% sure before increasing the heap space for both
> > > >> > > > Master and Region (both to around 2GB) to solve this.
> > > >> > > > At present I have restarted the cluster with the default heap
> > > >> > > > space only (1GB).
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > > On Wed, Jun 5, 2013 at 5:23 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
> > > >> > > >
> > > >> > > > > There are errors in your data node log, and the error times
> > > >> > > > > match the error times in the RS log.
> > > >> > > > >
> > > >> > > > > --Send from my Sony mobile.