HBase, mail # user - HMaster and HRegionServer going down


Re: HMaster and HRegionServer going down
Azuryy Yu 2013-06-06, 05:23
Also, please check your NameNode log.
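One way to do that check is to pull only the WARN/ERROR/FATAL lines from the hour of the crash. This is a rough sketch, not a definitive procedure: `scan_log` is a hypothetical helper name, the log path in the usage comment is an assumption about a typical install, and it assumes the default log4j layout in which each line starts with a `yyyy-MM-dd HH:mm:ss,SSS` timestamp followed by the level.

```shell
# Hypothetical helper: print WARN/ERROR/FATAL lines from a log file
# whose timestamp starts with the given prefix (e.g. "2013-06-05 17").
scan_log() {
    log_file="$1"
    time_prefix="$2"
    # First keep lines from the requested time window, then keep only
    # the lines whose log level is WARN, ERROR, or FATAL.
    grep -E "^${time_prefix}" "$log_file" | grep -E ' (WARN|ERROR|FATAL) '
}

# Usage (the path is an assumption; adjust it to your install):
# scan_log "$HADOOP_HOME/logs/hadoop-$USER-namenode-$(hostname).log" "2013-06-05 17"
```

The same helper can be pointed at the DataNode, HMaster, and HRegionServer logs to compare what each process reported in the same window.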
On Thu, Jun 6, 2013 at 1:20 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:

> Can you reproduce the problem? If so, add the following to your
> hbase-env.sh:
>
> export HBASE_MASTER_OPTS="-verbose:gc -XX:+PrintGCDateStamps
> -XX:+PrintGCDetails -Xloggc:$HBASE_LOG_DIR/hmaster_gc.log
> $HBASE_MASTER_OPTS"
>
> export HBASE_REGIONSERVER_OPTS="-verbose:gc -XX:+PrintGCDateStamps
> -XX:+PrintGCDetails -Xloggc:$HBASE_LOG_DIR/hregionserver_gc.log
> $HBASE_REGIONSERVER_OPTS"
>
> Then you will get GC logs. My guess is that this problem was caused by GC.
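Once GC logging is enabled, long pauses can be picked out of the log with a small filter. This is a minimal sketch under stated assumptions: `long_gc_pauses` is a hypothetical helper name, the 10-second threshold in the usage comment is only an illustration, and it relies on the `[Times: user=... sys=..., real=... secs]` suffix that `-XX:+PrintGCDetails` writes on these JVMs. With a concurrent collector, not every `real=` value is a stop-the-world pause, so treat the output as a starting point.

```shell
# Hypothetical helper: print GC log lines whose wall-clock time
# ("real=N secs") exceeds a threshold given in seconds (default 1.0).
long_gc_pauses() {
    gc_log="$1"
    threshold="${2:-1.0}"
    awk -v limit="$threshold" '
        match($0, /real=[0-9.]+/) {
            # Skip the 5 characters of "real=" to get the number of seconds.
            pause = substr($0, RSTART + 5, RLENGTH - 5) + 0
            if (pause > limit) print
        }
    ' "$gc_log"
}

# Usage (threshold chosen only for illustration):
# long_gc_pauses "$HBASE_LOG_DIR/hmaster_gc.log" 10
```

Comparing the timestamps of any lines it prints with the time of the ZooKeeper session expiry in the master and region server logs is one way to confirm or rule out the GC theory.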
>
>
>
> On Thu, Jun 6, 2013 at 10:53 AM, Vimal Jain <[EMAIL PROTECTED]> wrote:
>
>> Hi Azuryy/Ted,
>> Can you please help here...
>> On Jun 5, 2013 7:23 PM, "Kevin O'dell" <[EMAIL PROTECTED]> wrote:
>>
>> > No!
>> >
>> > Just kidding, you can unsubscribe by going to the Apache site:
>> >
>> > http://hbase.apache.org/mail-lists.html
>> >
>> >
>> > On Wed, Jun 5, 2013 at 9:34 AM, Joseph Coleman <[EMAIL PROTECTED]> wrote:
>> >
>> > > Please remove me from this list
>> > >
>> > >
>> > > On 6/5/13 8:32 AM, "Vimal Jain" <[EMAIL PROTECTED]> wrote:
>> > >
>> > > >Ok.
>> > > >I don't have any batch reads/writes to HBase.
>> > > >
>> > > >
>> > > >On Wed, Jun 5, 2013 at 6:08 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
>> > > >
>> > > >> GC logs are not produced by default; they need some configuration.
>> > > >> Do you have any batch reads or writes to HBase?
>> > > >>
>> > > >> --Send from my Sony mobile.
>> > > >> On Jun 5, 2013 8:25 PM, "Vimal Jain" <[EMAIL PROTECTED]> wrote:
>> > > >>
>> > > >> > I don't have GC logs. Do you get them by default, or do they have
>> > > >> > to be configured?
>> > > >> > After I came to know about the crash, I checked which processes
>> > > >> > were running using "jps".
>> > > >> > It displayed 4 processes: "namenode", "datanode",
>> > > >> > "secondarynamenode", and "HQuorumpeer".
>> > > >> > So I stopped DFS by running $HADOOP_HOME/bin/stop-dfs.sh, and then
>> > > >> > I stopped HBase by running $HBASE_HOME/bin/stop-hbase.sh
>> > > >> >
>> > > >> >
>> > > >> > On Wed, Jun 5, 2013 at 5:49 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
>> > > >> >
>> > > >> > > Do you have a GC log? What were you doing at the time of the
>> > > >> > > crash? And what are your GC options?
>> > > >> > >
>> > > >> > > As for the DN error, that is generally a network issue, because
>> > > >> > > the DN received an incomplete packet.
>> > > >> > >
>> > > >> > > --Send from my Sony mobile.
>> > > >> > > On Jun 5, 2013 8:10 PM, "Vimal Jain" <[EMAIL PROTECTED]> wrote:
>> > > >> > >
>> > > >> > > > Yes, that's true.
>> > > >> > > > There are errors in all 3 logs (data, master, and region)
>> > > >> > > > during the same period.
>> > > >> > > > But I am unable to deduce the exact cause of the error.
>> > > >> > > > Can you please help in detecting the problem?
>> > > >> > > >
>> > > >> > > > So far I suspect the following:
>> > > >> > > > I have a 1GB heap (the default) allocated for each of the 3
>> > > >> > > > processes, i.e. Master, Region, and ZooKeeper.
>> > > >> > > > Both Master and Region took more time for GC (as inferred from
>> > > >> > > > lines in the logs like "slept more time than configured one",
>> > > >> > > > etc.).
>> > > >> > > > Because of this, the ZooKeeper connection timed out for both
>> > > >> > > > Master and Region, and hence both went down.
>> > > >> > > >
>> > > >> > > > I am a newbie to HBase, so my findings may not be correct.
>> > > >> > > > I want to be 100% sure before increasing the heap space for
>> > > >> > > > both Master and Region (to around 2GB each) to solve this.
>> > > >> > > > At present I have restarted the cluster with the default heap
>> > > >> > > > space only (1GB).
>> > > >> > > >
>> > > >> > > >
>> > > >> > > >
>> > > >> > > > On Wed, Jun 5, 2013 at 5:23 PM, Azuryy Yu <