HBase, mail # user - HMaster and HRegionServer going down


Re: HMaster and HRegionServer going down
Kevin O'Dell 2013-06-05, 13:53
No!

Just kidding, you can unsubscribe by going to the Apache site:

http://hbase.apache.org/mail-lists.html

On Wed, Jun 5, 2013 at 9:34 AM, Joseph Coleman <[EMAIL PROTECTED]> wrote:

> Please remove me from this list
>
>
> On 6/5/13 8:32 AM, "Vimal Jain" <[EMAIL PROTECTED]> wrote:
>
> >Ok.
> >I don't have any batch reads/writes to HBase.
> >
> >
> >On Wed, Jun 5, 2013 at 6:08 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
> >
> >> GC logs are not produced by default; they need some configuration. Do
> >> you have any batch reads or writes to HBase?
> >>
> >> --Send from my Sony mobile.
> >> On Jun 5, 2013 8:25 PM, "Vimal Jain" <[EMAIL PROTECTED]> wrote:
> >>
> >> > I don't have GC logs. Do you get them by default, or do they have to be
> >> > configured?
> >> > After I came to know about the crash, I checked which processes were
> >> > running using "jps".
> >> > It displayed 4 processes: "namenode", "datanode", "secondarynamenode"
> >> > and "HQuorumpeer".
> >> > So I stopped DFS by running $HADOOP_HOME/bin/stop-dfs.sh and then I
> >> > stopped HBase by running $HBASE_HOME/bin/stop-hbase.sh
> >> >
> >> >
> >> > On Wed, Jun 5, 2013 at 5:49 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
> >> >
> >> > > Do you have a GC log? What did you do during the crash? And what are
> >> > > your GC options?
> >> > >
> >> > > As for the DN error, that's generally a network issue, because the DN
> >> > > received an incomplete packet.
> >> > >
> >> > > --Send from my Sony mobile.
> >> > > On Jun 5, 2013 8:10 PM, "Vimal Jain" <[EMAIL PROTECTED]> wrote:
> >> > >
> >> > > > Yes.
> >> > > > That's true.
> >> > > > There are some errors in all 3 logs during the same period, i.e.
> >> > > > data, master and region.
> >> > > > But I am unable to deduce the exact cause of the error.
> >> > > > Can you please help in detecting the problem?
> >> > > >
> >> > > > So far I suspect the following:
> >> > > > I have a 1GB heap (default) allocated for all 3 processes, i.e.
> >> > > > Master, Region, Zookeeper.
> >> > > > Both Master and Region took more time for GC (as inferred from lines
> >> > > > in the logs like "slept more time than configured one" etc.).
> >> > > > Due to this there was a ZooKeeper connection timeout for both Master
> >> > > > and Region, and hence both went down.
> >> > > >
> >> > > > I am a newbie to HBase and hence my findings may not be correct.
> >> > > > I want to be 100% sure before increasing heap space for both Master
> >> > > > and Region (both to around 2GB) to solve this.
> >> > > > At present I have restarted the cluster with the default heap space
> >> > > > only (1GB).
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Wed, Jun 5, 2013 at 5:23 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
> >> > > >
> >> > > > > There are errors in your data node log, and the error times match
> >> > > > > the error times in the RS log.
> >> > > > >
> >> > > > > --Send from my Sony mobile.
> >> > > > > On Jun 5, 2013 5:06 PM, "Vimal Jain" <[EMAIL PROTECTED]> wrote:
> >> > > > >
> >> > > > > > I don't think so, as I don't find any issues in the data node
> >> > > > > > logs.
> >> > > > > > Also there are a lot of exceptions like "session expired" and
> >> > > > > > "slept more than configured time". What are these?
> >> > > > > >
> >> > > > > >
> >> > > > > > On Wed, Jun 5, 2013 at 2:27 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
> >> > > > > >
> >> > > > > > > Because your data node 192.168.20.30 broke down, which leads
> >> > > > > > > to the RS going down.
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On Wed, Jun 5, 2013 at 3:19 PM, Vimal Jain <[EMAIL PROTECTED]> wrote:
> >> > > > > > >
> >> > > > > > > > Here is the complete log:
> >> > > > > > > >
> >> > > > > > > > http://bin.cakephp.org/saved/103001 - Hregion
> >> > > > > > > > http://bin.cakephp.org/saved/103000 - Hmaster
> >> > > > > > > > http://bin.cakephp.org/saved/103002 - Datanode
> >> > > > > > >

Kevin O'Dell
Systems Engineer, Cloudera