HMaster and HRegionServer going down


Re: HMaster and HRegionServer going down
The GC log can't be obtained by default; it needs some configuration. Do you have some batch reads or writes to HBase?
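
In practice that configuration means adding JVM flags to HBASE_OPTS in conf/hbase-env.sh. A minimal sketch (the flag set and log path are illustrative for a HotSpot JVM of that era, not settings taken from this cluster):

  # conf/hbase-env.sh -- illustrative sketch; the log path is an assumption
  export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/hbase/gc-hbase.log"

Once the daemons are restarted, the GC log shows pause durations that can be compared against the ZooKeeper session timeout.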

--Send from my Sony mobile.
On Jun 5, 2013 8:25 PM, "Vimal Jain" <[EMAIL PROTECTED]> wrote:

> I don't have GC logs. Do you get them by default, or do they have to be configured?
> After I came to know about the crash, I checked which processes were running using "jps".
> It displayed 4 processes: "namenode", "datanode", "secondarynamenode" and "HQuorumpeer".
> So I stopped DFS by running $HADOOP_HOME/bin/stop-dfs.sh and then stopped HBase by running $HBASE_HOME/bin/stop-hbase.sh
>
>
> On Wed, Jun 5, 2013 at 5:49 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
>
> > Do you have a GC log? And what did you do during the crash? And what are your GC options?
> >
> > For the DN error, that's generally a network issue, because the DN received an incomplete packet.
> >
> > --Send from my Sony mobile.
> > On Jun 5, 2013 8:10 PM, "Vimal Jain" <[EMAIL PROTECTED]> wrote:
> >
> > > Yes.
> > > That's true.
> > > There are some errors in all 3 logs during the same period, i.e. data node, master and region.
> > > But I am unable to deduce the exact cause of the error.
> > > Can you please help in detecting the problem?
> > >
> > > So far I am suspecting the following:
> > > I have a 1 GB heap (the default) allocated for all 3 processes, i.e. Master, Region and ZooKeeper.
> > > Both Master and Region took more time for GC (as inferred from lines in the logs like "slept more time than configured one" etc.).
> > > Due to this there was a ZooKeeper connection timeout for both Master and Region, and hence both went down.
> > >
> > > I am a newbie to HBase, so maybe my findings are not correct.
> > > I want to be 100% sure before increasing the heap space for both Master and Region (both to around 2 GB) to solve this.
> > > At present I have restarted the cluster with the default heap space only (1 GB).
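
If the GC-pause theory holds, the heap increase described above is usually done in conf/hbase-env.sh. A minimal sketch, where the 2 GB values simply mirror the numbers discussed in the message (the per-daemon HBASE_MASTER_OPTS / HBASE_REGIONSERVER_OPTS hooks are the usual place for this, but verify against your HBase version):

  # conf/hbase-env.sh -- illustrative sketch of the ~2 GB heaps discussed above
  export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xmx2g"
  export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xmx2g"

A bigger heap only helps if the pauses really come from heap pressure, which is exactly what the GC log is needed to confirm.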
> > >
> > >
> > >
> > > On Wed, Jun 5, 2013 at 5:23 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
> > >
> > > > There are errors in your data node log, and the error time matches the RS log error time.
> > > >
> > > > --Send from my Sony mobile.
> > > > On Jun 5, 2013 5:06 PM, "Vimal Jain" <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > I don't think so, as I don't find any issues in the data node logs.
> > > > > Also there are a lot of exceptions like "session expired" and "slept more than configured time". What are these?
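
Those two messages usually go together: the "slept more time than configured" warnings mean a daemon thread stalled (typically during GC), and when a stall outlasts the ZooKeeper session timeout the master or region server sees "session expired" and aborts. A quick, illustrative way to line the two up in the logs (file names and locations depend on the install):

  # illustrative; adjust the log directory for your setup
  grep -iE "slept|session expired" $HBASE_HOME/logs/*.log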
> > > > >
> > > > >
> > > > > On Wed, Jun 5, 2013 at 2:27 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > Because your data node 192.168.20.30 broke down, which led to the RS going down.
> > > > > >
> > > > > >
> > > > > > On Wed, Jun 5, 2013 at 3:19 PM, Vimal Jain <[EMAIL PROTECTED]> wrote:
> > > > > >
> > > > > > > Here are the complete logs:
> > > > > > >
> > > > > > > http://bin.cakephp.org/saved/103001 - Hregion
> > > > > > > http://bin.cakephp.org/saved/103000 - Hmaster
> > > > > > > http://bin.cakephp.org/saved/103002 - Datanode
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Jun 5, 2013 at 11:58 AM, Vimal Jain <[EMAIL PROTECTED]> wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > > I have set up HBase in pseudo-distributed mode.
> > > > > > > > It was working fine for 6 days, but suddenly this morning both the HMaster and HRegionServer processes went down.
> > > > > > > > I checked the logs of both Hadoop and HBase.
> > > > > > > > Please help here.
> > > > > > > > Here are the snippets:
> > > > > > > >
> > > > > > > > *Datanode logs:*
> > > > > > > > 2013-06-05 05:12:51,436 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_1597245478875608321_2818 java.io.EOFException: while trying to read 2347 bytes
> > > > > > > > 2013-06-05 05:12:51,442 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_1597245478875608321_2818 received exception