HBase >> mail # user >> HBASE -- Session Expire ?


Re: HBASE -- Session Expire ?
Jay,

You need to modify the zoo.cfg to reflect the quorum.

server.0=localhost:2888:3888 will change to something like

server.0=zk_host_1:2888:3888
server.1=zk_host_2:2888:3888
server.3=zk_host_3:2888:3888

The same config needs to be on all the zookeeper hosts.
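
One way to keep those lines identical on every host is to generate them from a single mapping. This is only a minimal sketch: `quorum_lines` is a hypothetical helper name, the host names are the placeholders from above, and it assumes each id matches the `myid` file on the corresponding ZooKeeper host.

```python
def quorum_lines(servers, peer_port=2888, election_port=3888):
    """Render zoo.cfg 'server.N=host:peer:election' lines.

    `servers` maps a ZooKeeper server id to its host name. The ids only
    need to be unique and to match the myid file on each host (assumption:
    a self-managed ensemble, as in this thread).
    """
    hosts = list(servers.values())
    if len(set(hosts)) != len(hosts):
        raise ValueError("each quorum member needs a distinct host")
    return [
        f"server.{sid}={host}:{peer_port}:{election_port}"
        for sid, host in sorted(servers.items())
    ]


# Placeholder hosts and ids taken from the example lines above.
servers = {0: "zk_host_1", 1: "zk_host_2", 3: "zk_host_3"}
lines = quorum_lines(servers)
print("\n".join(lines))
```

The printed lines can then be pasted into zoo.cfg and copied unchanged to every ZooKeeper host.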

Also, I assume it's a self-managed ZK.

Secondly, I'm seeing session timeouts between RS and ZK, which means something is preventing the RS from talking to ZK. This could happen for the following reasons:

1. The RS is loaded and is not able to communicate with ZK. This could be due to a GC pause as well. Based on what you are saying, nothing is happening on the cluster, so that should not be the case.

2. The network is acting up. It is quite possible that packets are getting dropped. I have seen that happen myself and it was really hard to debug. The NoRouteToHostExceptions hint at that. I'm seeing those in your RS logs too, although those relate to the RS not being able to talk to HDFS:

> 2012-07-03 18:47:25,161 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream 172.18.0.18:50010 java.net.NoRouteToHostException: No route to host
Do you have monitoring in place? Can you get more info on what's going on on the hosts and the network?
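
If there is no monitoring yet, a quick connectivity probe run from a region server host can at least separate "no route to host" from a refused connection or a timeout. This is a rough sketch, not a replacement for real monitoring: `check_tcp` and `report` are hypothetical helper names, and the targets you pass in (for example a ZK host on the default client port 2181, or the datanode address 172.18.0.18:50010 from the log line above) are up to you.

```python
import errno
import socket


def check_tcp(host, port, timeout=3.0):
    """Attempt one TCP connection and return a short status string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        # EHOSTUNREACH is what surfaces in Java as NoRouteToHostException.
        if exc.errno == errno.EHOSTUNREACH:
            return "no route to host"
        if exc.errno == errno.ECONNREFUSED:
            return "connection refused"
        return f"error: {exc}"


def report(targets, timeout=3.0):
    """Probe each (host, port) pair and return {(host, port): status}."""
    return {(host, port): check_tcp(host, port, timeout) for host, port in targets}
```

For example, `report([("zk_host_1", 2181), ("172.18.0.18", 50010)])` run in a loop over the 15 to 20 minutes before a session expires would show whether the failures are intermittent.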

Also, you can collocate datanodes and region servers, which is not what you have done currently.

What's the hardware config on these boxes?

-Amandeep
On Tuesday, July 3, 2012 at 8:16 PM, Jay Wilson wrote:

> First, thank you for looking at this for me.
>
> Second, the network is up. It is dedicated to the cluster and it appears
> stable.
>
> Third, I haven't modified the zoo.cfg; however, I have put it on
> pastebin. I made all my zookeeper changes in hbase-site.xml
>
> zoo.cfg -- http://pastebin.com/download.php?i=askC9VRG
> hbase-site.xml -- http://pastebin.com/download.php?i=DkLGr57G
>
> HMASTER LOG -- http://pastebin.com/download.php?i=i4U52cWf
>
> ZK (devrackA-03) -- http://pastebin.com/download.php?i=CRyQFKFF
> ZK (devrackA-04) -- http://pastebin.com/download.php?i=WAqAhjdh
> ZK (devrackA-05) -- http://pastebin.com/download.php?i=cS1Gm19x
>
> RS (devrackA-06) -- http://pastebin.com/download.php?i=XayB2HeX
> RS (devrackB-07) -- http://pastebin.com/download.php?i=RQZ45a8j
> RS (devrackB-08) -- http://pastebin.com/download.php?i=ZDZD0z7B
>
> ---
> Jay Wilson
>
> On 7/3/2012 5:23 PM, Amandeep Khurana wrote:
> > Can you put your zoo.cfg and hbase-site.xml on pastebin and put the links here? Have you verified that your network is fine?
> > Also, can you put up your RS and ZK logs too?
> >
> >
> >
> > On Tuesday, July 3, 2012 at 5:19 PM, Jay Wilson wrote:
> >
> > > I have reread the sections in the O'Reilly HBase book on cluster
> > > configuration and troubleshooting and I am still getting "session
> > > expired" after X number of minutes. X being anywhere from 15 to 20 minutes.
> > >
> > > There is 0 load on the cluster and it's using a dedicated isolated
> > > network. No jobs are running, just the Hadoop/HBase Java processes.
> > >
> > > I have separated the Hadoop and HBase processes as follows:
> > >
> > > devrackA-00 (NameNode)
> > > devrackA-01 (SecondaryNameNode)
> > > devrackA-03 (HQuorumPeer + HMaster)
> > > devrackA-04 (HQuorumPeer)
> > > devrackA-05 (HQuorumPeer)
> > > devrackA-06 (HRegionServer)
> > > devrackA-07
> > > to (DataNode)
> > > devrackA-20
> > > devrackB-00
> > > to (DataNode)
> > > devrackB-06
> > > devrackB-07 (HRegionServer)
> > > devrackB-08 (HRegionServer)
> > > devrackB-09
> > > to (DataNode)
> > > devrackB-20
> > >
> > > I did have DataNode on my HQuorumPeers and HRegionServers, but I have
> > > excluded them and verified they are excluded:
> > >
> > > Name: devrackA-03
> > > Decommission Status : Normal
> > > Configured Capacity: 0 (0 KB)
> > > DFS Used: 0 (0 KB)
> > > Non DFS Used: 0 (0 KB)
> > > DFS Remaining: 0(0 KB)
> > > DFS Used%: 100%
> > > DFS Remaining%: 0%
> > > Last contact: Wed Dec 31 16:00:00 PST 1969
> > >
> > >
> >