HBase, mail # dev - Cannot locate root region


Cannot locate root region
Karthik Ranganathan 2010-01-28, 23:57
Hey guys,

I ran into some issues while testing and would like to better understand what happened. I got the following exception when I went to the web UI:

Trying to contact region server 10.129.68.204:60020 for region .META.,,1, row '', but failed after 3 attempts.
Exceptions:
org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: .META.,,1
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2254)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1837)
        at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)

From a program that reads from an HBase table:
java.lang.reflect.UndeclaredThrowableException
        at $Proxy1.getRegionInfo(Unknown Source)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:985)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:625)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:601)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:675)
<snip>

I followed up in the HMaster's log:

2010-01-28 11:21:16,148 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scan of 1 row(s) of meta region {server: 10.129.68.204:60020, regionname: .META.,,1, startKey: <>} complete
2010-01-28 11:21:16,148 INFO org.apache.hadoop.hbase.master.BaseScanner: All 1 .META. region(s) scanned
2010-01-28 11:21:34,539 DEBUG org.apache.hadoop.hbase.master.ServerManager: Received report from unknown server -- telling it to MSG_CALL_SERVER_STARTUP: 10.129.68.203,60020,1263605543210
2010-01-28 11:21:35,622 INFO org.apache.hadoop.hbase.master.ServerManager: Received start message from: hbasetest004.ash1.facebook.com,60020,1264706494600
2010-01-28 11:21:36,649 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Updated ZNode /hbase/rs/1264706494600 with data 10.129.68.203:60020
2010-01-28 11:21:40,704 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 39 on 60000, call createTable({NAME => 'test1', FAMILIES => [{NAME => 'cf1', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}) from 10.131.29.183:63308: error: org.apache.hadoop.hbase.TableExistsException: test1
org.apache.hadoop.hbase.TableExistsException: test1
        at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:792)
        at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:756)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
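
The two server names in the master log above (`10.129.68.203,60020,1263605543210` and `hbasetest004.ash1.facebook.com,60020,1264706494600`) differ in both the host part and the trailing number. A minimal sketch for pulling them apart follows. `ServerNameSketch` is a hypothetical helper, not an HBase class, and it assumes that the trailing field is the region server's start time in epoch milliseconds.

```java
import java.time.Instant;

/** Hypothetical helper (not part of HBase): splits "host,port,startcode". */
final class ServerNameSketch {
    final String host;
    final int port;
    final long startCode;

    ServerNameSketch(String host, int port, long startCode) {
        this.host = host;
        this.port = port;
        this.startCode = startCode;
    }

    /** Parses a name of the form seen in the master log, e.g. "10.129.68.203,60020,1263605543210". */
    static ServerNameSketch parse(String serverName) {
        String[] parts = serverName.split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected host,port,startcode: " + serverName);
        }
        return new ServerNameSketch(
                parts[0].trim(),
                Integer.parseInt(parts[1].trim()),
                Long.parseLong(parts[2].trim()));
    }

    /** Assumption: the start code is the server's start time in epoch milliseconds. */
    Instant startedAt() {
        return Instant.ofEpochMilli(startCode);
    }

    public static void main(String[] args) {
        // Both names are copied from the master log in this message.
        ServerNameSketch reported = parse("10.129.68.203,60020,1263605543210");
        ServerNameSketch restarted = parse("hbasetest004.ash1.facebook.com,60020,1264706494600");
        System.out.println(reported.host + " started at " + reported.startedAt());
        System.out.println(restarted.host + " started at " + restarted.startedAt());
    }
}
```

Under that assumption, the first start code decodes to 2010-01-16 and the second to 2010-01-28T19:21:34.600Z. If the logs are written in a UTC-8 timezone, the second lines up with the 11:21:34 and 11:21:35 master log entries, which would suggest the same region server re-registering under a new start code (and under a hostname instead of an IP) rather than a new machine joining.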

From an HRegionServer's logs:

2010-01-28 11:20:22,589 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=19.661453MB (20616528), Free=2377.0137MB (2492479408), Max=2396.675MB (2513095936), Counts: Blocks=0, Access=0, Hit=0, Miss=0, Evictions=0, Evicted=0, Ratios: Hit Ratio=NaN%, Miss Ratio=NaN%, Evicted/Run=NaN
2010-01-28 11:21:22,588 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=19.661453MB (20616528), Free=2377.0137MB (2492479408), Max=2396.675MB (2513095936), Counts: Blocks=0, Access=0, Hit=0, Miss=0, Evictions=0, Evicted=0, Ratios: Hit Ratio=NaN%, Miss Ratio=NaN%, Evicted/Run=NaN
2010-01-28 11:22:18,794 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_CALL_SERVER_STARTUP
The code says the following:
              case MSG_CALL_SERVER_STARTUP:
                // We the MSG_CALL_SERVER_STARTUP on startup but we can also
                // get it when the master is panicking because for instance
                // the HDFS has been yanked out from under it.  Be wary of
                // this message.

Any ideas on what is going on? The best I can come up with is a flaky DNS. Would that explain this? It happened on three of our test clusters at almost the same time. Also, what is the simplest, most graceful way to recover from this?
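
One way to test the DNS theory is to check, on each node, whether a hostname resolves to an address and whether that address resolves back to the same hostname. The sketch below uses only the standard `java.net.InetAddress` API. `DnsRoundTripSketch` and its method names are made up for illustration, and a clean result at one moment would not rule out intermittent failures.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;

/** Hypothetical diagnostic: forward and reverse DNS lookup for each host given on the command line. */
final class DnsRoundTripSketch {

    /** Lower-cases a hostname and strips a trailing dot so that two spellings compare equal. */
    static String normalize(String host) {
        String h = host.trim().toLowerCase(Locale.ROOT);
        return h.endsWith(".") ? h.substring(0, h.length() - 1) : h;
    }

    /** True when the reverse-resolved name matches the name we started from. */
    static boolean sameHost(String expected, String resolved) {
        return normalize(expected).equals(normalize(resolved));
    }

    /** Resolves host to an address, then asks for that address's canonical name. */
    static String check(String host) {
        try {
            InetAddress addr = InetAddress.getByName(host);
            // getCanonicalHostName() returns the textual IP if the reverse lookup fails.
            String reverse = addr.getCanonicalHostName();
            String verdict = sameHost(host, reverse) ? "OK" : "MISMATCH";
            return host + " -> " + addr.getHostAddress() + " -> " + reverse + " [" + verdict + "]";
        } catch (UnknownHostException e) {
            return host + " -> forward lookup failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        for (String host : args) {
            System.out.println(check(host));
        }
    }
}
```

Running it repeatedly with the region server hostnames as arguments (for example from the master node) would show whether lookups fail or flip between a name and a bare IP, as the master log's mix of `10.129.68.203` and `hbasetest004.ash1.facebook.com` might hint.
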
Thanks
Karthik