HBase >> mail # user >> RE: Follow-up to regionservers not being online - more logs included


rama krishna 2012-10-19, 15:45
Dan Brodsky 2012-10-19, 13:41
Re: Follow-up to regionservers not being online - more logs included
Can you attach the Master logs as well? It looks like the ROOT region
assignment failed, which seems to be the first problem.

Regards
Ram

On Fri, Oct 19, 2012 at 7:11 PM, Dan Brodsky <[EMAIL PROTECTED]> wrote:

> I'm still having several issues with my cluster. This used to all
> work, and there have been no recent configuration changes.
>
> To recap, the Master and regionservers all appear to start successfully,
> but several regionservers do not show as online on the HBase master
> status page. Moreover, a number of regions are stuck in transition and
> never open. Of the 5 regionservers currently on the status page, only
> two have numberOfOnlineRegions > 0.
>
> Log file snippets:
>
> First, the ZooKeeper dump from the master status web page shows
> that some of the regionservers have connected to ZK, but they still
> don't show as being online. Note that the IP ending in 217 is the
> HBase master, and the ones ending in 31-40 are RSs 1-10 respectively:
> http://paste.ee/p/JAUfJ
>
> This is the log file for one of the regionservers that did not come
> online, showing not much of anything, I'm afraid:
> http://paste.ee/p/KHgOP
>
> In one of the RegionServers that did come online, I'm seeing this
> error repeat over and over (several of the RS_ZK_REGION_OPENING debug
> statements precede the error): http://paste.ee/p/lbiTN
>
> ZooKeeper log for one of the ZK nodes. Nothing remarkable here; the
> nodes connect successfully, and there is a repeated opening and closing
> of a session with the HBase master (which is also a ZK quorum peer):
> http://paste.ee/p/zjSCO
>
> The master log doesn't show much. A lot of this:
>
> CatalogTracker: Failed verification of .META.,,1 at
> address=dn-4,60020,1350563250999;
> org.apache.hadoop.hbase.NotServingRegionException:
> org.apache.hadoop.hbase.NotServingRegionException: Region is not
> online: .META.,,1
>
> But then it does find .META. and open it on a different RS:
>
> 2012-10-19 12:59:21,480 INFO
> org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling
> OPENED event for .META.,,1.1028785192 from dn-3,60020,1350651496690;
> deleting unassigned node
> 2012-10-19 12:59:21,482 INFO
> org.apache.hadoop.hbase.master.AssignmentManager: The master has
> opened the region .META.,,1.1028785192 that was online on
> dn-3,60020,1350651496690
> 2012-10-19 12:59:21,497 INFO org.apache.hadoop.hbase.master.HMaster:
> .META. assigned=2, rit=false, location=dn-3,60020,1350651496690
>
> The master log file goes on to show that 71 regions come online, which
> is consistent with the master status page.
>
> Thoughts?
>
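A note on reading the log lines quoted above: HBase server names take the form host,port,startcode, where the startcode is the regionserver's start time in milliseconds since the epoch. Decoding it shows when each server instance named in the log was started, which can help spot a location that refers to an older instance of a regionserver. The sketch below is only an illustration: the regular expression, the ServerName tuple, and find_server_names are hypothetical helpers and not part of HBase, and the sample text is copied from the two master log lines in the message.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import List, NamedTuple

# Assumed pattern for "host,port,startcode" with a 13-digit millisecond startcode.
SERVER_NAME = re.compile(r"(?P<host>[\w.-]+),(?P<port>\d+),(?P<startcode>\d{13})")

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


class ServerName(NamedTuple):
    """Hypothetical container for one parsed server name (not an HBase class)."""
    host: str
    port: int
    startcode: int

    def started_at(self) -> datetime:
        # Interpret the startcode as milliseconds since the Unix epoch (UTC).
        return EPOCH + timedelta(milliseconds=self.startcode)


def find_server_names(text: str) -> List[ServerName]:
    """Return every host,port,startcode triple found in the given log text."""
    return [
        ServerName(m["host"], int(m["port"]), int(m["startcode"]))
        for m in SERVER_NAME.finditer(text)
    ]


# Sample text taken from the master log lines quoted in the message.
SAMPLE = (
    "CatalogTracker: Failed verification of .META.,,1 at "
    "address=dn-4,60020,1350563250999;\n"
    ".META. assigned=2, rit=false, location=dn-3,60020,1350651496690\n"
)

names = find_server_names(SAMPLE)

if __name__ == "__main__":
    for name in names:
        print(name.host, name.port, name.started_at().isoformat())
```

Comparing the decoded start times against the log timestamps is only a hint about which server instance a location refers to; the Master logs are still needed to confirm what happened to the ROOT and .META. assignments.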
ramkrishna vasudevan 2012-10-19, 16:23