HBase >> mail # user >> infinite loop of RS_ZK_REGION_SPLIT on .94.2


Re: infinite loop of RS_ZK_REGION_SPLIT on .94.2
Matt:
From the following we can see that region bc62a8a72124a4ba3f6b9f302587903c
cannot be found:

2012-11-02 00:00:02,909 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Handling
transition=RS_ZK_REGION_SPLIT,
server=HadoopNode162.hotpads.srv,60020,1351788248279,
region=bc62a8a72124a4ba3f6b9f302587903c
2012-11-02 00:00:02,909 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Region
bc62a8a72124a4ba3f6b9f302587903c *not found on server
HadoopNode162.hotpads*.srv,60020,1351788248279;failed
processing
2012-11-02 00:00:02,909 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region
bc62a8a72124a4ba3f6b9f302587903c from server
HadoopNode162.hotpads.srv,60020,1351788248279 but it doesn't exist anymore,
probably already processed its split

Have you run hbck to repair your cluster?

Thanks
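
With gigabytes of master log to go through, one way to see which regions keep coming back is to count the "Handling transition=..." events per region. The following is only a rough sketch: the class and method names (`SplitEventCounter`, `countByRegion`) are made up for illustration, and it assumes each log event has been joined back onto a single line (the excerpts in this thread are wrapped by the mail client).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitEventCounter {
    // Matches the AssignmentManager DEBUG line quoted above, once unwrapped.
    // Assumption: the server name contains no whitespace.
    private static final Pattern HANDLING = Pattern.compile(
        "Handling transition=(\\w+),\\s*server=(\\S+),\\s*region=([0-9a-f]+)");

    /** Counts RS_ZK_REGION_SPLIT events per encoded region name. */
    static Map<String, Integer> countByRegion(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = HANDLING.matcher(line);
            if (m.find() && "RS_ZK_REGION_SPLIT".equals(m.group(1))) {
                counts.merge(m.group(3), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample input: the line from this thread, unwrapped and repeated twice.
        String event = "2012-11-02 00:00:02,909 DEBUG "
            + "org.apache.hadoop.hbase.master.AssignmentManager: Handling "
            + "transition=RS_ZK_REGION_SPLIT, "
            + "server=HadoopNode162.hotpads.srv,60020,1351788248279, "
            + "region=bc62a8a72124a4ba3f6b9f302587903c";
        List<String> lines = Arrays.asList(event, "unrelated line", event);
        countByRegion(lines).forEach(
            (region, n) -> System.out.println(region + " " + n));
    }
}
```

A region whose count keeps climbing long after its split should have completed is a candidate for the loop discussed in this thread.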

On Sat, Nov 3, 2012 at 2:29 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:

> Here's a sample of the master's logs from yesterday.  It's not correlated
> exactly with the other pastebin log, but there's 3GB of this from
> yesterday: http://pastebin.com/wP2rNN1t
>
> I am pushing the cluster a bit by importing data, so I'm testing the split
> code harder than normal.  The regions are 500MB-1GB gzipped.  I can look into
> it more, but I'm trying to figure out what to look for.
>
> Thanks Ted,
> Matt
>
>
> On Sat, Nov 3, 2012 at 2:03 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
> > Matt:
> > This is the method which made the logging:
> >   private static int tickleNodeSplit(ZooKeeperWatcher zkw,
> >       HRegionInfo parent, HRegionInfo a, HRegionInfo b,
> >       ServerName serverName, final int znodeVersion)
> >   throws KeeperException, IOException {
> >     byte [] payload = Writables.getBytes(a, b);
> >     return ZKAssign.transitionNode(zkw, parent, serverName,
> >       EventType.RS_ZK_REGION_SPLIT, EventType.RS_ZK_REGION_SPLIT,
> >       znodeVersion, payload);
> >   }
> >
> > transitionZKNode() calls tickleNodeSplit() while waiting for the master to
> > process the split. Evidently something is preventing the master from
> > doing so.
> >
> > How large is the region?
> >
> > Can you pastebin the master log for that period of time?
> >
> > Thanks
> >
> > On Sat, Nov 3, 2012 at 1:54 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
> >
> > > We upgraded from .94.0 to .94.2 last week and have started to encounter
> > > infinite loops of region transitions on splits.  I'm not sure yet whether
> > > it affects all splits or whether it's related to load.  The workaround so
> > > far has been to restart the regionserver process.
> > >
> > > log snippet:
> > > http://pastebin.com/LpienZ7B
> > >
> > > It's repeating these two lines:
> > > 2012-11-02 01:35:33,312 DEBUG
> > > org.apache.hadoop.hbase.zookeeper.ZKAssign:
> > > regionserver:60020-0x13ab46479832b76 Attempting to transition node
> > > cf3e9bc069e1888983c06dc8e053ffcf from RS_ZK_REGION_SPLIT to
> > > RS_ZK_REGION_SPLIT
> > > 2012-11-02 01:35:33,364 DEBUG
> > > org.apache.hadoop.hbase.zookeeper.ZKAssign:
> > > regionserver:60020-0x13ab46479832b76 Successfully transitioned node
> > > cf3e9bc069e1888983c06dc8e053ffcf from RS_ZK_REGION_SPLIT to
> > > RS_ZK_REGION_SPLIT
> > >
> > > with the occasional:
> > > 2012-11-02 01:35:34,476 DEBUG
> > > org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on
> > > the master to process the split for cf3e9bc069e1888983c06dc8e053ffcf
> > >
> > > Should the region transition from RS_ZK_REGION_SPLIT to itself?  It
> > > looks wrong, but I'm not familiar with the code at all.
> > >
> > > Thanks,
> > > Matt
> > >
> >
>
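
On the question of whether a transition from RS_ZK_REGION_SPLIT to itself makes sense: the tickleNodeSplit() method quoted above passes the same event type as both the "from" and "to" state together with a znodeVersion, so one plausible reading is that the region server rewrites the node with unchanged state purely to bump its version and re-notify the master. The sketch below models that reading only; it is not HBase code. `FakeZNode`, `tickleUntilProcessed`, and the `maxAttempts` cap are hypothetical, and the master is reduced to a predicate.

```java
import java.util.function.IntPredicate;

public class TickleSketch {
    /** Hypothetical stand-in for a znode: a state name plus a version counter. */
    static final class FakeZNode {
        private String state;
        private int version;

        FakeZNode(String state) {
            this.state = state;
        }

        /** Versioned compare-and-set; returns the new version, or -1 on mismatch. */
        synchronized int transition(String from, String to, int expectedVersion) {
            if (!state.equals(from) || version != expectedVersion) {
                return -1;
            }
            state = to;
            version++; // every successful write bumps the version, even if state is unchanged
            return version;
        }

        synchronized int version() {
            return version;
        }
    }

    /**
     * Re-writes the node (SPLIT to SPLIT) until the simulated master reports the
     * split as processed or maxAttempts is reached. Returns the number of tickles.
     * The cap is an addition for this sketch, so the loop cannot spin forever.
     */
    static int tickleUntilProcessed(FakeZNode node, IntPredicate processedAfter,
                                    int maxAttempts) {
        int version = node.version();
        int tickles = 0;
        while (tickles < maxAttempts && !processedAfter.test(tickles)) {
            int next = node.transition("RS_ZK_REGION_SPLIT", "RS_ZK_REGION_SPLIT", version);
            if (next < 0) {
                throw new IllegalStateException("node changed underneath us");
            }
            version = next;
            tickles++;
        }
        return tickles;
    }

    public static void main(String[] args) {
        // A master that notices the split after three tickles.
        FakeZNode healthy = new FakeZNode("RS_ZK_REGION_SPLIT");
        int n = tickleUntilProcessed(healthy, k -> k >= 3, 10);
        System.out.println("responsive master: tickles=" + n + " version=" + healthy.version());

        // A master that never processes the split: only the cap ends the loop.
        FakeZNode stuck = new FakeZNode("RS_ZK_REGION_SPLIT");
        int m = tickleUntilProcessed(stuck, k -> false, 10);
        System.out.println("unresponsive master: tickles=" + m + " version=" + stuck.version());
    }
}
```

Under this model the self-transition itself is harmless; the repeating "Attempting to transition" / "Successfully transitioned" pairs would be a symptom of the master never completing its side of the split rather than the cause.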