Jack Levin 2011-05-24, 22:33
Just put a new HBase version on our test cluster and have been testing it. So far, if I shut down an RS, the master does not reassign its regions, and we remain inconsistent forever. Likewise, when a new RS comes up, it does not get regions assigned to it. This is the master log:

2011-05-24 15:30:57,724 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x1302094818900a4-0x1302094818900a4 Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, path=/hbase/rs/img645.prod.imageshack.com,60020,1306276075768
2011-05-24 15:30:57,724 INFO org.apache.hadoop.hbase.zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, processing expiration [img645.prod.imageshack.com,60020,1306276075768]
2011-05-24 15:30:57,724 INFO org.apache.hadoop.hbase.zookeeper.RegionServerTracker: No HServerInfo found for img645.prod.imageshack.com,60020,1306276075768
2011-05-24 15:30:57,726 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x1302094818900a4-0x1302094818900a4 Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/rs
2011-05-24 15:31:03,330 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x1302094818900a4-0x1302094818900a4 Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/rs
2011-05-24 15:31:03,338 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x1302094818900a4-0x1302094818900a4 Retrieved 32 byte(s) of data from znode /hbase/rs/img645.prod.imageshack.com,60020,1306276262774 and set watcher; img645.prod.imageshack.com:60020
2011-05-24 15:31:03,350 INFO org.apache.hadoop.hbase.master.ServerManager: Server start rejected; we already have img645.imageshack.us:60020 registered; existingServer=serverName=img645.imageshack.us,60020,1306276075768, load=(requests=0, regions=0, usedHeap=40, maxHeap=3995), newServer=serverName=img645.imageshack.us,60020,1306276262774, load=(requests=0, regions=0, usedHeap=23, maxHeap=3995)
2011-05-24 15:31:03,350 INFO org.apache.hadoop.hbase.master.ServerManager: Triggering server recovery; existingServer img645.imageshack.us,60020,1306276075768 looks stale
2011-05-24 15:31:03,353 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=img645.imageshack.us,60020,1306276075768 to dead servers, submitted shutdown handler to be executed, root=false, meta=false
2011-05-24 15:31:03,353 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs for img645.imageshack.us,60020,1306276075768
2011-05-24 15:31:04,348 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Reassigning 0 region(s) that img645.imageshack.us,60020,1306276075768 was carrying (skipping 0 regions(s) that are already in transition)
2011-05-24 15:31:04,348 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished processing of shutdown of img645.imageshack.us,60020,1306276075768
2011-05-24 15:31:06,333 DEBUG org.apache.hadoop.hbase.master.ServerManager: Server img645.imageshack.us,60020,1306276262774 came back up, removed it from the dead servers list
2011-05-24 15:31:06,333 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=img645.imageshack.us,60020,1306276262774, regionCount=0, userLoad=false
2011-05-24 15:31:49,890 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection opening connection to ZooKeeper with ensemble (img648:2181)
2011-05-24 15:31:49,890 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=img648:2181 sessionTimeout=180000 watcher=hconnection
2011-05-24 15:31:49,891 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server img648/38.99.76.205:2181
2011-05-24 15:31:49,892 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to img648/38.99.76.205:2181, initiating session
2011-05-24 15:31:49,893 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server img648/38.99.76.205:2181, sessionid 0x13024216e690004, negotiated timeout = 180000
2011-05-24 15:31:49,894 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: hconnection Received ZooKeeper Event, type=None, state=SyncConnected, path=null
2011-05-24 15:31:49,895 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: hconnection-0x13024216e690004 connected
2011-05-24 15:31:49,896 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x13024216e690004 Set watcher on existing znode /hbase/master
2011-05-24 15:31:49,896 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x13024216e690004 Retrieved 32 byte(s) of data from znode /hbase/master and set watcher; img648.prod.imageshack.com:60000
2011-05-24 15:31:49,897 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x13024216e690004 Set watcher on existing znode /hbase/root-region-server
2011-05-24 15:31:49,897 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x13024216e690004 Retrieved 26 byte(s) of data from znode /hbase/root-region-server and set watcher; img731.imageshack.us:60020
2011-05-24 15:31:49,900 DEBUG org.apache.hadoop.hbase.client.MetaScanner: Scanning .META. starting at row= for max=2147483647 rows
2011-05-24 15:31:49,900 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@26f50154; hsa=img731.imageshack.us:60020
2011-05-24 15:31:49,913 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Cached location for .META.,,1.1028785192 is img654.imageshack.us:60020
2011-05-24 15:31:50,061 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x13024216e690004
2011-05-24 15:31:50,063 INFO org.apache.zookeeper.ZooKeeper: Session: 0x13024216e690004 closed
2011-05-24 15:31:50,063 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
Please help :)
-Jack
What's the relationship between img645.imageshack.us and img645.prod.imageshack.com?
Dave Latham 2011-05-24, 22:43
Are you using the graceful_stop script?

In 0.90.3 the bin/graceful_stop.sh script was updated to disable the master's balancer. However, it doesn't seem that anything re-enables it, so if you're using it you need to re-enable it on your own. See the book for more details: http://hbase.apache.org/book.html#decommission

Dave
Jack Levin 2011-05-24, 22:50
Looks like our balancer is on:

hbase(main):001:0> balance_switch true
true
0 row(s) in 0.3700 seconds

I simply kill the PID for the RS, and it stays on the list with regions assigned, and the master does not know about it.

So it still does not work.

-Jack
Jack Levin 2011-05-24, 23:04
img645.prod.imageshack.com and img645.imageshack.us both point to the same IP.

-Jack
Jack Levin 2011-05-24, 23:37
Figured it out... the /etc/hosts file (IP to name), which was what ended up in ZooKeeper, had *.prod.imageshack.com, while the hostname, which is used by the RegionServer/Master, was imgXX.imageshack.us. Ideally, all three components should source hostnames from the same place, whether it's hostname or /etc/hosts (or DNS), etc. It has to be consistent, otherwise aliases end up screwing things up and people will end up guessing why things don't work.

-Jack
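The mismatch described above is visible in the master log itself: the same port and start code show up under two different hostnames, once in the znode path and once in the registered serverName. A minimal sketch of a check for that pattern follows. The function name and regular expression are illustrative assumptions, not part of HBase, and the sample lines are shortened from the log at the top of this thread.

```python
import re
from collections import defaultdict

# Matches "<host>,<port>,<startcode>" as it appears in znode paths
# and serverName fields. The pattern is an assumption based on the
# log lines in this thread, not an official HBase format definition.
SERVER_NAME = re.compile(r"([A-Za-z0-9.-]+),(\d+),(\d{10,})")


def find_hostname_aliases(log_text):
    """Return {(port, startcode): [hostnames]} for every server
    instance that appears in the log under more than one hostname."""
    hosts = defaultdict(set)
    for host, port, startcode in SERVER_NAME.findall(log_text):
        hosts[(port, startcode)].add(host)
    return {key: sorted(names) for key, names in hosts.items() if len(names) > 1}


# Two shortened lines from the master log quoted in this thread.
sample = (
    "type=NodeDeleted, path=/hbase/rs/img645.prod.imageshack.com,60020,1306276075768\n"
    "existingServer=serverName=img645.imageshack.us,60020,1306276075768\n"
)

for (port, startcode), names in find_hostname_aliases(sample).items():
    print(f"port={port} startcode={startcode} seen as: {', '.join(names)}")
```

Any entry this reports is a hint that the master and the znode disagree about a server's name, which is the situation where "No HServerInfo found" appears in the log above.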
Andrew Purtell 2011-05-25, 00:45
> From: Jack Levin <[EMAIL PROTECTED]>
> figured it out... the /etc/hosts file has ip to name, was used by
> zookeeper was *.prod.imageshack.com, while hostname was
> imgXX.imageshack.us... use by Regionserver/Master - Ideally, all
> three components should source hostnames form same place, whether its
> hostname or /etc/hosts (or dns), etc... it gotta be consistent,
> otherwise aliases end up screwing things up and people will end up
> guessing why things don't work.
I suspect users will encounter this from time to time.

One of our teams encountered something like this (but with 0.20.x, so the result was much worse) before establishing better practice, i.e. using Puppet to distribute resolv.conf, nsswitch.conf, and hosts from a central location. Inconsistent reverse hostname lookups will give fits to any distributed system that uses them for naming. There's no reason not to use reverse lookup if convenient, but we should definitely call this out prominently in the book XML.
- Andy
Jack Levin 2011-05-25, 01:01
Then I recommend scrapping the use of hostname in favor of reverse lookup only.

-Jack
Jean-Daniel Cryans 2011-05-25, 01:02
Zookeeper doesn't query addresses; it's all done in HBase, which in turn stores it in ZK.

Also http://hbase.apache.org/book.html#dns

J-D

On Tue, May 24, 2011 at 4:37 PM, Jack Levin <[EMAIL PROTECTED]> wrote:
> figured it out... the /etc/hosts file has ip to name, and the name used by
> zookeeper was *.prod.imageshack.com, while the hostname was
> imgXX.imageshack.us... used by Regionserver/Master. Ideally, all
> three components should source hostnames from the same place, whether it's
> hostname or /etc/hosts (or dns), etc... it has to be consistent,
> otherwise aliases end up screwing things up and people will end up
> guessing why things don't work.
>
> -Jack
>
> On Tue, May 24, 2011 at 4:04 PM, Jack Levin <[EMAIL PROTECTED]> wrote:
>> img645.prod.imageshack.us and img645.imageshack.us both point to
>> the same IP.
>>
>> -Jack
>>
>> On Tue, May 24, 2011 at 3:50 PM, Jack Levin <[EMAIL PROTECTED]> wrote:
>>> looks like our balancer is on:
>>>
>>> hbase(main):001:0> balance_switch true
>>> true
>>> 0 row(s) in 0.3700 seconds
>>>
>>> I simply kill the PID for the RS, and it stays on the list with regions
>>> assigned, and master does not know about it.
>>>
>>> So it still does not work.
>>>
>>> -Jack
>>>
>>> On Tue, May 24, 2011 at 3:43 PM, Dave Latham <[EMAIL PROTECTED]> wrote:
>>>> Are you using the graceful_stop script?
>>>>
>>>> In 0.90.3 the bin/graceful_stop.sh script was updated to disable the
>>>> master's balancer. However, it doesn't seem that anything re-enables it, so
>>>> if you're using it you need to re-enable it on your own. See the book for
>>>> more details:
>>>> http://hbase.apache.org/book.html#decommission
>>>>
>>>> Dave
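For anyone reading the master log at the top of the thread: a server name there has the form host,port,startcode. Here is a minimal Python sketch (the class and function names are made up for illustration and are not HBase API) that splits the two names from the log. It suggests they share a port and startcode and differ only in the host part, which matches the /etc/hosts vs. hostname mismatch described above.

```python
from typing import NamedTuple


class ParsedServerName(NamedTuple):
    """Illustrative container only; not a class from HBase."""
    host: str
    port: int
    startcode: int


def parse_server_name(text: str) -> ParsedServerName:
    # Split from the right so only the last two commas act as separators.
    host, port, startcode = text.rsplit(",", 2)
    return ParsedServerName(host, int(port), int(startcode))


# Both strings are copied from the master log quoted in this thread.
znode = parse_server_name("img645.prod.imageshack.com,60020,1306276075768")
registered = parse_server_name("img645.imageshack.us,60020,1306276075768")

# Same port and startcode, but a different host string.
same_process = (znode.port, znode.startcode) == (registered.port, registered.startcode)
same_host = znode.host == registered.host
print(same_process, same_host)  # True False
```

Comparing only the host part is a quick way to spot this kind of alias problem when scanning logs by hand.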
Jack Levin 2011-05-25, 02:03
"HBase uses the local hostname to self-report it's IP address."

Using 'hostname' as the authoritative name for the regionserver is what caused all of the confusion. The hostname is usually not governed by name resolution (/etc/hosts, DNS); some users may call their servers something other than what's in DNS, and HBase will break for them if they do. A better idea would be to check eth0 for its IP, get the reverse DNS name for it, and use that.

Just my two cents.

-Jack
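A rough Python sketch of that suggestion, assuming the interface address is already known (finding eth0's address is left out, and name_for_address is a made-up helper, not anything in HBase). It asks the system resolver for the name that belongs to the IP and falls back to the IP string when nothing is found.

```python
import socket


def name_for_address(ip, lookup=socket.gethostbyaddr):
    """Return the resolver's primary name for `ip`, or `ip` itself if there is none.

    `lookup` is injectable so the function can be exercised without a network.
    """
    try:
        primary, _aliases, _addresses = lookup(ip)
    except (socket.herror, socket.gaierror):
        # No reverse entry was found: fall back to the raw address.
        return ip
    # Drop a trailing dot so "host.example." and "host.example" compare equal.
    return primary.rstrip(".")
```

Whether the answer comes from /etc/hosts or from DNS depends on how the resolver is configured on the box, so the two sources would still have to agree for the result to be stable.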
Jack Levin 2011-05-25, 02:11
Like this:

19:09:23 208.94.1.52 jack@zero:~ $ host 38.99.76.204
204.76.99.38.in-addr.arpa domain name pointer img646.imageshack.us.
19:10:26 208.94.1.52 jack@zero:~ $

This is the name I wanted it to use. It appears that with the current setup, we can't change hostnames.

-Jack
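For reference, the name that host queries in the output above is just the address with its octets reversed under in-addr.arpa. Python's standard ipaddress module can build it; this is shown only to make the lookup concrete, not as anything HBase does.

```python
import ipaddress

# Address taken from the `host` example in this message.
address = ipaddress.ip_address("38.99.76.204")

# reverse_pointer gives the PTR record name that a reverse lookup asks for.
ptr_name = address.reverse_pointer
print(ptr_name)  # 204.76.99.38.in-addr.arpa
```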
It's different in 0.92.0, Jack. We'll use whatever the master tells us our name is, not what the regionserver finds for its name.

St.Ack