HBase >> mail # user >> 0.90.3


Looks like our balancer is on:

hbase(main):001:0> balance_switch true
true
0 row(s) in 0.3700 seconds

I simply kill the RS process by PID, and it stays on the master's list
with regions assigned; the master does not notice that it is gone.

So it still does not work.

-Jack

On Tue, May 24, 2011 at 3:43 PM, Dave Latham <[EMAIL PROTECTED]> wrote:
> Are you using the graceful_stop script?
>
> In 0.90.3 the bin/graceful_stop.sh script was updated to disable the
> master's balancer.  However, it doesn't seem that anything re-enables it, so
> if you're using it you need to re-enable it on your own.  See the book for
> more details:
> http://hbase.apache.org/book.html#decommission
>
> Dave
>
> On Tue, May 24, 2011 at 3:33 PM, Jack Levin <[EMAIL PROTECTED]> wrote:
>
>> I just put the new HBase version on our test cluster and have been
>> testing it. So far, if I shut down an RS, the master does not reassign
>> its regions, and we remain inconsistent forever. Likewise, when a new RS
>> comes up, it does not get regions assigned to it. This is the master log:
>>
>>
>> 2011-05-24 15:30:57,724 DEBUG
>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher:
>> master:60000-0x1302094818900a4-0x1302094818900a4 Received ZooKeeper
>> Event, type=NodeDeleted, state=SyncConnected,
>> path=/hbase/rs/img645.prod.imageshack.com,60020,1306276075768
>> 2011-05-24 15:30:57,724 INFO
>> org.apache.hadoop.hbase.zookeeper.RegionServerTracker: RegionServer
>> ephemeral node deleted, processing expiration
>> [img645.prod.imageshack.com,60020,1306276075768]
>> 2011-05-24 15:30:57,724 INFO
>> org.apache.hadoop.hbase.zookeeper.RegionServerTracker: No HServerInfo
>> found for img645.prod.imageshack.com,60020,1306276075768
>> 2011-05-24 15:30:57,726 DEBUG
>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher:
>> master:60000-0x1302094818900a4-0x1302094818900a4 Received ZooKeeper
>> Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/rs
>> 2011-05-24 15:31:03,330 DEBUG
>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher:
>> master:60000-0x1302094818900a4-0x1302094818900a4 Received ZooKeeper
>> Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/rs
>> 2011-05-24 15:31:03,338 DEBUG
>> org.apache.hadoop.hbase.zookeeper.ZKUtil:
>> master:60000-0x1302094818900a4-0x1302094818900a4 Retrieved 32 byte(s)
>> of data from znode
>> /hbase/rs/img645.prod.imageshack.com,60020,1306276262774 and set
>> watcher; img645.prod.imageshack.com:60020
>> 2011-05-24 15:31:03,350 INFO
>> org.apache.hadoop.hbase.master.ServerManager: Server start rejected;
>> we already have img645.imageshack.us:60020 registered;
>> existingServer=serverName=img645.imageshack.us,60020,1306276075768,
>> load=(requests=0, regions=0, usedHeap=40, maxHeap=3995),
>> newServer=serverName=img645.imageshack.us,60020,1306276262774,
>> load=(requests=0, regions=0, usedHeap=23, maxHeap=3995)
>> 2011-05-24 15:31:03,350 INFO
>> org.apache.hadoop.hbase.master.ServerManager: Triggering server
>> recovery; existingServer img645.imageshack.us,60020,1306276075768
>> looks stale
>> 2011-05-24 15:31:03,353 DEBUG
>> org.apache.hadoop.hbase.master.ServerManager:
>> Added=img645.imageshack.us,60020,1306276075768 to dead servers,
>> submitted shutdown handler to be executed, root=false, meta=false
>> 2011-05-24 15:31:03,353 INFO
>> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler:
>> Splitting logs for img645.imageshack.us,60020,1306276075768
>> 2011-05-24 15:31:04,348 INFO
>> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler:
>> Reassigning 0 region(s) that img645.imageshack.us,60020,1306276075768
>> was carrying (skipping 0 regions(s) that are already in transition)
>> 2011-05-24 15:31:04,348 INFO
>> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished
>> processing of shutdown of img645.imageshack.us,60020,1306276075768
>> 2011-05-24 15:31:06,333 DEBUG
>> org.apache.hadoop.hbase.master.ServerManager: Server
>> img645.imageshack.us,60020,1306276262774 came back up, removed it from
>> the dead servers list
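
One thing visible in the log above is that the same region server instance (port 60020, start code 1306276075768) appears under two hostnames: img645.prod.imageshack.com in the ZooKeeper path and img645.imageshack.us in the ServerManager lines. The thread does not establish whether this is related to the missing reassignment. The following is only a rough sketch of how such mismatches could be picked out of a pasted log; the function names and the regular expression are illustrative assumptions, not part of HBase.

```python
import re
from collections import defaultdict

# Assumed shape of a 0.90-style server name in the log: host,port,startcode
SERVER_NAME = re.compile(r"([A-Za-z0-9.-]+),(\d+),(\d{10,})")


def hostnames_by_instance(log_text):
    """Group every hostname seen in the log by (port, startcode)."""
    # Strip email quote markers and join wrapped lines before matching.
    unquoted = " ".join(
        re.sub(r"^[>\s]+", "", line) for line in log_text.splitlines()
    )
    seen = defaultdict(set)
    for host, port, startcode in SERVER_NAME.findall(unquoted):
        seen[(port, startcode)].add(host)
    return seen


def mismatched(log_text):
    """Return only the instances that were logged under more than one hostname."""
    return {
        key: sorted(hosts)
        for key, hosts in hostnames_by_instance(log_text).items()
        if len(hosts) > 1
    }


# Three fragments copied from the log in this thread.
SAMPLE = """\
>> path=/hbase/rs/img645.prod.imageshack.com,60020,1306276075768
>> existingServer=serverName=img645.imageshack.us,60020,1306276075768,
>> newServer=serverName=img645.imageshack.us,60020,1306276262774,
"""

if __name__ == "__main__":
    for (port, startcode), hosts in mismatched(SAMPLE).items():
        print(port, startcode, hosts)
```

Run against the fragments in SAMPLE, this reports only the instance with start code 1306276075768, since the restarted instance (1306276262774) appears under a single hostname there.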