-
region server not response
yutoo yanio 2012-11-15, 12:18
Hi everybody,

In my cluster, a region server sometimes stops responding and we get an exception like this:
"org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 22 actions: servers with issues: n6:60020"

After this exception I checked the log on n6. It contains no exception, but the region server does not respond.
When I run "hbase-daemon.sh stop regionserver" on n6, the command never completes (it keeps printing dots on the screen).

I have to run "kill -9" to kill the region server process and then start it again.

What happened?
Thanks.
-
Re: region server not response
ramkrishna vasudevan 2012-11-15, 17:20
Check your UI. Does it show any regions in transition?
Can you try doing kill -9 and restarting the region server?

Regards
Ram
-
Re: region server not response
yutoo yanio 2012-11-17, 04:08
My region server stopped after 30 minutes!

This is my shutdown log. After "2012-11-15 15:38:44,170 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver60020.leaseChecker closed leases" it waited about 30 minutes before logging "2012-11-15 16:08:10,361 INFO org.apache.hadoop.hbase.regionserver.StoreFile: NO General Bloom and NO DeleteFamily was added to...".

Why?

2012-11-15 15:38:43,538 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server listener on 60020
2012-11-15 15:38:43,538 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 4 on 60020: exiting
2012-11-15 15:38:43,542 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
2012-11-15 15:38:43,542 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
2012-11-15 15:38:43,542 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: Sending interrupt to stop the worker thread
2012-11-15 15:38:43,562 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
2012-11-15 15:38:43,563 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: SplitLogWorker interrupted while waiting for task, exiting: java.lang.InterruptedException
2012-11-15 15:38:43,563 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: SplitLogWorker n6,60020,1352965568246 exiting
2012-11-15 15:38:43,572 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:60030
2012-11-15 15:38:43,701 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 483.41 MB of total=4.03 GB
2012-11-15 15:38:43,737 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=483.42 MB, total=3.56 GB, single=1.01 GB, multi=2.97 GB, memory=0 KB
2012-11-15 15:38:44,169 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver60020.leaseChecker closing leases
2012-11-15 15:38:44,170 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver60020.leaseChecker closed leases
2012-11-15 16:08:10,361 INFO org.apache.hadoop.hbase.regionserver.StoreFile: NO General Bloom and NO DeleteFamily was added to HFile (hdfs://m2/hbase/table1/cc44e74d2b64c638d5e9d88b33b71633/.tmp/d204665e2a7446758e3575057f48bc4f)
2012-11-15 16:08:10,361 INFO org.apache.hadoop.hbase.regionserver.Store: Flushed , sequenceid=275054780, memsize=128.0m, into tmp file hdfs://m2/hbase/table1/cc44e74d2b64c638d5e9d88b33b71633/.tmp/d204665e2a7446758e3575057f48bc4f
2012-11-15 16:08:10,374 DEBUG org.apache.hadoop.hbase.regionserver.Store: Renaming flushed file at hdfs://m2/hbase/table1/cc44e74d2b64c638d5e9d88b33b71633/.tmp/d204665e2a7446758e3575057f48bc4f to hdfs://m2/hbase/table1/cc44e74d2b64c638d5e9d88b33b71633/table2/d204665e2a7446758e3575057f48bc4f
2012-11-15 16:08:10,386 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://m2/hbase/table1/cc44e74d2b64c638d5e9d88b33b71633/table2/d204665e2a7446758e3575057f48bc4f, entries=541204, sequenceid=275054780, filesize=35.2m
2012-11-15 16:08:10,387 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.0m/134218592, currentsize=78.9m/82739248 for region table1,09366395507%20121105,1352812474398.cc44e74d2b64c638d5e9d88b33b71633. in 4514605ms, sequenceid=275054780, compaction requested=false
2012-11-15 16:08:10,388 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: regionserver60020.cacheFlusher exiting
2012-11-15 16:08:10,428 DEBUG org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: using new createWriter -- HADOOP-6840
2012-11-15 16:08:10,428 DEBUG org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Path=hdfs://m2/hbase/.logs/n6,60020,1352965568246/n6%2C60020%2C1352965568246.1352983090387, syncFs=true, hflush=false, compression=false
2012-11-15 16:08:10,475 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/n6,60020,1352965568246/n6%2C60020%2C1352965568246.1352976383263, entries=12, filesize=1742. for /hbase/.logs/n6,60020,1352965568246/n6%2C60020%2C1352965568246.1352983090387
2012-11-15 16:08:10,475 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Last sequenceid written is empty. Deleting all old hlogs
2012-11-15 16:08:10,475 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: moving old hlog file /hbase/.logs/n6,60020,1352965568246/n6%2C60020%2C1352965568246.1352976383263 whose highest sequenceid is 275054780 to /hbase/.oldlogs/n6%2C60020%2C1352965568246.1352976383263
2012-11-15 16:08:10,487 INFO org.apache.hadoop.hbase.regionserver.LogRoller: LogRoller exiting.
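
The pause described above can be located mechanically. The following is only a minimal sketch, assuming every log entry starts with a timestamp in the "2012-11-15 15:38:44,170" form seen in this log; the helper names parse_timestamp and largest_gap are hypothetical and not part of HBase. It scans log lines and reports the widest gap between two consecutive timestamped entries.

```python
from datetime import datetime

# Assumption: log entries begin with "YYYY-MM-DD HH:MM:SS,mmm" as in the log above.
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = len("2012-11-15 15:38:44,170")


def parse_timestamp(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    try:
        return datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
    except ValueError:
        return None


def largest_gap(lines):
    """Return (gap_seconds, earlier_line, later_line) for the widest pause, or None."""
    best = None
    previous = None  # (timestamp, line) of the last timestamped entry seen
    for line in lines:
        stamp = parse_timestamp(line)
        if stamp is None:
            continue  # continuation lines without a timestamp are skipped
        if previous is not None:
            gap = (stamp - previous[0]).total_seconds()
            if best is None or gap > best[0]:
                best = (gap, previous[1], line)
        previous = (stamp, line)
    return best


# Three consecutive entries copied from the shutdown log in this message.
sample = [
    "2012-11-15 15:38:44,169 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver60020.leaseChecker closing leases",
    "2012-11-15 15:38:44,170 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver60020.leaseChecker closed leases",
    "2012-11-15 16:08:10,361 INFO org.apache.hadoop.hbase.regionserver.StoreFile: NO General Bloom and NO DeleteFamily was added to HFile",
]

result = largest_gap(sample)
if result is not None:
    seconds, earlier, later = result
    print("largest gap: %.3f s (about %.1f min)" % (seconds, seconds / 60))
    print("before:", earlier)
    print("after: ", later)
```

On a real region server log the same function could be fed the open file object instead of the sample list. Note also that the "Finished memstore flush" entry reports "in 4514605ms", which is roughly 75 minutes, so the flush appears to have started well before the stop was requested; this may be what the shutdown was waiting on, although the log alone does not confirm it.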
-
Re: region server not response
ramkrishna vasudevan 2012-11-17, 05:01
Could you send the exact exception that you got? The logs that you attached come from after the region server started going down.

Regards
Ram
-
Re: region server not response
yutoo yanio 2012-11-17, 05:41
As I said, no exception occurred.
-
Re: region server not response
ramkrishna vasudevan 2012-11-17, 08:06
Oh, sorry, I missed that. :)