HBase, mail # user - All the base region server going down


Re: All the base region server going down
Azuryy Yu 2013-11-26, 04:20
1) Using pastebin would make the logs easier to read.
2) This does look like an HDFS issue, which caused all the region servers to crash.
On Tue, Nov 26, 2013 at 12:12 PM, Ted Yu <[EMAIL PROTECTED]> wrote:

> What versions of HBase and Hadoop are you using?
> Are you using NameNode HA?
>
> Can you look at your NameNode log around 18:06:00?
>
> Using pastebin would be better than copying screens.
>
> Cheers
>
>
> On Tue, Nov 26, 2013 at 11:27 AM, GuoWei <[EMAIL PROTECTED]> wrote:
>
>> Dear,
>>
>> Please help me find out why all the region servers went down at
>> 2013-11-25 16:20.
>>
>>
>> The logs below are from the master and one slave.
>>
>>
>> From Master:
>>
>> 2013-11-25 18:06:21,741 INFO
>> org.apache.hadoop.hbase.master.AssignmentManager$TimerUpdater:
>> master,60000,1385363388874.timerUpdater exiting
>> 2013-11-25 18:06:21,755 ERROR org.apache.hadoop.hbase.master.HMaster: Region server
>> slave10,60020,1385363390188 reported a fatal error:
>> ABORTING region server slave10,60020,1385363390188: Unrecoverable
>> exception while closing region
>> productdevice,20131122-1-354890041701600,1385348706791.a587f1a15b4a3b10fc0e87a804487532.,
>> still finishing close
>> Cause:
>> org.apache.hadoop.hbase.DroppedSnapshotException: region:
>> productdevice,20131122-1-354890041701600,1385348706791.a587f1a15b4a3b10fc0e87a804487532.
>>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1605)
>>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1479)
>>     at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:992)
>>     at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:956)
>>     at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:119)
>>     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>     at java.lang.Thread.run(Thread.java:662)
>> Caused by: java.io.IOException: Failed on local exception:
>> java.io.IOException: Connection reset by peer; Host Details : local host
>> is: "slave10/192.168.1.210"; destination host is: "master":8020;
>>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:763)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1241)
>>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>>     at $Proxy16.getFileInfo(Unknown Source)
>>     at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>>     at $Proxy16.getFileInfo(Unknown Source)
>>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:629)
>>     at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1545)
>>     at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:820)
>>     at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:380)
>>     at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1378)
>>     at org.apache.hadoop.hbase.regionserver.StoreFile$WriterBuilder.build(StoreFile.java:852)
>>     at org.apache.hadoop.hbase.regionserver.Store.createWriterInTmp(Store.java:924)