Re: Never ending distributed log split
2012/8/3, Jean-Daniel Cryans <[EMAIL PROTECTED]>:
> On Fri, Aug 3, 2012 at 8:15 AM, Jean-Marc Spaggiari
> <[EMAIL PROTECTED]> wrote:
>> Me again ;)
>>
>> I did some more investigation.
>
> It would really help to see the region server log although the fsck
> output might be enough.

I looked under every directory and only one contains a file.

http://pastebin.com/8Fea2EnA

It seems to be related to node1. On this server, everything seems to
have started correctly:
hadoop@node1:~$ /usr/local/jdk1.7.0_05/bin/jps
2211 DataNode
2938 Jps
2136 TaskTracker

hbase@node1:~$ /usr/local/jdk1.7.0_05/bin/jps
2419 HRegionServer
3708 Jps

In the node1 region server logs, I can see the same information,
which is that the file is not hosted anywhere.

2012-08-03 15:01:31,216 WARN org.apache.hadoop.hdfs.DFSClient: DFS
Read: java.io.IOException: Could not obtain block:
blk_4965382127800577452_15852
file=/hbase/.logs/node1,60020,1343908057567-splitting/node1%2C60020%2C1343908057567.1343914548297
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:2266)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:2060)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2221)
        at java.io.DataInputStream.read(DataInputStream.java:149)
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at java.io.DataInputStream.readFully(DataInputStream.java:169)
        at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1486)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1475)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1470)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:55)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:175)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:688)
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(HLogSplitter.java:850)
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(HLogSplitter.java:763)
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFileToTemp(HLogSplitter.java:384)
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFileToTemp(HLogSplitter.java:351)
        at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:113)
        at org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:266)
        at org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:197)
        at org.apache.hadoop.hbase.regionserver.SplitLogWorker.run(SplitLogWorker.java:165)
        at java.lang.Thread.run(Thread.java:722)
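
For reference, a command along these lines should show whether any
datanode still reports that block (just a sketch; the path is the
splitting directory from the trace above, and -files -blocks -locations
are standard hadoop fsck options):

hadoop@node1:~$ hadoop fsck \
    /hbase/.logs/node1,60020,1343908057567-splitting \
    -files -blocks -locations
# Lists every file under the splitting directory together with its blocks
# and the datanodes holding replicas; a block reported with no locations
# matches the "Could not obtain block" error above.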

> BTW you'll find 0.94.1 RC1 here:
> http://people.apache.org/~larsh/hbase-0.94.1-rc1/

Super, thanks! I will most probably try it instead of 0.94.0.
>> And I found that:
>>
>> http://pastebin.com/Bedm6Ldy
>>
>> Seems that no region is serving my logs. That's strange because all my
>> servers are up and fsck is telling me that FS is clean.
>
> I don't get the "Seems that no region is serving my logs" part. A
> region doesn't serve logs, it serves HFiles. You meant to say
> DataNode?

I was talking about the files under /hbase/.logs. Based on the
directory name, I thought they were some logs. Whatever this file is
supposed to be for, it seems it's not served by any datanode.
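For what it's worth, this is roughly how I am listing them (a sketch;
the directory name is the splitting dir from the trace above):

hbase@node1:~$ hadoop fs -ls /hbase/.logs/node1,60020,1343908057567-splitting
# The WALs HBase is trying to split are ordinary HDFS files, so their
# blocks are served by datanodes, not by region servers.
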
>> Can I just delete those files? What's the impact of such a delete? I
>> don't really worry about losing some data. It's a test environment,
>> but I really need it to start again.
>
> I wonder if it's related to:
> https://issues.apache.org/jira/browse/HBASE-6401
>
> Did you remove a datanode from the cluster as part of the maintenance?

It might be related to this Jira. Yes, I stopped all the datanodes for
the maintenance (I had to work on the power supply...). I had to do that
promptly, so I "just" stopped everything with init 0.
That's fine. Nothing was happening in the cluster for hours, so I'm not
really expecting to lose anything. I will try to delete the file...
Here are the logs where we can see the file creation:
http://pastebin.com/HBc28zab
Nothing weird in them, I think.
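Roughly the command I have in mind (a sketch; the path is the one from
the trace above, and one could equally move the file aside with
hadoop fs -mv to keep a copy):

hbase@node1:~$ hadoop fs -rm \
    '/hbase/.logs/node1,60020,1343908057567-splitting/node1%2C60020%2C1343908057567.1343914548297'
# Removes the unreadable WAL so the distributed log split no longer has
# anything to read from it; whatever edits it contained are lost, which
# is acceptable on this test cluster.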

When I removed the file, the region server crashed and had to be restarted.

Restart was not working:
2012-08-03 16:07:49,119 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: remote error
telling master we are up
org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.hbase.PleaseHoldException: Server
serverName=node1,60020

2012-08-03 16:07:46,112 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: remote error
telling master we are up
org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.hbase.PleaseHoldException: Server
serverName=node1,60020,1344024290513 rejected; we already have
node1,60020,1343998593757 registered with same hostname and port
        at org.apache.hadoop.hbase.master.ServerManager.checkAlreadySameHostPort(ServerManager.java:194)
        at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:153)
        at org.apache.hadoop.hbase.master.HMaster.regionServerStartup(HMaster.java:860)
        at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376)

        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:918)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.jav
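
If I read the message correctly, the master still has the previous node1
instance (startcode 1343998593757) registered, so the freshly restarted
one is rejected until that old registration goes away. Assuming the
default zookeeper.znode.parent of /hbase (the quorum shown in the prompt
is only illustrative), something like the sketch below should show which
region servers the cluster currently considers live:

hbase@node1:~$ hbase zkcli
[zk: localhost:2181(CONNECTED) 0] ls /hbase/rs
# Lists the ephemeral znodes of the registered region servers; the stale
# node1,60020,1343998593757 entry should go away once its ZooKeeper
# session expires, after which the restart should get through.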