HBase, mail # user - Never ending "Doing distributed log split" task.,


Re: Never ending "Doing distributed log split" task.,
Jean-Marc Spaggiari 2013-08-29, 16:50
Hadoop 1.0.4 with HBase 0.94.12-SNAPSHOT

The file name changed since I have restarted HBase but here is what I have:
hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -ls hdfs://node3:9000/hbase/.logs/node1,60020,1377793020654/
Found 1 items
-rw-r--r--   3 hbase supergroup          0 2013-08-29 12:17 /hbase/.logs/node1,60020,1377793020654/node1%2C60020%2C1377793020654.1377793021892
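
For reference, the log file name in this listing looks like a URL-encoded server name (host,port,startcode) followed by a dot and a millisecond timestamp. Below is a minimal Python sketch for decoding such a name, assuming that layout holds. `parse_wal_name` is a hypothetical helper written for illustration, not part of HBase or Hadoop.

```python
from datetime import datetime, timezone
from urllib.parse import unquote


def parse_wal_name(file_name):
    """Decode a log file name, assuming the layout
    <url-encoded host,port,startcode>.<creation time in ms>."""
    decoded = unquote(file_name)  # "%2C" becomes ","
    # The last dot separates the server name from the file timestamp.
    server_name, _, file_ts = decoded.rpartition(".")
    host, port, start_code = server_name.split(",")
    return {
        "host": host,
        "port": int(port),
        # Both numbers are treated as epoch milliseconds (an assumption).
        "start_code": datetime.fromtimestamp(int(start_code) / 1000, tz=timezone.utc),
        "created": datetime.fromtimestamp(int(file_ts) / 1000, tz=timezone.utc),
    }


if __name__ == "__main__":
    info = parse_wal_name("node1%2C60020%2C1377793020654.1377793021892")
    for key, value in info.items():
        print(key, value)
```

If the assumption is right, the decoded times are in UTC, so they can be compared with the modification time shown by `hadoop fs -ls` after adjusting for the local time zone.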

And I'm able to access it:
hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -get /hbase/.logs/node1,60020,1377793020654/node1%2C60020%2C1377793020654.1377793021892 .
hadoop@node3:~/hadoop-1.0.3$

Oh. I just checked the UI again, and it's done. Wow! It took almost 1h. HBCK
reports: 0 inconsistencies detected. Status: OK

So it seems that I'm all fine.

I don't know why it took so long. I will take a look at my Ganglia
metrics to see if I can figure anything out...

JM

2013/8/29 Ted Yu <[EMAIL PROTECTED]>

> What is your HBase / Hadoop version?
>
> Can you check the namenode log for lines related to
> hdfs://node3:9000/hbase/.logs/node1,60020,1377789460683-splitting/node1%2C60020%2C1377789460683.1377789462024 ?
>
> Thanks
>
>
> On Thu, Aug 29, 2013 at 9:03 AM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:
>
> > I have restarted my cluster and I'm now waiting for this task to end:
> >
> > Doing distributed log split in
> > [hdfs://node3:9000/hbase/.logs/node1,60020,1377789460683-splitting]
> >
> > It has been running for 30 minutes now. There was nothing running on the
> > cluster. No reads, no writes, nothing, for days...
> >
> > I got this in the logs:
> >
> > 2013-08-29 11:36:10,862 WARN org.apache.hadoop.hbase.regionserver.SplitLogWorker: log splitting of hdfs://node3:9000/hbase/.logs/node1,60020,1377789460683-splitting/node1%2C60020%2C1377789460683.1377789462024 interrupted, resigning
> > java.io.InterruptedIOException
> >     at org.apache.hadoop.hbase.util.FSHDFSUtils.recoverDFSFileLease(FSHDFSUtils.java:136)
> >     at org.apache.hadoop.hbase.util.FSHDFSUtils.recoverFileLease(FSHDFSUtils.java:54)
> >     at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(HLogSplitter.java:780)
> >     at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:414)
> >     at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:381)
> >     at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:112)
> >     at org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:280)
> >     at org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:211)
> >     at org.apache.hadoop.hbase.regionserver.SplitLogWorker.run(SplitLogWorker.java:179)
> >     at java.lang.Thread.run(Thread.java:722)
> > Caused by: java.lang.InterruptedException: sleep interrupted
> >     at java.lang.Thread.sleep(Native Method)
> >     at org.apache.hadoop.hbase.util.FSHDFSUtils.recoverDFSFileLease(FSHDFSUtils.java:118)
> >     ... 9 more
> > 2013-08-29 11:36:10,950 WARN org.apache.hadoop.hbase.regionserver.SplitLogWorker: Interrupted while trying to assert ownership of /hbase/splitlog/hdfs%3A%2F%2Fnode3%3A9000%2Fhbase%2F.logs%2Fnode1%2C60020%2C1377789460683-splitting%2Fnode1%252C60020%252C1377789460683.1377789462024
> > java.lang.InterruptedException
> >     at java.lang.Object.wait(Native Method)
> >     at java.lang.Object.wait(Object.java:503)
> >     at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1253)
> >     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1129)
> >     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160)
> >     at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:361)
> >     at org.apache.hadoop.hbase.regionserver.SplitLogWorker.attemptToOwnTask(SplitLogWorker.java:346)
> >     at org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:264)