Jean-Marc Spaggiari 2013-06-02, 15:09
On Sun, Jun 2, 2013 at 8:09 AM, Jean-Marc Spaggiari <[EMAIL PROTECTED]
> So, 2 things again here.
> 1) Should the region server send more information about the failure to
> the master so the master can display the failure cause in the logs?
Yes. You shouldn't have to work so hard to figure out the root cause (smile).
> 2) recovered.edits should not have been there. And the failure is
> because the SplitLogWorker tries to create a recovered.edits file but
> it already exists.
When we used to split, we'd make a recovered.edits FILE, but a good while
back we changed it to make a DIRECTORY into which we write many files,
possibly named for the sequenceid that starts each file. We did this to
handle failures during recovery, where multiple recovered.edits files
might get made. The complaint seems to be that we expect a directory
but there is a 'file' there instead.
Could this be a region made a long time ago w/ an old version of hbase?
> Now, what should we do? I can manually delete this file and the
> process will take over and everything will be fine, but should we
> update the SplitLogWorker or HLogSplitter to validate this file is not
> there first and delete it if required? What's the risk of deleting
> this file automatically? There should not be any other process writing
> it, right? Worst case, we can also rename it so if really required we
> can replay it later?
We should just move it aside w/ a warning, yes.
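Roughly what I have in mind, as a sketch only. It uses plain java.nio
against a local path rather than the Hadoop FileSystem API, and the class
name, method name, and the ".old" suffix are all made up for illustration;
none of this is the actual HLogSplitter code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.Logger;

// Hypothetical helper, not part of HBase.
public class RecoveredEditsGuard {
    private static final Logger LOG =
        Logger.getLogger(RecoveredEditsGuard.class.getName());

    /**
     * Makes sure regionDir/recovered.edits is a directory. If a plain file
     * with that name is in the way, it is renamed (not deleted) so it can
     * still be inspected or replayed later, and a warning is logged.
     *
     * @return the path the stale file was moved to, or null if nothing moved
     */
    public static Path ensureRecoveredEditsDir(Path regionDir) throws IOException {
        Path edits = regionDir.resolve("recovered.edits");
        Path movedTo = null;
        if (Files.isRegularFile(edits)) {
            // Assumed naming scheme: timestamp suffix to keep the old file apart.
            movedTo = regionDir.resolve(
                "recovered.edits." + System.currentTimeMillis() + ".old");
            Files.move(edits, movedTo);
            LOG.warning("Expected a directory at " + edits
                + " but found a file; moved it aside to " + movedTo);
        }
        // No-op if the directory already exists.
        Files.createDirectories(edits);
        return movedTo;
    }
}
```

Renaming instead of deleting keeps the old edits around in case they turn
out to matter, which covers the "worst case" in the question above.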
Good on you JM.
Ted Yu 2013-06-02, 15:46
Jean-Marc Spaggiari 2013-06-02, 17:05