HBase >> mail # user >> Eternal RIT problem when RS tries to access wrong region-folder on HDFS


Dimitri Goldin 2013-05-02, 14:17
Re: Eternal RIT problem when RS tries to access wrong region-folder on HDFS
Hi Dimitry,

  That is interesting.  I have seen this before. Can you please send the
output of hadoop fs -lsr /hbase/documents?  This is going to be caused by a
bad split.  I will let you know which files you need to delete to safely
recover from this error.
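
One way to read such a listing is to look for store files that point at a region directory which no longer exists. The sketch below is only an illustration under two assumptions that the thread does not confirm: that paths follow the layout /hbase/<table>/<encoded-region>/<family>/<file>, and that a split leaves reference files named <hfile>.<parent-encoded-region> in the daughter region. The function name and the sample listing are hypothetical, not the poster's actual output.

```python
import re

# Assumed layout: /hbase/<table>/<encoded-region>[/<family>[/<file>]]
PATH_RE = re.compile(r"^/hbase/[^/]+/([0-9a-f]{32})(?:/[^/]+(?:/([^/]+))?)?$")
# Assumed reference-file naming after a split: <hfile>.<parent-encoded-region>
REF_RE = re.compile(r"^[0-9a-f]+\.([0-9a-f]{32})$")


def find_dangling_references(listing):
    """Return (path, parent_region) pairs whose parent directory is not listed."""
    regions = set()
    references = []
    for line in listing.splitlines():
        fields = line.split()
        if not fields:
            continue
        path = fields[-1]  # the path is the last column of `hadoop fs -lsr`
        match = PATH_RE.match(path)
        if match is None:
            continue
        regions.add(match.group(1))
        file_name = match.group(2)
        if file_name:
            ref = REF_RE.match(file_name)
            if ref:
                references.append((path, ref.group(1)))
    # A reference is dangling when its parent region has no directory at all.
    return [(path, parent) for path, parent in references if parent not in regions]


# Hypothetical listing, built from the ids quoted in this thread.
SAMPLE = """\
drwxr-xr-x   - hbase hbase    0 2013-05-02 10:00 /hbase/documents/79c619508659018ff3ef0887611eb8f7
drwxr-xr-x   - hbase hbase    0 2013-05-02 10:00 /hbase/documents/79c619508659018ff3ef0887611eb8f7/d
-rw-r--r--   3 hbase hbase 1024 2013-05-02 10:00 /hbase/documents/79c619508659018ff3ef0887611eb8f7/d/0707b1ec4c6b41cf9174e0d2a1785fe9.5b9c16898a371de58f31f0bdf86b1f8b
"""

for path, parent in find_dangling_references(SAMPLE):
    print("reference to missing region", parent, "in", path)
```

Any file this reports is only a candidate for inspection; whether it is safe to delete depends on the actual state of the table.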
On Thu, May 2, 2013 at 10:17 AM, Dimitri Goldin
<[EMAIL PROTECTED]> wrote:

> Hi,
>
> I have a strange RIT problem with a single region of our biggest table.
> After an hbck run (I wonder why the problem was only discovered then),
> HBase started trying to assign a region, which has been bouncing between
> OFFLINE/PENDING_OPEN/OPENING for two days.
>
> I already tried close_region/unassign with force and even the good old
> trick of deleting the /hbase node in ZooKeeper, but we still experience
> the same issue.
>
> Interestingly, the full region id is
> 'documents,7128586022887322720,1363696791400.79c619508659018ff3ef0887611eb8f7.'
> but in the exception the filename it tries to open is:
> '/hbase/documents/5b9c16898a371de58f31f0bdf86b1f8b/d/0707b1ec4c6b41cf9174e0d2a1785fe9'.
>
> Rough sequence from the logs seems to be the following:
>
> ==> * Received request to open region:
> documents,7128586022887322720,1363696791400.79c619508659018ff3ef0887611eb8f7.
>
> * Setting up tabledescriptor config now ...
>
> * Opening of region {NAME =>
> 'documents,7128586022887322720,1363696791400.79c619508659018ff3ef0887611eb8f7.',
>     STARTKEY => '7128586022887322720',
>     ENDKEY => '7130716361635801616',
>     ENCODED => 79c619508659018ff3ef0887611eb8f7,} failed, marking as
> FAILED_OPEN in ZK
>
> * File does not exist:
>
> /hbase/documents/5b9c16898a371de58f31f0bdf86b1f8b/d/0707b1ec4c6b41cf9174e0d2a1785fe9 [...]
> ==>
> As the Exception implies,
> '/hbase/documents/5b9c16898a371de58f31f0bdf86b1f8b' does not exist,
> while the '/hbase/documents/79c619508659018ff3ef0887611eb8f7' folder
> exists and contains all necessary files.
>
> I've checked .META., thinking that the region's ENCODED field might
> be broken, which is not the case judging by the third log message.
> Otherwise, I'm out of ideas as to how the encoded-region part might get
> switched with another value.
>
> Any ideas what might cause such behaviour and how to fix it?
>
> HBase version: 0.92.1-cdh4.1.2
>
> Complete log message including stacktrace of the FileNotFound
> Exception: http://fpaste.org/10005/04104136/ (Sorry for the format)
>
>
> Thanks in advance,
>     Dimitry
>
> --
> ----------------------------------
> Dimitry Goldin
> Software Developer
>
> Neofonie GmbH
> Robert-Koch-Platz 4
> 10115 Berlin
>
> T: +49 30 246 27 413
>
> [EMAIL PROTECTED]
> http://www.neofonie.de
>
> Handelsregister
> Berlin-Charlottenburg: HRB 67460
>
> Geschäftsführung:
> Thomas Kitlitschko
>

--
Kevin O'Dell
Systems Engineer, Cloudera
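
The mismatch reported in the quoted message can be checked mechanically by comparing the encoded name at the end of the region id with the region directory in the failing path. This is a minimal sketch, assuming the region id has the form <table>,<start key>,<timestamp>.<encoded>. and the path has the form /hbase/<table>/<encoded-region>/<family>/<file>, as in the values quoted above. The helper names are hypothetical.

```python
def encoded_name_from_region_id(region_id):
    """Take the encoded name: the part after the last '.', ignoring the trailing '.'."""
    return region_id.rstrip(".").rsplit(".", 1)[1]


def region_dir_from_path(path):
    """Take the region directory from /hbase/<table>/<region>/<family>/<file>."""
    parts = path.strip("/").split("/")
    return parts[2]


# Values copied from the quoted message.
region_id = ("documents,7128586022887322720,1363696791400."
             "79c619508659018ff3ef0887611eb8f7.")
failing_path = ("/hbase/documents/5b9c16898a371de58f31f0bdf86b1f8b/d/"
                "0707b1ec4c6b41cf9174e0d2a1785fe9")

expected = encoded_name_from_region_id(region_id)
actual = region_dir_from_path(failing_path)
if expected != actual:
    print("region", expected, "is trying to open a file under", actual)
```

A mismatch like this shows only that the region server is reading a file that belongs to a different region directory. It does not by itself show why.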
Dimitri Goldin 2013-05-03, 14:34
Kevin Odell 2013-05-03, 14:41
Dimitri Goldin 2013-05-07, 15:57