HBase >> mail # user >> Multiple different failures


Re: Multiple different failures
Are you saying 97% of the data was lost, or was it offline until the region
servers came back up?

Varun
On Sat, Jun 1, 2013 at 6:31 PM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:

> Hi,
>
> Today I faced a power outage. 4 computers stayed up: the 3 ZK servers,
> the Master, the NN and 2 DN/RS. They were on UPS.
>
> While everything was going back up... Guess what... I faced a 2nd one!
>
> After bringing HBase up, about 97% of my data was missing.  (19M rows
> in my main table)
>
> I ran HBCK which found many issues and fixed, I think, all of them.
> (1013M rows in my main table now).
>
> I have not been able to identify why I lost all of that, but I noticed
> 2 small things.
>
> 1) I had about 900 unassigned regions in a table. Here is a log example:
>
> ERROR: Region { meta =>
> work_proposed,\xC9\x1F\x1F\x0F\x00\x00\x00\x00
> http://www.lawyerlocate.ca/lawyers/city_subs.php?province=5&city=956&category=2&subcategory=202,1366811662932.fdf1d3bf27c7c8bae77711b85473bb2d
> .,
> hdfs =>
> hdfs://node3:9000/hbase/work_proposed/fdf1d3bf27c7c8bae77711b85473bb2d,
> deployed =>  } not deployed on any region server.
> Trying to fix unassigned region...
> 13/06/01 17:37:11 INFO util.HBaseFsckRepair: Region still in
> transition, waiting for it to become assigned: {NAME =>
> 'work_proposed,\xC9\x1F\x1F\x0F\x00\x00\x00\x00
> http://www.lawyerlocate.ca/lawyers/city_subs.php?province=5&city=956&category=2&subcategory=202,1366811662932.fdf1d3bf27c7c8bae77711b85473bb2d.
> ',
> STARTKEY => '\xC9\x1F\x1F\x0F\x00\x00\x00\x00
> http://www.lawyerlocate.ca/lawyers/city_subs.php?province=5&city=956&category=2&subcategory=202
> ',
> ENDKEY => '\xC9\x86\x19\x8E\x00\x00\x00\x00
> http://home.yorkbbs.ca/MemberPostsList.aspx?spaceid=576287',
> ENCODED => fdf1d3bf27c7c8bae77711b85473bb2d,}
>
> So regions got re-assigned one by one... Was SOOOOO long... Should not
> HBCK try to re-assign all those regions in parallel, or at least with
> as many threads as we have region servers? Today it waits for the
> current region to be fully assigned and open before continuing, which
> takes a while.
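
A minimal sketch of the parallel idea above, assuming nothing about how HBCK is really structured. The class `ParallelAssignSketch`, the method `assignAll`, and the `assignAndWait` callback are hypothetical names for illustration, not HBase APIs. The callback stands in for "assign one region and wait until it is open", and the pool is capped at the number of region servers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch: this is not HBCK code and uses no HBase API.
final class ParallelAssignSketch {

    /**
     * Runs assignAndWait once per region on at most regionServerCount
     * threads and returns the results in the same order as the input.
     */
    static <R> List<R> assignAll(List<String> encodedRegionNames,
                                 int regionServerCount,
                                 Function<String, R> assignAndWait) {
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, regionServerCount));
        try {
            List<Callable<R>> tasks = new ArrayList<>();
            for (String region : encodedRegionNames) {
                tasks.add(() -> assignAndWait.apply(region));
            }
            List<R> results = new ArrayList<>();
            // invokeAll blocks until every task has finished.
            for (Future<R> done : pool.invokeAll(tasks)) {
                results.add(done.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while assigning", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("assignment failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With 900 regions and 2 region servers this would keep two assignments in flight instead of one. Whether the Master copes well with many concurrent assignment requests is a separate question that the sketch does not address.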
>
>
>
> 2) Might be good for HBCK to display the date/time on all lines. That
> helps to estimate the remaining time. Hole detection is not displaying
> that, and neither are some other fixes.
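
A small sketch of the timestamp suggestion above, assuming the goal is simply to give every progress line the same date/time prefix that the `util.HBaseFsckRepair` INFO line in the log excerpt already carries. `TimestampedOutputSketch`, `stamp`, and `report` are hypothetical names, not part of HBCK.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: not part of HBCK.
final class TimestampedOutputSketch {

    // Same layout as the existing INFO lines, e.g. "13/06/01 17:37:11".
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yy/MM/dd HH:mm:ss");

    /** Prefixes a message with the given time; kept separate so it is easy to test. */
    static String stamp(LocalDateTime when, String message) {
        return FORMAT.format(when) + " " + message;
    }

    /** Prints a message prefixed with the current local time. */
    static void report(String message) {
        System.out.println(stamp(LocalDateTime.now(), message));
    }
}
```

Routing lines such as "Trying to fix unassigned region..." through one helper like `report` would make the elapsed time between two fixes readable straight from the output.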
>
> The 2nd point is easy to fix, but the first one might be a bit more
> tricky. What do you think about it?
>
>
>
> JM
>