HBase, mail # dev - Online snapshots progress.


Re: Online snapshots progress.
Ted Yu 2012-12-14, 17:37
Thanks for the update, Jon.

bq. if splits or balancing occur while snapshotting, the region moves
cause the final snapshot verification step to abort

The split or balancing happened during the snapshot verification step, right?
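
To make the failure mode concrete, here is a minimal sketch of a region-set check that would abort after a split. It is only an illustration under assumptions: the function name, the region names, and the set comparison are hypothetical and are not taken from the HBase snapshot code, and it models a split only, not a balancer move.

```python
# Hypothetical model of a final verification step; none of these names are HBase APIs.
def verify_snapshot(snapshotted_regions, current_regions):
    """Compare the regions captured by the snapshot with the regions online now."""
    snapshotted = set(snapshotted_regions)
    current = set(current_regions)
    missing = current - snapshotted      # online now, but not in the snapshot
    unexpected = snapshotted - current   # in the snapshot, but no longer online
    ok = not missing and not unexpected
    return (ok, sorted(missing), sorted(unexpected))


# Regions captured when the snapshot started.
at_start = ["r1", "r2", "r3"]

# No split: the region list is unchanged, so the check passes.
stable = verify_snapshot(at_start, ["r1", "r2", "r3"])

# r2 splits into r2a and r2b while the snapshot is in flight,
# so the check cannot tell whether it has all regions and fails.
after_split = verify_snapshot(at_start, ["r1", "r2a", "r2b", "r3"])
```

In this model the timing of the split matters less than the fact that the region list seen at verification differs from the one captured at the start.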

On Fri, Dec 14, 2012 at 9:17 AM, Jonathan Hsieh <[EMAIL PROTECTED]> wrote:

> Hey folks,
>
> I've been testing and finding bugs on a branch of online snapshots for the
> past few days. The good news is that taking an online snapshot seems to be
> fairly robust -- I've been taking online snapshots as quickly as possible
> on a 5 node cluster being battered by a performance eval random write run.
>
>
> As expected, we ran into some hiccups. In my last run of the
> PE/online-snapshotting, it looks like 88/100 snapshots succeeded. This is
> OK; some failures are actually expected (the first cut only claims better
> consistency than 'copytable' and 'only-on-a-sunny-day' semantics). From a
> quick look at what caused the failed cases: if splits or balancing
> occur while snapshotting, the region moves cause the final snapshot
> verification step to abort because we look for the new regions and don't
> know if we have all regions.  We've also found some problems with splits of
> hfilelinks (HBASE-7339), and we've encountered occasional failed or hung
> clone attempts (HBASE-7352), and an occasional ZK-related slow abort.  As
> they are found and characterized, I've been filing them under HBASE-6055
> (offline-snapshots) or HBASE-7290 (online-snapshots).
>
> I'm going to switch from bug fixing mode back to patch polishing mode today
> to get some of this committed to the snapshot dev branch.  Here's how I
> hope to deal with them moving forward.
>
> I'll be polishing the pieces I've been testing (there are about 5-7 patches
> in-flight currently) and putting updated pieces up for review.  There is
> non-trivial overhead maintaining this many patches "in the future".  Since
> this is a dev branch, I'm going to ask that reviews of these initial big
> dev-branch patches focus on understandability, and that your +1's let
> us punt to follow-on jiras and TODOs more frequently than if you were
> reviewing for trunk.  The sooner we get the skeleton in, the easier
> collaboration will be with other folks working on and testing the same
> branch.  Ideally, getting the large pieces in would allow follow-ons to be
> easier to review and tackle.  The promise here, of course, is that many of
> these follow-on jiras, bugs (deadlocks, hangs), and testing evidence will be
> blockers before merging offline snapshots to trunk and merging online
> snapshots to trunk.
>
> Sound good?
>
> We initially had one snapshot branch (offline snapshots) but I'm
> proposing having two: the offline-snapshot branch and the online-snapshot
> branch.  Jesse's been the master of the offline branch and pushing
> dev-branch patches to that branch (
> https://github.com/jyates/hbase/tree/snapshots).  I'd like to soon begin
> pushing dev-branch *reviewed commits* for online-snapshots to another
> branch. For those following, here's an explanation of how I'm working.
>
> * The latest patches for review will always be on Review Board.
> * Branch-committed portions (patches reviewed and +1'ed for the branch) for
> online snapshots will live here:
> https://github.com/jmhsieh/hbase/tree/snapshots.  My branch will
> periodically be force pushed to deal with rebases onto constantly updating
> trunk, and to include offline-branch committed patches.
> * The latest working and consolidated online-snapshot branch (commits
> correspond to HBASE jiras) will live at
> https://github.com/jmhsieh/hbase/tree/snapshots-work .  This branch is
> subject to frequent forced pushes.  It is a cleanup step done to prep
> patches for review and to match what the eventual commit structure would
> look like.  It also contains some patches that may be abandoned or reordered.
> * Rough incremental in-progress branches live here,
> https://github.com/jmhsieh/hbase/tree/snapshot-work-1213  (change 1213