HBase >> mail # user >> Errors after major compaction


Re: Errors after major compaction
Thanks for the explanation, Ted.

I will try to apply HBASE-3789 and hope for the best, but my understanding is
that it doesn't really solve the problem; it only reduces the probability of
it happening, at least in one particular scenario. I would hope for a more
robust solution.
My concern is that the region allocation process seems to rely too much on
timing considerations and doesn't seem to take enough measures to guarantee
that conflicts do not occur. I understand that in a distributed environment,
when you don't get a timely response from a remote machine, you can't know
for sure whether it received the request. However, there are things that can
be done to mitigate this and reduce the conflict window significantly. For
example, when I run hbck it knows that some regions are multiply assigned;
the master could do the same and try to resolve the conflict. Another
approach would be to handle late responses: even if the response from the
remote machine arrives after it was assumed to be dead, the master should
have enough information to know it had created a conflict by assigning the
region to another server. An even better solution, I think, is for the RS to
periodically verify that it is indeed the rightful owner of every region it
holds and relinquish control over any region it no longer owns.
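To make the periodic ownership self-check concrete, here is a minimal sketch of the idea. All class and method names are invented for illustration; the real implementation would check the authoritative assignment in .META. or ZooKeeper rather than an in-memory map, and this is not the actual HBase API.

```java
import java.util.*;

// Hypothetical sketch: a region server periodically compares its open
// regions against an authoritative assignment view and closes any region
// it no longer owns, instead of continuing to serve (and possibly lose)
// writes for it.
public class OwnershipCheckSketch {
    // Stand-in for the authoritative view (.META./ZooKeeper in reality):
    // region name -> server currently assigned.
    static Map<String, String> authoritative = new HashMap<>();

    static class RegionServer {
        final String name;
        final Set<String> openRegions = new HashSet<>();
        RegionServer(String name) { this.name = name; }

        // Periodic self-check: relinquish any region whose authoritative
        // owner is some other server.
        List<String> verifyOwnership() {
            List<String> relinquished = new ArrayList<>();
            for (Iterator<String> it = openRegions.iterator(); it.hasNext(); ) {
                String region = it.next();
                if (!name.equals(authoritative.get(region))) {
                    it.remove();              // close the region locally
                    relinquished.add(region);
                }
            }
            return relinquished;
        }
    }

    public static void main(String[] args) {
        RegionServer rs1 = new RegionServer("rs1");
        RegionServer rs2 = new RegionServer("rs2");

        // The pathological state: both servers believe they hold region-A,
        // but the authoritative view says rs2 owns it.
        rs1.openRegions.add("region-A");
        rs2.openRegions.add("region-A");
        authoritative.put("region-A", "rs2");

        System.out.println("rs1 relinquished: " + rs1.verifyOwnership());
        System.out.println("rs2 relinquished: " + rs2.verifyOwnership());
    }
}
```

The check converges the system back to a single owner without master intervention, at the cost of one extra lookup per region per check interval.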
Obviously, a state where two RSs hold the same region is pathological and can
lead to data loss, as demonstrated in my case. The system should be able to
actively protect itself against such a scenario. It probably goes without
saying, but there is really nothing worse for a data storage system than data
loss.
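The hbck-style conflict detection suggested above could look roughly like the following sketch: collect the regions each server reports open and flag any region reported by more than one server. The names are illustrative, not the real HBase master internals.

```java
import java.util.*;

// Hypothetical sketch: given each server's report of open regions
// (inverted here into region -> reporting servers), flag regions held by
// more than one server so the master could force-close the extras.
public class DoubleAssignmentScan {
    static Map<String, Set<String>> findConflicts(Map<String, List<String>> reported) {
        Map<String, Set<String>> conflicts = new HashMap<>();
        for (Map.Entry<String, List<String>> e : reported.entrySet()) {
            Set<String> servers = new HashSet<>(e.getValue());
            if (servers.size() > 1) {
                conflicts.put(e.getKey(), servers);  // multiply assigned
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        Map<String, List<String>> reported = new HashMap<>();
        reported.put("region-A", Arrays.asList("rs1", "rs2")); // conflict
        reported.put("region-B", Arrays.asList("rs3"));        // healthy
        System.out.println(findConflicts(reported).keySet());  // prints [region-A]
    }
}
```

Run periodically by the master, such a scan would bound how long a double assignment can persist, rather than leaving detection to an operator running hbck.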

In my case the problem didn't happen in the initial phase but after
disabling and enabling a table with about 12K regions.

-eran

On Sun, Jul 3, 2011 at 23:49, Ted Yu <[EMAIL PROTECTED]> wrote:

> Let me try to answer some of your questions.
> The two paragraphs below were written along my reasoning which is in
> reverse
> order of the actual call sequence.
>
> For #4 below, the log indicates that the following was executed:
>  private void assign(final RegionState state, final boolean setOfflineInZK,
>      final boolean forceNewPlan) {
>    for (int i = 0; i < this.maximumAssignmentAttempts; i++) {
>      if (setOfflineInZK && !setOfflineInZooKeeper(state)) return;
>
> The above was due to the timeout which you noted in #2 which would have
> caused
> TimeoutMonitor.chore() to run this code (line 1787)
>
>      for (Map.Entry<HRegionInfo, Boolean> e: assigns.entrySet()){
>        assign(e.getKey(), false, e.getValue());
>      }
>
> This means there is a lack of coordination between
> AssignmentManager.TimeoutMonitor and OpenedRegionHandler.
>
> The reason I mention HBASE-3789 is that it is marked as an Incompatible
> change and is already in TRUNK.
> Applying HBASE-3789 to the 0.90 branch would change the behavior
> (timing) of region assignment.
>
> I think it makes sense to evaluate the effect of HBASE-3789 in 0.90.4.
>
> BTW, were the incorrect region assignments observed for a table with
> multiple initial regions?
> If so, I have HBASE-4010 in TRUNK, which speeds up initial region
> assignment by about 50%.
>
> Cheers
>
> On Sun, Jul 3, 2011 at 12:02 PM, Eran Kutner <[EMAIL PROTECTED]> wrote:
>
> > Ted,
> > So if I understand correctly, the theory is that because of the issue
> > fixed in HBASE-3789 the master took too long to detect that the region
> was
> > successfully opened by the first server, so it force-closed it and
> > transitioned it to a second server. But there are a few things about this
> > scenario I don't understand, probably because I don't know enough about
> the
> > inner workings of the region transition process, and I would appreciate
> it
> > if you can help me understand:
> > 1. The RS opened the region at 16:37:49.
> > 2. The master started handling the opened event at 16:39:54 - this delay
> > can
> > probably be explained by HBASE-3789
> > 3. At 16:39:54 the master log says: Opened region gs_raw_events,..... on