HBase, mail # user - Errors after major compaction

Re: Errors after major compaction
Eran Kutner 2011-07-04, 14:57
Sure, I'll do that.

-eran

On Mon, Jul 4, 2011 at 12:30, Ted Yu <[EMAIL PROTECTED]> wrote:

> Thanks for the understanding.
>
> Can you log a JIRA and put your ideas below in it?
>
>
>
> On Jul 4, 2011, at 12:42 AM, Eran Kutner <[EMAIL PROTECTED]> wrote:
>
> > Thanks for the explanation, Ted.
> >
> > I will try to apply HBASE-3789 and hope for the best, but my
> > understanding is that it doesn't really solve the problem; it only
> > reduces the probability of it happening, at least in one particular
> > scenario. I would hope for a more robust solution.
> > My concern is that the region allocation process seems to rely too
> > much on timing considerations and doesn't take enough measures to
> > guarantee that conflicts do not occur. I understand that in a
> > distributed environment, when you don't get a timely response from a
> > remote machine you can't know for sure whether it did or did not
> > receive the request; however, there are things that can be done to
> > mitigate this and reduce the conflict time significantly. For example,
> > when I run hbck it knows that some regions are multiply assigned; the
> > master could do the same and try to resolve the conflict. Another
> > approach would be to handle late responses: even if the response from
> > the remote machine arrives after it was assumed to be dead, the master
> > should have enough information to know it has created a conflict by
> > assigning the region to another server. An even better solution, I
> > think, is for the RS to periodically verify that it is indeed the
> > rightful owner of every region it holds and relinquish control over
> > any region it is not.
> > Obviously a state where two RSs hold the same region is pathological
> > and can lead to data loss, as demonstrated in my case. The system
> > should be able to actively protect itself against such a scenario. It
> > probably doesn't need saying, but there is really nothing worse for a
> > data storage system than data loss.
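A minimal sketch of the periodic ownership check suggested above, for illustration only. This is not HBase code: RegionOwnershipChecker, AssignmentView, RegionCloser and their methods are hypothetical stand-ins for whatever authoritative view of assignments (.META. or ZooKeeper) and region-close path a real implementation would use.

    // Hypothetical periodic self-check run on a region server (illustration
    // only, not HBase code). The RS asks an authoritative store who owns each
    // region it serves and closes any region no longer assigned to it.
    class RegionOwnershipChecker implements Runnable {
      interface AssignmentView {
        // Server the authoritative store believes owns the region.
        String getAssignedServer(String regionName);
      }
      interface RegionCloser {
        void closeRegion(String regionName);  // relinquish the region locally
      }

      private final String serverName;
      private final java.util.Set<String> onlineRegions;
      private final AssignmentView view;
      private final RegionCloser closer;

      RegionOwnershipChecker(String serverName,
                             java.util.Set<String> onlineRegions,
                             AssignmentView view, RegionCloser closer) {
        this.serverName = serverName;
        this.onlineRegions = onlineRegions;
        this.view = view;
        this.closer = closer;
      }

      public void run() {
        for (String region : onlineRegions) {
          String owner = view.getAssignedServer(region);
          if (owner != null && !owner.equals(serverName)) {
            // Another server is the rightful owner: stop serving the region
            // rather than risk a double assignment and conflicting writes.
            closer.closeRegion(region);
          }
        }
      }
    }

Scheduled with, say, a ScheduledExecutorService every 30 seconds, such a check would bound the time a double assignment can persist to roughly one period.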
> >
> > In my case the problem didn't happen in the initial phase but after
> > disabling and enabling a table with about 12K regions.
> >
> > -eran
> >
> >
> >
> > On Sun, Jul 3, 2011 at 23:49, Ted Yu <[EMAIL PROTECTED]> wrote:
> >
> >> Let me try to answer some of your questions.
> >> The two paragraphs below follow my reasoning, which is in reverse
> >> order of the actual call sequence.
> >>
> >> For #4 below, the log indicates that the following was executed:
> >> private void assign(final RegionState state, final boolean setOfflineInZK,
> >>     final boolean forceNewPlan) {
> >>   for (int i = 0; i < this.maximumAssignmentAttempts; i++) {
> >>     if (setOfflineInZK && !*setOfflineInZooKeeper*(state)) return;
> >>
> >> The above was due to the timeout you noted in #2, which would have
> >> caused TimeoutMonitor.chore() to run this code (line 1787):
> >>
> >>     for (Map.Entry<HRegionInfo, Boolean> e: assigns.entrySet()){
> >>       assign(e.getKey(), false, e.getValue());
> >>     }
> >>
> >> This means there is a lack of coordination between
> >> assignmentManager.TimeoutMonitor and OpenedRegionHandler.
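To make that lack of coordination concrete, here is a simplified, hypothetical model of the race rather than the real AssignmentManager code: the timeout chore decides to re-assign purely from how long the region has sat in PENDING_OPEN, and nothing stops a slow but ultimately successful open on the first server from completing alongside the new assignment.

    // Simplified model of the race (hypothetical types; not HBase internals).
    enum RegionStateKind { OFFLINE, PENDING_OPEN, OPEN }

    class TrackedRegion {
      volatile RegionStateKind state = RegionStateKind.PENDING_OPEN;
      volatile long stateTimestamp = System.currentTimeMillis();
      volatile String assignedServer;     // server the open request was sent to
    }

    class TimeoutChore implements Runnable {
      private final TrackedRegion region;
      private final long timeoutMs;
      private final java.util.function.Consumer<TrackedRegion> reassign;

      TimeoutChore(TrackedRegion region, long timeoutMs,
                   java.util.function.Consumer<TrackedRegion> reassign) {
        this.region = region;
        this.timeoutMs = timeoutMs;
        this.reassign = reassign;
      }

      public void run() {
        // The decision is based only on elapsed time in PENDING_OPEN. The
        // chore never asks the first server whether its open is still in
        // flight, so a late-but-successful open and this new assignment can
        // both complete, leaving two servers holding the same region.
        if (region.state == RegionStateKind.PENDING_OPEN
            && System.currentTimeMillis() - region.stateTimestamp > timeoutMs) {
          reassign.accept(region);
        }
      }
    }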
> >>
> >> The reason I mention HBASE-3789 is that it is marked as an Incompatible
> >> change and is in TRUNK already.
> >> The application of HBASE-3789 to the 0.90 branch would change the
> >> behavior (timing) of region assignment.
> >>
> >> I think it makes sense to evaluate the effect of HBASE-3789 in 0.90.4.
> >>
> >> BTW, were the incorrect region assignments observed for a table with
> >> multiple initial regions? If so, I have HBASE-4010 in TRUNK, which
> >> speeds up initial region assignment by about 50%.
> >>
> >> Cheers
> >>
> >> On Sun, Jul 3, 2011 at 12:02 PM, Eran Kutner <[EMAIL PROTECTED]> wrote:
> >>
> >>> Ted,
> >>> So if I understand correctly, the theory is that because of the issue
> >>> fixed in HBASE-3789 the master took too long to detect that the region
> >>> was successfully opened by the first server, so it force closed it and