HBase >> mail # user >> Errors after major compaction


Re: Errors after major compaction
Appreciate it, sorry I didn't get to it sooner. Had some crazy days :)

-eran

On Tue, Jul 5, 2011 at 17:19, Ted Yu <[EMAIL PROTECTED]> wrote:

> Eran:
> I logged https://issues.apache.org/jira/browse/HBASE-4060 for you.
>
> On Mon, Jul 4, 2011 at 2:30 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
> > Thanks for the understanding.
> >
> > Can you log a JIRA and put your ideas below in it?
> >
> >
> >
> > On Jul 4, 2011, at 12:42 AM, Eran Kutner <[EMAIL PROTECTED]> wrote:
> >
> > > Thanks for the explanation Ted,
> > >
> > > I will try to apply HBASE-3789 and hope for the best, but my
> > > understanding is that it doesn't really solve the problem, it only
> > > reduces the probability of it happening, at least in one particular
> > > scenario. I would hope for a more robust solution.
> > > My concern is that the region allocation process seems to rely too
> > > much on timing considerations and doesn't seem to take enough measures
> > > to guarantee conflicts do not occur. I understand that in a distributed
> > > environment, when you don't get a timely response from a remote machine
> > > you can't know for sure whether it did or did not receive the request;
> > > however, there are things that can be done to mitigate this and reduce
> > > the conflict time significantly. For example, when I run hbck it knows
> > > that some regions are multiply assigned; the master could do the same
> > > and try to resolve the conflict. Another approach would be to handle
> > > late responses: even if the response from the remote machine arrives
> > > after it was assumed to be dead, the master should have enough
> > > information to know it had created a conflict by assigning the region
> > > to another server. An even better solution, I think, is for the RS to
> > > periodically test that it is indeed the rightful owner of every region
> > > it holds and relinquish control over the region if it's not.
> > > Obviously a state where two RSs hold the same region is pathological
> > > and can lead to data loss, as demonstrated in my case. The system
> > > should be able to actively protect itself against such a scenario. It
> > > probably doesn't need saying, but there is really nothing worse for a
> > > data storage system than data loss.
> > >
> > > In my case the problem didn't happen in the initial phase but after
> > > disabling and enabling a table with about 12K regions.
> > >
> > > -eran
> > >
> > >
> > >
> > > On Sun, Jul 3, 2011 at 23:49, Ted Yu <[EMAIL PROTECTED]> wrote:
> > >
> > >> Let me try to answer some of your questions.
> > >> The two paragraphs below were written along my reasoning, which is
> > >> in reverse order of the actual call sequence.
> > >>
> > >> For #4 below, the log indicates that the following was executed:
> > >> private void assign(final RegionState state, final boolean setOfflineInZK,
> > >>     final boolean forceNewPlan) {
> > >>   for (int i = 0; i < this.maximumAssignmentAttempts; i++) {
> > >>     if (setOfflineInZK && !setOfflineInZooKeeper(state)) return;
> > >>
> > >> The above was due to the timeout which you noted in #2, which would
> > >> have caused TimeoutMonitor.chore() to run this code (line 1787):
> > >>
> > >>     for (Map.Entry<HRegionInfo, Boolean> e: assigns.entrySet()){
> > >>       assign(e.getKey(), false, e.getValue());
> > >>     }
> > >>
> > >> This means there is a lack of coordination between
> > >> AssignmentManager.TimeoutMonitor and OpenedRegionHandler.
> > >>
> > >> The reason I mention HBASE-3789 is that it is marked as Incompatible
> > >> change and is in TRUNK already. The application of HBASE-3789 to the
> > >> 0.90 branch would change the behavior (timing) of region assignment.
> > >>
> > >> I think it makes sense to evaluate the effect of HBASE-3789 in 0.90.4
> > >>
> > >> BTW, were the incorrect region assignments observed for a table with
> > >> multiple initial regions?
> >
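[Editor's note] Eran's suggestion above — that each region server periodically verify it is the rightful owner of every region it holds and relinquish any region it is not — can be sketched roughly as follows. This is a toy illustration, not HBase code: the `assignmentTable` map stands in for the master's authoritative view of assignments (which in HBase lives in ZooKeeper and .META.), and all class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class OwnershipAudit {
    // Stand-in for the master's authoritative assignment record:
    // region name -> name of the server that rightfully owns it.
    static final ConcurrentMap<String, String> assignmentTable =
            new ConcurrentHashMap<>();

    static class RegionServer {
        final String name;
        final Set<String> heldRegions = ConcurrentHashMap.newKeySet();

        RegionServer(String name) { this.name = name; }

        // One audit pass: relinquish every held region whose recorded
        // owner is some other server (or nobody at all).
        List<String> auditOnce() {
            List<String> relinquished = new ArrayList<>();
            for (String region : heldRegions) {
                if (!name.equals(assignmentTable.get(region))) {
                    heldRegions.remove(region); // close the region locally
                    relinquished.add(region);
                }
            }
            return relinquished;
        }
    }

    public static void main(String[] args) {
        RegionServer rs1 = new RegionServer("rs1");
        RegionServer rs2 = new RegionServer("rs2");

        // Master records region-A as belonging to rs1, but a late or
        // duplicate open has left rs2 also serving it -- the pathological
        // double assignment described in the thread.
        assignmentTable.put("region-A", "rs1");
        rs1.heldRegions.add("region-A");
        rs2.heldRegions.add("region-A");

        // rs2's periodic audit detects it is not the rightful owner
        // and drops the region; rs1's audit keeps it.
        System.out.println("rs2 relinquished: " + rs2.auditOnce());
        System.out.println("rs1 relinquished: " + rs1.auditOnce());
    }
}
```

In a real deployment the check would compare against an ephemeral ZooKeeper node or the master's in-memory state rather than a shared map, and a relinquished region would be flushed and closed cleanly before being dropped.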