HBase, mail # user - Apparent data loss on 90.4 rc2 after partial zookeeper network partition (on MapR)


Re: Apparent data loss on 90.4 rc2 after partial zookeeper network partition (on MapR)
Jacques 2011-08-04, 16:40
Do you have any suggestions of things I should look at to confirm/deny these
possibilities?

The tables are very small and inactive (probably only 50-100 rows changing
per day).

Thanks,
Jacques

On Thu, Aug 4, 2011 at 9:09 AM, Ryan Rawson <[EMAIL PROTECTED]> wrote:

> Another possibility is the logs were not replayed correctly during the
> region startup.  We put in a lot of tests to cover this case, so it
> should not be so.
>
> Essentially the WAL replay looks at the current HFiles state, then
> decides which log entries to replay or skip. This is because a log
> might have more data than what is strictly missing from the HFiles.
>
> If the data that is missing is over 6 hours old, that is a very weird
> bug; it suggests to me that either an hfile is missing for some
> reason, or the WAL replay didn't include some edits for some reason.
>
> -ryan
>
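
To illustrate the replay/skip decision Ryan describes above: during region
open, each WAL edit's sequence id is compared against the highest sequence
id already persisted in the region's store files, and only the newer edits
are replayed. The snippet below is only a minimal sketch of that idea, not
HBase's actual code; the class and method names are made up.

    // Minimal sketch of sequence-id based WAL replay filtering.
    // This is NOT HBase's real implementation; all names are illustrative.
    public class WalReplaySketch {

        // Replay an edit only if it is newer than everything already
        // persisted in the region's HFiles.
        static boolean shouldReplay(long editSeqId, long maxSeqIdInHFiles) {
            return editSeqId > maxSeqIdInHFiles;
        }

        public static void main(String[] args) {
            long maxSeqIdInHFiles = 1200;              // from the store files
            long[] walEditSeqIds = {1150, 1201, 1350}; // edits found in the WAL

            for (long seqId : walEditSeqIds) {
                System.out.println("edit " + seqId + " -> "
                        + (shouldReplay(seqId, maxSeqIdInHFiles) ? "replay" : "skip"));
            }
        }
    }

This is also why a log can hold more data than what is strictly missing from
the HFiles: edits that were already flushed are simply skipped.
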
> On Thu, Aug 4, 2011 at 8:38 AM, Jacques <[EMAIL PROTECTED]> wrote:
> > Thanks for the feedback.  So you're inclined to think it would be at
> > the dfs layer?
> >
> > Is it accurate to say the most likely places where the data could
> > have been lost were:
> > 1. wal writes didn't actually get written to disk (no log entries to
> > suggest any issues)
> > 2. wal corrupted (no log entries suggesting any trouble reading the
> > log)
> > 3. not all split logs were read by regionservers (?? is there any way
> > to confirm this either way... should I look at the filesystem
> > somewhere?)
> >
> > Do you think the type of network partition I'm talking about is
> > adequately covered in existing tests? (Specifically running an
> > external zk cluster?)
> >
> > Have you heard if anyone else has been having problems with the
> > second 90.4 rc?
> >
> > Thanks again for your help.  I'm following up with the MapR guys as well.
> >
> > Jacques
> >
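
On Jacques' point 3 above (whether all split logs were actually read): if I
remember the 0.90 layout correctly, the master's log splitting writes
per-region edit files into a recovered.edits directory under each region
directory, and the region server removes them once they have been replayed.
One rough check is therefore to look for leftover recovered.edits files. A
hedged sketch using the Hadoop FileSystem API follows; the /hbase/mytable
path is a placeholder for a real table directory under hbase.rootdir, and
the client configuration will differ on MapR.

    // Rough sketch: look for leftover recovered.edits files under the region
    // directories of one table. Paths are placeholders; adjust for your cluster.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class FindLeftoverRecoveredEdits {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Hypothetical table directory under hbase.rootdir.
            Path tableDir = new Path("/hbase/mytable");

            for (FileStatus region : fs.listStatus(tableDir)) {
                if (!region.isDir()) continue;
                Path recovered = new Path(region.getPath(), "recovered.edits");
                if (fs.exists(recovered)) {
                    for (FileStatus edits : fs.listStatus(recovered)) {
                        System.out.println("Leftover split-log edits: " + edits.getPath());
                    }
                }
            }
        }
    }

Leftover files there would only mean edits were split but not yet replayed;
an empty or missing directory doesn't by itself prove nothing was lost
earlier in the pipeline.
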
> > On Wed, Aug 3, 2011 at 3:49 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]>
> > wrote:
> >
> >> Hi Jacques,
> >>
> >> Sorry to hear about that.
> >>
> >> Regarding MapR, I personally don't have hands-on experience so it's a
> >> little bit hard for me to help you. You might want to ping them and
> >> ask their opinion (and I know they are watching, Ted? Srivas?)
> >>
> >> What I can do is tell you whether things look normal from the HBase
> >> point of view, but I see you're not running with DEBUG so I might
> >> miss some information.
> >>
> >> Looking at the master log, it tells us that it was able to split the
> >> logs correctly.
> >>
> >> Looking at a few regionserver logs, they don't seem to report any
> >> issues replaying the logs, so that's good too.
> >>
> >> About the memstore questions, it's almost purely size-based (64MB). I
> >> say almost because we limit the number of WALs a regionserver can
> >> carry so that when it reaches that limit it force flushes the
> >> memstores with older edits. There's also a thread that rolls the
> >> latest log if it's more than an hour old, so in the extreme case it
> >> could take 32 hours for an edit in the memstore to make it to a
> >> StoreFile. It used to be that, without appends, rolling those files
> >> often would prevent losing anything older than 1 hour, but I haven't
> >> seen those issues since we started using appends. But you're not
> >> using HDFS, and I don't have MapR experience, so I can't really go
> >> any further...
> >>
> >> J-D
> >>
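
To spell out the 32-hour worst case J-D mentions: assuming the 0.90-era
defaults of hbase.regionserver.maxlogs = 32 and
hbase.regionserver.logroll.period = 1 hour (alongside the 64MB
hbase.hregion.memstore.flush.size he refers to), an edit in a nearly idle
memstore only gets force-flushed once 32 hourly-rolled WALs have piled up.
A trivial back-of-the-envelope sketch, with those values hard-coded as
assumptions rather than read from a live config:

    // Back-of-the-envelope for J-D's worst case. The two values below are the
    // assumed 0.90-era defaults, hard-coded here purely for illustration.
    public class WorstCaseFlushDelay {
        public static void main(String[] args) {
            int maxLogs = 32;             // hbase.regionserver.maxlogs
            double rollPeriodHours = 1.0; // hbase.regionserver.logroll.period (3600000 ms)

            // On a nearly idle region each roll adds one almost-empty WAL; only
            // when the count reaches maxLogs is the oldest memstore force-flushed.
            double worstCaseHours = maxLogs * rollPeriodHours;
            System.out.println("Worst case before an edit reaches a StoreFile: "
                    + worstCaseHours + " hours");
        }
    }
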
> >> On Tue, Aug 2, 2011 at 3:44 PM, Jacques <[EMAIL PROTECTED]> wrote:
> >> > Given the hardy reviews and timing, we recently shifted from 90.3
> >> > (apache) to 90.4rc2 (the July 24th one that Stack posted -- 0.90.4,
> >> > r1150278).
> >> >
> >> > We had a network switch go down last night which caused an apparent
> >> > network partition between two of our region servers and one or more
> >> > zk nodes.  (We're still piecing together the situation).  Anyway,
> >> > things *seemed* to recover fine.  However, this morning we realized
> >> > that we lost some data that was generated just before the problems
> >> > occurred.
> >> >
> >> > It looks like h002 went down nearly immediately at around 8pm while

Later replies in this thread:
Ryan Rawson 2011-08-04, 16:52
Jacques 2011-08-04, 17:48
Jean-Daniel Cryans 2011-08-04, 17:34
M. C. Srivas 2011-08-05, 03:13
Ryan Rawson 2011-08-05, 03:19
lohit 2011-08-05, 03:36
Todd Lipcon 2011-08-05, 04:01
M. C. Srivas 2011-08-05, 14:07
M. C. Srivas 2011-08-05, 15:52
Jean-Daniel Cryans 2011-08-05, 16:42
Ryan Rawson 2011-08-05, 16:45
Jean-Daniel Cryans 2011-08-05, 16:56
M. C. Srivas 2011-08-05, 17:21
Todd Lipcon 2011-08-05, 18:28
M. C. Srivas 2011-08-05, 18:57
Ramkrishna S Vasudevan 2011-08-05, 04:02