HBase user mailing list - Apparent data loss on 90.4 rc2 after partial zookeeper network partition (on MapR)


Jacques 2011-08-02, 22:44
Jean-Daniel Cryans 2011-08-03, 22:49
Jacques 2011-08-04, 15:38
Ryan Rawson 2011-08-04, 16:09
Jacques 2011-08-04, 16:40
Ryan Rawson 2011-08-04, 16:52
Re: Apparent data loss on 90.4 rc2 after partial zookeeper network partition (on MapR)
Jacques 2011-08-04, 17:48
I will take a look and see what I can figure out.

Thanks for your help.

Jacques

On Thu, Aug 4, 2011 at 9:52 AM, Ryan Rawson <[EMAIL PROTECTED]> wrote:

> The regionserver logs that talk about the HLog replay might shed some
> light; they should tell you which entries were skipped, etc.  Also have
> a look at the HFile structure of the regions to see if there are holes.
> The HFile.main tool can come in handy here; you can run it as:
> hbase org.apache.hadoop.hbase.io.hfile.HFile
>
> it will give you usage.
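>
> For example, to dump the metadata of a single HFile (the path here is
> purely illustrative, pick a real one from your own layout under /hbase):
>
>   hbase org.apache.hadoop.hbase.io.hfile.HFile -v -m -f \
>     /hbase/mytable/1028785192/family/521749012345678
>
> -m prints the file info (including the max sequence id flushed into it),
> and -p would additionally dump the individual KeyValues.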
>
> MapR might be able to give you audit logs for the time in question;
> those could be useful as well.
>
>
>
> On Thu, Aug 4, 2011 at 9:40 AM, Jacques <[EMAIL PROTECTED]> wrote:
> > Do you have any suggestions of things I should look at to confirm/deny
> > these possibilities?
> >
> > The tables are very small and inactive (probably only 50-100 rows
> > changing per day).
> >
> > Thanks,
> > Jacques
> >
> > On Thu, Aug 4, 2011 at 9:09 AM, Ryan Rawson <[EMAIL PROTECTED]> wrote:
> >
> >> Another possibility is the logs were not replayed correctly during the
> >> region startup.  We put in a lot of tests to cover this case, so it
> >> should not be so.
> >>
> >> Essentially the WAL replay looks at the current HFiles state, then
> >> decides which log entries to replay or skip. This is because a log
> >> might have more data than what is strictly missing from the HFiles.
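> >>
> >> In code terms the decision is basically a sequence-id comparison per
> >> edit; a rough sketch (the helper names here are made up, this isn't
> >> the actual 0.90 code):
> >>
> >>   // replay only what the HFiles don't already contain
> >>   long maxFlushedSeqId = maxSequenceIdAcrossStoreFiles(region);
> >>   for (HLog.Entry entry : recoveredEdits) {
> >>     if (entry.getKey().getLogSeqNum() <= maxFlushedSeqId) {
> >>       continue;  // already persisted in an HFile, safe to skip
> >>     }
> >>     applyToMemstore(region, entry.getEdit());  // missing from HFiles
> >>   }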
> >>
> >> If the data that is missing is over 6 hours old, that is a very weird
> >> bug; it suggests to me that either an HFile is missing for some
> >> reason, or the WAL replay didn't include some entries for some reason.
> >>
> >> -ryan
> >>
> >> On Thu, Aug 4, 2011 at 8:38 AM, Jacques <[EMAIL PROTECTED]> wrote:
> >> > Thanks for the feedback.  So you're inclined to think it would be at
> >> > the dfs layer?
> >> >
> >> > Is it accurate to say the most likely places where the data could
> >> > have been lost are:
> >> > 1. wal writes didn't actually get written to disk (no log entries to
> >> > suggest any issues)
> >> > 2. wal corrupted (no log entries suggesting any trouble reading the
> >> > log)
> >> > 3. not all split logs were read by regionservers (is there any way to
> >> > confirm this either way... should I look at the filesystem some
> >> > place, e.g. with a check like the one below?)
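> >> >
> >> > For instance (guessing at the layout; the path is illustrative), I
> >> > could look for leftover recovered.edits files under each region,
> >> > which I understand is where the split logs end up:
> >> >
> >> >   hadoop fs -ls /hbase/<table>/<region>/recovered.edits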
> >> >
> >> > Do you think the type of network partition I'm talking about is
> >> > adequately covered in existing tests? (Specifically, running an
> >> > external zk cluster?)
> >> >
> >> > Have you heard whether anyone else has been having problems with the
> >> > second 90.4 rc?
> >> >
> >> > Thanks again for your help.  I'm following up with the MapR guys as
> >> > well.
> >> >
> >> > Jacques
> >> >
> >> > On Wed, Aug 3, 2011 at 3:49 PM, Jean-Daniel Cryans
> >> > <[EMAIL PROTECTED]> wrote:
> >> >
> >> >> Hi Jacques,
> >> >>
> >> >> Sorry to hear about that.
> >> >>
> >> >> Regarding MapR, I personally don't have hands-on experience so it's a
> >> >> little bit hard for me to help you. You might want to ping them and
> >> >> ask their opinion (and I know they are watching, Ted? Srivas?)
> >> >>
> >> >> What I can do is tell you whether things look normal from the HBase
> >> >> point of view, but I see you're not running with DEBUG so I might
> >> >> miss some information.
> >> >>
> >> >> Looking at the master log, it tells us that the master was able to
> >> >> split the logs correctly.
> >> >>
> >> >> Looking at a few regionserver logs, they don't report any issues
> >> >> replaying those logs, so that's good too.
> >> >>
> >> >> About the memstore questions: flushing is almost purely size-based
> >> >> (64MB). I say almost because we limit the number of WALs a
> >> >> regionserver can carry, so when it reaches that limit it force
> >> >> flushes the memstores with the oldest edits. There's also a thread
> >> >> that rolls the latest log if it's more than an hour old, so in the
> >> >> extreme case it could take 32 hours for an edit in the memstore to
> >> >> make it to a StoreFile. It used to be that without appends rolling
> >> >> those files
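> >> >>
> >> >> (For reference, the knobs behind all of the above, with what I
> >> >> believe are the 0.90 defaults:
> >> >>
> >> >>   hbase.hregion.memstore.flush.size = 67108864   (the 64MB)
> >> >>   hbase.regionserver.maxlogs = 32
> >> >>   hbase.regionserver.logroll.period = 3600000    (1 hour, in ms)
> >> >>
> >> >> so the extreme case is 32 logs x 1 hour per roll = 32 hours.)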
Jean-Daniel Cryans 2011-08-04, 17:34
M. C. Srivas 2011-08-05, 03:13
Ryan Rawson 2011-08-05, 03:19
lohit 2011-08-05, 03:36
Todd Lipcon 2011-08-05, 04:01
Ramkrishna S Vasudevan 2011-08-05, 04:02
M. C. Srivas 2011-08-05, 14:07
M. C. Srivas 2011-08-05, 15:52
Jean-Daniel Cryans 2011-08-05, 16:42
Ryan Rawson 2011-08-05, 16:45
Jean-Daniel Cryans 2011-08-05, 16:56
M. C. Srivas 2011-08-05, 17:21
Todd Lipcon 2011-08-05, 18:28
M. C. Srivas 2011-08-05, 18:57