That's a long list of tests. Impressive.
I will add this one:
> -Meanness tests:
> 1. Killing the master
> 2. Performing a compaction
> 3. Table enable/disable
4. Moving one or more regions while a snapshot is running.
Killing the master or a RS will cause some regions to move, but bulk
imports followed by load balancing can also induce massive region moves.
2013/1/15, Nicolas Liochon <[EMAIL PROTECTED]>:
> I would be +1 on killing datanodes during the tests. I think we tend to
> under-analyze the impact of an HDFS error on HBase.
> See for example
> in the distributed log, we were considering a task as dead if the split was
> not done in 25s. If you were going to the dead DN to read the WAL, 25s was
> far from enough, and we were ending up doing the same split on multiple
> HDFS is a nice buddy, but it can't hide everything.
> On Tue, Jan 15, 2013 at 9:55 AM, Jonathan Hsieh <[EMAIL PROTECTED]> wrote:
>> My counter-argument here is that this would be a bug in HDFS as
>> opposed to HBase. It is good to know, but ideally shouldn't be exposed
>> at the HBase level. This test won't really make sense if there were a
>> different FS underneath.
>> That said, if you insist we can add it and report on this (lower
>> priority than the HBase-level problems, though).
>> On Mon, Jan 14, 2013 at 6:47 PM, Andrew Purtell <[EMAIL PROTECTED]>
>> > If a datanode goes down and it has an indirect bad effect on snapshots,
>> > this would be useful to know.
>> > For the HA NN item, I threw that in there for completeness' sake.
>> > Ideally a client like HBase wouldn't notice.
>> > On Mon, Jan 14, 2013 at 5:27 PM, Jonathan Hsieh <[EMAIL PROTECTED]>
>> >> I think killing data nodes and killing the HA NN are out of scope
>> >> from an HBase point of view.
>> > --
>> > Best regards,
>> > - Andy
>> > Problems worthy of attack prove their worth by hitting back. - Piet
>> > Hein
>> > (via Tom White)
>> // Jonathan Hsieh (shay)
>> // Software Engineer, Cloudera
>> // [EMAIL PROTECTED]