Accumulo >> mail # dev >> Mocking framework
Re: Mocking framework
On Fri, Oct 28, 2011 at 8:19 AM, John W Vines <[EMAIL PROTECTED]> wrote:

> I will start off by saying that myself, as well as most of the committers,
> are not familiar with EasyMock, PowerMock, or really any Mocking tools. But
> I've done some reading to determine what the scope of mocking can get us. So
> pardon me if I bore you because I'm just going to lay out what I know in
> hopes of A. making sure my understanding is correct and B. to see if what I
> think we should use it for is within the scope of Mocking as well as
> feasible.
> 1. With the exception of PowerMock, mocking tools are nothing more than tools
> which give the ability to define a pre-set set of responses to an interface.
> That is, if we wanted to implement a low level iterator to replace the
> Reader we use to read items off disk/out of the in memory map, we could, but
> we could only give it a set of responses to give when queried. We could not
> (through just the mocking tools) set it up to give programmatic responses,
> such as returning the result of another method, unless we 'pre-load' the
> method result. That is, we couldn't have one mock method which put an item
> in a map and another mock method pull it out again, unless we define the
> pull method to return that specific value for the Xth call to the pull
> method to which that correlated.

That isn't exactly correct. You can pass in a data structure so that
when someone makes a call (e.g. put(Thing)) you can catch that call and
store the argument in that data structure, which is shared by reference. For
example, check out EasyMock's IAnswer, which lets you programmatically
determine a 'return' result.
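
To make the idea concrete, here is a minimal sketch of a mock whose responses are computed per call from a shared data structure. It deliberately uses only the JDK (java.lang.reflect.Proxy) as a stand-in for the technique rather than EasyMock itself; with EasyMock the same effect would go through an IAnswer supplied via andAnswer. The ThingStore interface and the AnswerSketch class are hypothetical names invented for this illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface standing in for whatever is being mocked.
interface ThingStore {
    void put(String key, String value);
    String get(String key);
}

class AnswerSketch {
    // Builds a stand-in whose responses are computed on each call from a
    // shared map, instead of replaying a fixed, pre-loaded list of values.
    static ThingStore storeBackedBy(Map<String, String> backing) {
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "put":
                    // Catch the call and record its arguments in the shared map.
                    backing.put((String) args[0], (String) args[1]);
                    return null;
                case "get":
                    // Answer from whatever earlier calls stored.
                    return backing.get((String) args[0]);
                default:
                    throw new UnsupportedOperationException(method.getName());
            }
        };
        return (ThingStore) Proxy.newProxyInstance(
                ThingStore.class.getClassLoader(),
                new Class<?>[] {ThingStore.class},
                handler);
    }

    public static void main(String[] unused) {
        Map<String, String> backing = new HashMap<>();
        ThingStore store = storeBackedBy(backing);
        store.put("row1", "value1");
        System.out.println(store.get("row1"));            // prints "value1"
        System.out.println(backing.containsKey("row1"));  // prints "true"
    }
}
```

Because the test holds a reference to the backing map, it can both seed data before the call and inspect what the code under test wrote afterwards.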

> 2. PowerMock lets us essentially do the same as things like EasyMock,
> except we're not bound to interfaces. This way we can use a defined class
> and only overwrite the methods we want with predefined results.
> 3. What I want to see us doing, at a very high level, is to have the
> ability to mock an entire TServer to the extent where we will use something
> to replace Zookeeper (We should probably put our ZK work behind an interface)
> with a MockZookeeper (not generated through a Mock util) which is nothing
> more than a Map. Same thing with the FileReader, except a SortedMap, the
> loggers, and the master. This way we could fully implement a whole TServer
> without worrying about HDFS and Zookeeper. To a similar extent I would like to
> see this done for all core components, but mocking the various connectors we
> use to get done what we need to. I see a few sets of Mock classes we will have
> to create. But with less chance of divergence in behavior than we currently
> experience with our MockAccumulo setup.

I feel like this kind of thing (particularly with a fully mocked TServer)
would be part of Accumulo-14 <https://issues.apache.org/jira/browse/ACCUMULO-14>
and the testing suite. Otherwise, you end up with divergent implementations
and have the same mess where the mocks don't have any real correlation to
how the system works.
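
As a rough sketch of what a hand-written, Map-backed stand-in behind an interface could look like, along with one way to keep it from drifting: a shared set of contract checks that every implementation (the in-memory one and the real one) has to pass. All names here (CoordinationStore, InMemoryCoordinationStore, CoordinationStoreContract, the "/tservers/1" path) are hypothetical and are not existing Accumulo or ZooKeeper types.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical narrow interface; not an existing Accumulo or ZooKeeper type.
interface CoordinationStore {
    void write(String path, byte[] data);
    Optional<byte[]> read(String path);
    boolean delete(String path);
}

// Hand-written in-memory stand-in: nothing more than a Map.
class InMemoryCoordinationStore implements CoordinationStore {
    private final Map<String, byte[]> nodes = new ConcurrentHashMap<>();

    @Override
    public void write(String path, byte[] data) {
        nodes.put(path, data.clone());  // defensive copy of the caller's array
    }

    @Override
    public Optional<byte[]> read(String path) {
        byte[] data = nodes.get(path);
        return data == null ? Optional.empty() : Optional.of(data.clone());
    }

    @Override
    public boolean delete(String path) {
        return nodes.remove(path) != null;  // true only if something was removed
    }
}

// Contract checks meant to be run against every implementation, fake or real,
// so that behavioral divergence shows up as a failing check.
class CoordinationStoreContract {
    static void check(CoordinationStore store) {
        byte[] payload = "tserver-1".getBytes(StandardCharsets.UTF_8);
        store.write("/tservers/1", payload);
        byte[] readBack = store.read("/tservers/1")
                .orElseThrow(() -> new AssertionError("missing node"));
        require(Arrays.equals(payload, readBack), "read returns what was written");
        require(store.delete("/tservers/1"), "delete reports an existing node");
        require(!store.read("/tservers/1").isPresent(), "deleted node is gone");
        require(!store.delete("/tservers/1"), "second delete removes nothing");
    }

    private static void require(boolean ok, String what) {
        if (!ok) throw new AssertionError("contract violated: " + what);
    }
}
```

Calling CoordinationStoreContract.check(new InMemoryCoordinationStore()) exercises the fake; the same call against an adapter over the real service would need to live in the integration test suite.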

However, a lot of the time you really don't want to (and shouldn't) mock
the full interface of a component (e.g. a full mock TServer), since then you
are not really testing much of anything beyond the fact that you make the calls
to the mock that you can already see in the code (I'm thinking here
particularly of code that basically passes calls through to other
objects). In that case you really need an integration test to make
sure that each piece works with the others, since they are clearly tightly
coupled.

The mock is really optimal for cases where you have loose coupling between
objects and only need to cover a few calls to an external class. You are
really looking to test the logic/computation of the object and only cover a
few elements of interaction with other objects via mocking.
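
A small sketch of that loosely coupled case: the object under test has its own logic (a threshold comparison) and touches its collaborator through a single call, so a one-method stub is enough. SizeSource, SplitDecider and the numbers used are hypothetical and only serve the illustration; a mocking library could generate the stub instead of the lambda used here.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical collaborator: the only outward call the object under test makes.
interface SizeSource {
    long sizeInBytes(String tabletId);
}

// Hypothetical object under test; its threshold logic is what the test targets.
class SplitDecider {
    private final SizeSource sizes;
    private final long thresholdBytes;

    SplitDecider(SizeSource sizes, long thresholdBytes) {
        this.sizes = sizes;
        this.thresholdBytes = thresholdBytes;
    }

    boolean shouldSplit(String tabletId) {
        return sizes.sizeInBytes(tabletId) > thresholdBytes;
    }
}

class SplitDeciderSketch {
    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Stub covering the single interaction, and counting how often it happens.
        SizeSource stub = tabletId -> {
            calls.incrementAndGet();
            return tabletId.equals("big") ? 2_000L : 10L;
        };
        SplitDecider decider = new SplitDecider(stub, 1_000L);
        System.out.println(decider.shouldSplit("big"));    // prints "true"
        System.out.println(decider.shouldSplit("small"));  // prints "false"
        System.out.println(calls.get());                   // prints "2"
    }
}
```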

> 4. My principal concern about 3 above is divergence with the code (less so
> than our current setup but with emulating thrift interfaces we could get

This is already the problem with MockAccumulo (and everything under
client.impl.mock), so we want to avoid redoing that - divergent behavior
is only bad :)

In short, mocking is really powerful and great for doing isolated testing,
but in the end is only as good as the people who use it. It is very easy to
go overboard with the mocking and basically mock out an entire, tightly
coupled interface that has no reflection on reality. This leads to people
thinking their code works because it matches the mock responses, which
actually look nothing like what reality would serve up to their code (one
case where I have been bitten by this in the past is mocking out what I
_thought_ ZooKeeper would return).

Jesse Yates