Bigtop >> mail # dev >> proper approach to smoke tests development

Re: proper approach to smoke tests development
Hi Mikhail!

---- Regarding the need for a cluster: IMO it's imperative to run in some
kind of cluster. The reasons are:

1) You want your test to transfer a file into an RDBMS, and you want it to
test that the distributed cluster is actually breaking the work up.

2) We also want to confirm that the cluster can distribute exporting
(copying from the distributed file system into an RDBMS).
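
One rough way to check point 1 from a shell smoke test is to count the
map-task output files that a Sqoop import leaves in its target directory.
This is only a sketch: the function name is made up for illustration, and the
number of part-m files reflects how many map tasks wrote output, not
necessarily how many distinct machines ran them.

```shell
#!/bin/sh
# Hypothetical helper (not part of Bigtop or Sqoop): reads an
# `hadoop fs -ls <dir>` listing on stdin and prints how many
# map-task output files (part-m-NNNNN) it contains.
count_map_outputs() {
    # grep -c prints the count; it exits non-zero when the count is 0,
    # so `|| true` keeps the function usable under `set -e`.
    grep -c 'part-m-[0-9][0-9]*$' || true
}
```

It could be used as `hadoop fs -ls <target-dir> | count_map_outputs`, and the
result compared against the number of mappers requested with `-m`.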

TL;DR: Definitely validate your Sqoop smokes in a distributed cluster.
It can be a couple of VMs, nothing fancy, but more than one
machine. For other tests (like wordcount, for example) I guess testing on
a single node is probably a little more acceptable.

** If you don't have a cluster, just point me at a patch! We can test it for
you. **

--- Now regarding sqoop tests:

By the way, are you working on BIGTOP-1019? If so, thanks! One
possible implementation is using HSQL in file mode, and then putting the
file as a locally readable file on all machines. I haven't figured out a
way to do that in Hadoop with HDFS yet, but in Gluster we typically FUSE
mount, so I have a very simple smoke test that works nicely for Sqoop on
Gluster. You might be interested to check it out: if you can find a way to
make a single file locally available to all nodes on a Hadoop cluster, it
will work, and it will be easy to maintain (no network connection required
for the JDBC stuff, just pure JDBC ETL). It's shell scripted here (but we
could port the shell commands easily to iTest commands in Groovy)...

- You can see that it runs HSQL in "file" mode, and stores the database as
a file in a Gluster-mounted directory.

- That directory is available to every node of my cluster.

- So that is actually a very concrete example of how, if I didn't run the
test in distributed mode, and there was a bug in my code
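
For readers without access to that script, here is a minimal sketch of the
general idea, not the actual Gluster smoke test. It assumes a shared
directory mounted at the same path on every node; the default path, database
name, and function names below are all hypothetical. It only prints the Sqoop
command instead of running it, so it can be tried without a cluster.

```shell
#!/bin/sh
# Hypothetical defaults: a directory assumed to be mounted at the same
# path on every node, and an illustrative database name.
DB_DIR="${DB_DIR:-/mnt/glusterfs/sqoop-smoke}"
DB_NAME="${DB_NAME:-smokedb}"

# Build an HSQLDB file-mode JDBC URL of the form jdbc:hsqldb:file:<path>.
hsql_file_url() {
    printf 'jdbc:hsqldb:file:%s/%s\n' "$1" "$2"
}

# Print (do not execute) a Sqoop import command that reads table $1
# from the file-mode database into target directory $2 with $3 mappers.
# SA is HSQLDB's default user name.
sqoop_import_cmd() {
    url=$(hsql_file_url "$DB_DIR" "$DB_NAME")
    printf 'sqoop import --connect %s --driver org.hsqldb.jdbc.JDBCDriver --username SA --table %s --target-dir %s -m %s\n' \
        "$url" "$1" "$2" "$3"
}
```

Printing the command first makes it easy to eyeball the JDBC URL before
letting it loose on real nodes; the same strings could later be handed to an
iTest shell wrapper.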

On Sun, Jan 12, 2014 at 8:37 PM, Mikhail Antonov <[EMAIL PROTECTED]> wrote:

> Hello everyone,
> I have one question on the "proper" approach to test development. While
> working on (for example) tests for Sqoop (migrating them from MySQL to a
> mocked JDBC driver), what's the "right" approach for the iteration?
> Say I have a host machine with Fedora, where I have the build environment
> and build Bigtop. Shall I have one manually created VM with minimal CentOS,
> where I copy the built RPMs, install them, and run smokes (or generate the
> VM in an automated way)? Or shall I have a cluster of VMs to run a fully
> distributed cluster? Or is it more or less fine to just run smokes on the
> host machine?
> --
> Thanks,
> Mikhail Antonov

Jay Vyas