Hi Mikhail!
---- Regarding the need for a cluster: IMO it's imperative to run in some
kind of cluster. The reasons are:
1) You want your test to transfer a file into an RDBMS, and you want it to
test that the distributed cluster is actually breaking the work up.
2) We also want to confirm that the cluster is able to distribute
exporting (copying from the distributed file system into an RDBMS) as well.
TL;DR: Definitely validate your Sqoop smokes in a distributed cluster. It
can be a couple of VMs ... nothing fancy, but more than one machine. For
other tests (like wordcount, for example) I guess testing on a single node
is probably a little more acceptable.
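
One rough way to check point 1 is to count the map output files a Sqoop import leaves in its target directory. This is only a sketch: the function names are made up for illustration, and it assumes the usual part-m-NNNNN naming of map-task output files.

```shell
#!/bin/sh
# Count Sqoop map-task output files (part-m-NNNNN) in a directory listing
# read from stdin, e.g. the output of `hadoop fs -ls <target-dir>`.
count_map_outputs() {
    grep -c 'part-m-[0-9][0-9]*$'
}

# Succeeds only if the listing on stdin shows at least $1 map output files.
# More than one part file means the import was split across several map tasks.
assert_split() {
    min_parts="$1"
    found=$(count_map_outputs)
    echo "map output files: $found (expected at least $min_parts)"
    [ "$found" -ge "$min_parts" ]
}
```

Usage would look something like `hadoop fs -ls /tmp/sqoop-smoke/out | assert_split 2`, where the path is a placeholder. Note that this only shows the job was split into several map tasks; it does not by itself prove the tasks ran on different machines.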
** If you don't have a cluster, just point me at a patch! We can test it for
you. **
--- Now regarding Sqoop tests:
By the way ... are you working on BIGTOP-1019? If so, thanks! One
possible implementation is to use HSQL in file mode and put the database
file where it is locally readable on all machines. I haven't figured out a
way to do that in Hadoop with HDFS yet, but in Gluster we typically FUSE
mount, so I have a very simple smoke test that works nicely for Sqoop on
Gluster. You might be interested to check it out: if you can find a way to
make a single file locally available to all nodes on a Hadoop cluster, it
will work (and be easy to maintain: no network connection required for the
JDBC stuff, just pure JDBC ETL). It's shell scripted here (but we could
easily port the shell commands to iTest commands in Groovy)...
- You can see that it runs HSQL in "file" mode, and stores the database as
a file in a Gluster-mounted directory.
- That directory is available to every node of my cluster.
- So that is actually a very concrete example of how, if I didn't run the
test in distributed mode, and there was a bug in my code
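
To make the file-mode idea concrete, here is a minimal sketch of how such a Sqoop import could be put together. It is not the actual smoke test script: the mount point, database name, table name, split column and target directory are all hypothetical placeholders. The sketch only prints the command so it can be reviewed before running it.

```shell
#!/bin/sh
# All values below are hypothetical placeholders; adjust them for your cluster.
# SHARED_DIR must be visible at the same local path on every node.
SHARED_DIR="${SHARED_DIR:-/mnt/glusterfs/sqoop-smoke}"
DB_NAME="${DB_NAME:-smokedb}"
TABLE="${TABLE:-SMOKE_INPUT}"
TARGET_DIR="${TARGET_DIR:-/tmp/sqoop-smoke/out}"

# HSQLDB "file" mode URL: the database is a set of files under SHARED_DIR,
# so no database server or network listener is involved.
hsql_file_url() {
    echo "jdbc:hsqldb:file:${SHARED_DIR}/${DB_NAME}"
}

# Print the sqoop import command instead of running it.
# "ID" is an assumed numeric column used to split work between mappers;
# "SA" is HSQLDB's default user.
sqoop_import_cmd() {
    echo "sqoop import" \
        "--connect $(hsql_file_url)" \
        "--driver org.hsqldb.jdbc.JDBCDriver" \
        "--username SA" \
        "--table ${TABLE}" \
        "--split-by ID" \
        "--num-mappers 2" \
        "--target-dir ${TARGET_DIR}"
}

sqoop_import_cmd
```

Asking for more than one mapper is the point here: every map task opens the JDBC URL on whichever node it lands on, so the import can only succeed if the database file really is readable at that path everywhere.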
On Sun, Jan 12, 2014 at 8:37 PM, Mikhail Antonov <[EMAIL PROTECTED]> wrote:
> Hello everyone,
> I have one question on the "proper" approach to test development. While
> working on (for example) tests for Sqoop (migrating them from MySQL to a
> mocked JDBC driver), what's the "right" approach for the iteration?
> Say I have a host machine with Fedora, where I have the build environment
> and build Bigtop. Shall I have 1 manually-created VM with minimal CentOS,
> where I copy the built rpms, install them, and run smokes (or generate the
> VM in an automated way)? Or shall I have a cluster of VMs to run a fully
> distributed cluster? Or is it more or less fine to just run smokes on the
> host machine?
> Mikhail Antonov