Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Sqoop, mail # dev - Re: Review Request 10964: SQOOP-927 Sqoop2: Integration: Mapreduce specific tests should be running on MiniCluster


Copy link to this message
-
Re: Review Request 10964: SQOOP-927 Sqoop2: Integration: Mapreduce specific tests should be running on MiniCluster
Jarek Cecho 2013-07-11, 20:49

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10964/#review23015
-----------------------------------------------------------
Hi Abe,
thank you very much for incorporating all my suggestions! I do have couple more polishing suggestions:
test/src/main/java/org/apache/sqoop/test/hadoop/HadoopLocalRunner.java
<https://reviews.apache.org/r/10964/#comment46781>

    Nit: Local cluster != "jobs locally"
    
    When hadoop is running jobs locally, they are run with LocalJobRunner without any cluster. This is very important distinction because the code path is totally different. E.g. in local mode there is totally different MR implementation then in the cluster mode.
    
    Suggested rewording:
    
    HadoopRunner implementation that is using LocalJobRunner for executing mapreduce jobs and local filesystem instead of HDFS.

test/src/main/java/org/apache/sqoop/test/hadoop/HadoopLocalRunner.java
<https://reviews.apache.org/r/10964/#comment46797>

    It seems to me that here is a type and the getDataDir() should be replaced with getTemporaryDirectory()?

test/src/main/java/org/apache/sqoop/test/hadoop/HadoopRunner.java
<https://reviews.apache.org/r/10964/#comment46783>

    Proposed javadoc comment:
    
    Hadoop cluster runner for testing purpose.
    
    Runner provides methods needed to bootstrapping and using Hadoop cluster. This abstract implementation is agnostic about in what mode the Hadoop is running. Each mode will have it's own concrete implementation (for example LocalJobRunner, MiniCluster or Real existing cluster).

test/src/main/java/org/apache/sqoop/test/hadoop/HadoopRunner.java
<https://reviews.apache.org/r/10964/#comment46798>

    Suggested rewording:
    
    Temporary path that can be used as a root for other directories storing various data like logs or stored HDFS files.

test/src/main/java/org/apache/sqoop/test/hadoop/HadoopRunner.java
<https://reviews.apache.org/r/10964/#comment46784>

    Nit: s/mini//

test/src/main/java/org/apache/sqoop/test/hadoop/HadoopRunner.java
<https://reviews.apache.org/r/10964/#comment46785>

    Nit: s/mini//

test/src/main/java/org/apache/sqoop/test/hadoop/HadoopRunner.java
<https://reviews.apache.org/r/10964/#comment46794>

    Suggested rewording:
    
    Return working directory on HDFS instance that this HadoopRunner is using.
    
    This directory might be on local filesystem in case of local mode.

test/src/main/java/org/apache/sqoop/test/hadoop/HadoopRunner.java
<https://reviews.apache.org/r/10964/#comment46799>

    Shouldn't this be also based on the temporaryPath as the getDataDir() method?

test/src/main/java/org/apache/sqoop/test/hadoop/HadoopRunner.java
<https://reviews.apache.org/r/10964/#comment46793>

    Suggested rewording:
    
    Return directory on local filesystem where logs and other data generated by the Hadoop Cluster should be stored.

test/src/main/java/org/apache/sqoop/test/hadoop/HadoopRunner.java
<https://reviews.apache.org/r/10964/#comment46796>

    Nit: It seems that on most places we are using the HdfsUtils.joinPathFragments instead of File() object. I would suggest to be consistent.

test/src/main/java/org/apache/sqoop/test/testcases/TomcatTestCase.java
<https://reviews.apache.org/r/10964/#comment46789>

    Please do not remove the tmpPath variable. TomcatTestCase is the root test case that everything else is inheriting and therefore should be owner of this variable. Also in addition the path should be available to all children in case that there is need to store additional temporal information on the local file system.

test/src/main/java/org/apache/sqoop/test/testcases/TomcatTestCase.java
<https://reviews.apache.org/r/10964/#comment46790>

    I believe that the timestamp is not necessary as the temporary path on local file system is guaranteed to be unique.

test/src/main/java/org/apache/sqoop/test/testcases/TomcatTestCase.java
<https://reviews.apache.org/r/10964/#comment46792>

    The temporary path needs to be exposed for downstream test cases that will need to store additional temporary files for given test.

test/src/main/java/org/apache/sqoop/test/testcases/TomcatTestCase.java
<https://reviews.apache.org/r/10964/#comment46791>

    Nit: Changing method order do not seem to be necessary here.
Jarcec

- Jarek Cecho
On July 10, 2013, 11:27 p.m., Abraham Elmahrek wrote: