Sqoop, mail # dev - Review Request: Fix for SQOOP-937


Re: Review Request: Fix for SQOOP-937
Venkat Ranganathan 2013-03-19, 21:25

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9909/
-----------------------------------------------------------

(Updated March 19, 2013, 9:25 p.m.)

Review request for Sqoop and Jarek Cecho.

Changes
-------

Removed the generation of the ORM files completely for the cases mentioned.  Right now DirectNetezzaManager is the only manager exercising this, as a sample.  Once this goes in, we can update all the similar managers to avoid generating unnecessary ORM files.  The earlier approach of generating a dummy file had its merits in some cases, but this one eliminates the generation completely.

Ran all tests and the Netezza direct mode tests to validate the functionality, and added unit tests for the new functionality.

Description
-------

Sqoop generates an ORM file that represents a record in a table for a
given connector implementation.  The generated class has methods to
read field values off a ResultSet, set bind values in PreparedStatements,
handle LOB objects in a DB-specific way (that the connector represents),
parse input fields, etc.
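
To make the shape of such a class concrete, here is a minimal hand-written sketch of a record class for an assumed two-column table (id INTEGER, name VARCHAR).  The class name, the method names, and the comma-delimited parse format are assumptions for illustration only; the real generated code is more involved and is not reproduced here.

```java
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical stand-in for a generated record class; not actual Sqoop output.
class EmployeeRecordSketch {
    private Integer id;
    private String name;

    Integer getId() { return id; }
    String getName() { return name; }

    // Read one row's field values off a JDBC ResultSet (columns are 1-based).
    void readFields(ResultSet rs) throws SQLException {
        int rawId = rs.getInt(1);
        this.id = rs.wasNull() ? null : rawId;   // getInt returns 0 for SQL NULL
        this.name = rs.getString(2);
    }

    // Set this record's values as bind parameters on a PreparedStatement.
    void write(PreparedStatement stmt) throws SQLException {
        if (id == null) {
            stmt.setNull(1, Types.INTEGER);
        } else {
            stmt.setInt(1, id);
        }
        stmt.setString(2, name);
    }

    // Parse one text line into fields.
    // Assumption: comma delimiter, no quoting or escaping.
    void parse(String line) {
        String[] parts = line.split(",", -1);
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected 2 fields: " + line);
        }
        this.id = parts[0].isEmpty() ? null : Integer.valueOf(parts[0]);
        this.name = parts[1];
    }
}
```

A connector that reads and writes records itself never calls any of these methods, which is why generating the class for it serves no purpose.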

This file is then compiled, archived into a jar, and used with the Sqoop job.

This ORM instance is generated in all cases, and used to make sure that
the data is read and processed in a more or less uniform way.

Unfortunately, this generated class is not used by a class of connectors which
manage the reading and processing of records themselves.   In essence, the
whole ORM class is unusable in these instances and is simply ignored.  These
are typically "direct" connectors which use a DB-specific high-speed path.

Generating the ORM class in these cases confuses users.

This patch tries to solve this by

  1)  Providing a capability for the connection managers to declare that
      they don't depend on the ORM jar file.
  2)  Generating a dummy ORM jar file with an explicit message during
      generation so that
     a)  users are aware that the class generated is a dummy one
     b)  there is a record that this jar file was generated explicitly
     c)  we don't have to change a whole lot of the codebase to disable
         the generation and loading of the jar file.

This patch also adds one test.
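
The capability in point 1 could look roughly like the sketch below: a base manager exposes a flag that defaults to "uses the ORM class", a direct manager overrides it, and the code generation step consults it.  All class and method names here are hypothetical stand-ins, not the names used in the patch, and the sketch follows the updated approach of skipping generation entirely rather than producing a dummy jar.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a connection manager base class.
abstract class SketchConnManager {
    // Default: the manager depends on the generated ORM class.
    boolean managesRecordsItself() {
        return false;
    }
}

// A regular JDBC-style manager keeps the default.
class SketchJdbcManager extends SketchConnManager { }

// A "direct" manager declares that it reads and processes records itself.
class SketchDirectManager extends SketchConnManager {
    @Override
    boolean managesRecordsItself() {
        return true;
    }
}

class SketchCodeGen {
    // Returns the steps a job setup would perform for the given manager.
    static List<String> plan(SketchConnManager manager) {
        List<String> steps = new ArrayList<>();
        if (manager.managesRecordsItself()) {
            steps.add("skip ORM generation");
        } else {
            steps.add("generate ORM source");
            steps.add("compile");
            steps.add("package jar");
        }
        steps.add("submit job");
        return steps;
    }
}
```

Keeping the default at `false` means existing managers behave as before; only managers that opt in skip the generate, compile, and package steps.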

Diffs (updated)
-------

  src/java/org/apache/sqoop/manager/ConnManager.java 1b32dc9
  src/java/org/apache/sqoop/manager/DirectNetezzaManager.java 0a1e605
  src/java/org/apache/sqoop/manager/ExportJobContext.java 5699e2f
  src/java/org/apache/sqoop/manager/ImportJobContext.java 09a7abe
  src/java/org/apache/sqoop/mapreduce/ExportJobBase.java ff84974
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java f766532
  src/java/org/apache/sqoop/mapreduce/JobBase.java 4e7723f
  src/java/org/apache/sqoop/orm/ClassWriter.java 982e444
  src/java/org/apache/sqoop/tool/CodeGenTool.java 8a4aa42
  src/test/com/cloudera/sqoop/orm/TestClassWriter.java 3b77571

Diff: https://reviews.apache.org/r/9909/diff/

Testing
-------

Added one test.  All unit tests and checkstyle tests passed with no new checkstyle issues.

Thanks,

Venkat Ranganathan