Re: Review Request 12936: SQOOP-777. Sqoop2: Pluggable Intermediate Data Format

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12936/#review24408
-----------------------------------------------------------
Thanks for working on this; it looks good. The ability to have an intermediate format is a good thing (I am mimicking somewhat similar targeted work for Sqoop 1 for some new changes).
common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java
<https://reviews.apache.org/r/12936/#comment48298>

    Do you think this should be writeContent (or, conversely, should the method in DataReader be changed to readRecord instead of readContent)?
- Venkat Ranganathan
On Aug. 1, 2013, 3:41 a.m., Hari Shreedharan wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12936/
> -----------------------------------------------------------
>
> (Updated Aug. 1, 2013, 3:41 a.m.)
>
>
> Review request for Sqoop.
>
>
> Bugs: SQOOP-777
>     https://issues.apache.org/jira/browse/SQOOP-777
>
>
> Repository: sqoop-sqoop2
>
>
> Description
> -------
>
> Implemented a pluggable intermediate data format that decouples the internal representation of the data from the connector and the output formats. Connectors can choose to implement and support a format that is more efficient for them. Also separated the SqoopWritable so that we can use the intermediate data format independent of (current) Hadoop.
>
> I ran a full build; all tests, including integration tests, pass. I have not added any new tests yet, but I will add unit tests for the new classes. Also, I have not tried running this on an actual cluster, so things may be broken. I'd like some initial feedback based on the current patch.
>
> I also implemented escaping of characters. There is some work remaining to support binary format, but it is mostly integration; the basic implementation is in place.
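
To make the description above concrete, here is a minimal sketch of what a pluggable intermediate format with character escaping could look like. It is not taken from the patch: the names IntermediateFormatSketch and CsvFormatSketch, the method names, and the choice of backslash escaping for commas and newlines are all assumptions made for illustration, and the actual IntermediateDataFormat and CSVIntermediateDataFormat classes may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface (not the API from the patch): a record can be set
// or read either as an object array or as its text form.
interface IntermediateFormatSketch {
    void setObjectData(Object[] fields);
    Object[] getObjectData();
    void setTextData(String text);
    String getTextData();
}

// Hypothetical CSV-style implementation. Every field is stringified, so
// type information is lost; a real format would consult the schema.
final class CsvFormatSketch implements IntermediateFormatSketch {
    private String text = "";

    public void setTextData(String text) { this.text = text; }

    public String getTextData() { return text; }

    public void setObjectData(Object[] fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(escape(String.valueOf(fields[i])));
        }
        text = sb.toString();
    }

    public Object[] getObjectData() {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\\' && i + 1 < text.length()) {
                // Escaped character: "\n" decodes to a newline, anything
                // else (comma, backslash) decodes to itself.
                char next = text.charAt(++i);
                cur.append(next == 'n' ? '\n' : next);
            } else if (c == ',') {
                // Unescaped comma ends the current field.
                out.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        out.add(cur.toString());
        return out.toArray();
    }

    // Assumed escaping rule: prefix backslash and comma with a backslash,
    // and write a newline as backslash followed by 'n'.
    private static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case ',':  sb.append("\\,");  break;
                case '\n': sb.append("\\n");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }
}

class IntermediateFormatDemo {
    public static void main(String[] args) {
        IntermediateFormatSketch format = new CsvFormatSketch();
        format.setObjectData(new Object[] {"a,b", "line1\nline2", 42});
        // The text form is what would be handed to the execution engine.
        System.out.println(format.getTextData());
        // Decoding the text form recovers three fields.
        System.out.println(format.getObjectData().length);
    }
}
```

In this sketch a connector only talks to the interface, so a different implementation (for example a binary one) could be swapped in without changing the code that moves records between extractor and loader.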
>
>
> Diffs
> -----
>
>   common/pom.xml db11b5b
>   common/src/main/java/org/apache/sqoop/etl/io/DataReader.java 3e1adc7
>   common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java d81364e
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java 8b630b2
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java e0da80f
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java 7212843
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java 96818ba
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java PRE-CREATION
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java aa1c4ff
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java a7ed6ba
>   connector/connector-sdk/pom.xml 4056e14
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/CSVIntermediateDataFormat.java PRE-CREATION
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/IntermediateDataFormat.java PRE-CREATION
>   connector/connector-sdk/src/test/java/org/apache/sqoop/connector/CSVIntermediateDataFormatTest.java PRE-CREATION
>   core/src/main/java/org/apache/sqoop/framework/JobManager.java d0a087d
>   core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java 53d0039
>   execution/mapreduce/pom.xml f9a2a0e
>   execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java 767080c
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java 7fd9a01
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java 1978ec6
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java a07c511
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java 4621942