

Re: Review Request: Fix for SQOOP-932


> On March 20, 2013, 9:16 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/manager/DirectNetezzaManager.java, lines 78-83
> > <https://reviews.apache.org/r/10018/diff/2/?file=271931#file271931line78>
> >
> >     My Netezza knowledge is a bit rusty these days, but I have a feeling that the external table parameter "NULLVALUE" is used only for string-based columns (varchar, ...). For all other column types (int, float, ...), an empty string is used to encode a NULL value. Provided that this is still the case, shouldn't the condition be more like if (nullNonStrValue != null) { error; }?
>
> Venkat Ranganathan wrote:
>     Thanks Jarcec
>    
>     The Netezza external table feature supports NULLVALUE for non-string columns also.  The unit test exercises it with an Integer column, as you can see.
>    
>     Thanks
>     Venkat
>
> Jarek Cecho wrote:
>     Hi Venkat,
>     thank you for your feedback. I was actually looking at that particular test before, but I came to the conclusion that it's passing simply because "\N" is an invalid value for an integer-based column. To verify my theory, I've added another test to DirectNetezzaExportManualTest that uses the string "1" as the null value (i.e. a value that is valid for an integer-based column):
>    
>       @Test
>       public void testNullStringExport2() throws Exception {
>    
>         String [] extraArgs = {
>             "--input-null-string", "1",
>             "--input-null-non-string", "1",
>             "--input-escaped-by", "\\",
>         };
>         ColumnGenerator[] extraCols = new ColumnGenerator[] {
>            new ColumnGenerator() {
>              @Override
>              public String getExportText(int rowNum) {
>                return "1";
>              }
>    
>              @Override
>              public String getVerifyText(int rowNum) {
>                return null;
>              }
>    
>              @Override
>              public String getType() {
>                return "INTEGER";
>              }
>            },
>         };
>    
>         String[] argv = getArgv(true, 10, 10, extraArgs);
>         runNetezzaTest(getTableName(), argv, extraCols);
>       }
>    
>     And this particular test is failing for me:
>    
>     Testcase: testNullStringExport2 took 2.528 sec
>     FAILED
>     Got unexpected column value expected:<null> but was:<1>
>     junit.framework.ComparisonFailure: Got unexpected column value expected:<null> but was:<1>
>     at com.cloudera.sqoop.TestExport.assertColValForRowId(TestExport.java:380)
>     at com.cloudera.sqoop.TestExport.assertColMinAndMax(TestExport.java:398)
>     at com.cloudera.sqoop.manager.DirectNetezzaExportManualTest.runNetezzaTest(DirectNetezzaExportManualTest.java:131)
>     at com.cloudera.sqoop.manager.DirectNetezzaExportManualTest.testNullStringExport2(DirectNetezzaExportManualTest.java:205)
>    
>    
>     I've also tried a similar test for direct import by adding the following test to NetezzaImportManualTest:
>    
>       @Test
>       public void testDirectNullStringValue() throws Exception {
>    
>    
>          String [] extraArgs = {
>              "--null-string", "\\\\N",
>              "--null-non-string", "\\\\N",
>           };
>    
>          String[] expectedResultsWithNulls =
>            getExpectedResultsWithNulls();
>          String tableNameWithNull = getTableName() + "_W_N";
>    
>          runNetezzaTest(true, tableNameWithNull, expectedResultsWithNulls,
>             extraArgs);
>       }
>    
>     The generated output suggests that the substitution character is not being used for the integer column:
>    
>     22218 [main] INFO com.cloudera.sqoop.manager.NetezzaImportManualTest  - Line read from file = 1,Aaron,2009-05-14,1000000,T,engineering,,1
>     22218 [main] INFO com.cloudera.sqoop.manager.NetezzaImportManualTest  - Line read from file = 3,Fred,2009-01-23,15,F,marketing,,3
>     22218 [main] INFO com.cloudera.sqoop.manager.NetezzaImportManualTest  - Line read from file = 2,Bob,2009-04-20,400,T,sales,,2

Thanks Jarcec for the examples

On the export - I think it may be a bug in NZ load that some NULL representations that are also valid column values (like 1 in your case) are passed through as is.  If \N were not treated as NULL, the records would be treated as bad and written to bad records, right?
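
To make the export question concrete, here is a minimal sketch in plain Java. It is not Sqoop or Netezza code: the class, enum, and method names are hypothetical, and the two decoding orders are assumptions used only to contrast the possible behaviours. It shows why a NULL token that is also a valid integer ("1") is ambiguous, while "\N" gives the same answer under either order.

```java
public class NullTokenSketch {
    /** Outcome of decoding one text field destined for an INTEGER column. */
    public enum Kind { NULL, VALUE, BAD_RECORD }

    // Hypothetical loader A: compares against the NULL token before parsing.
    public static Kind decodeTokenFirst(String field, String nullToken) {
        if (field.equals(nullToken)) {
            return Kind.NULL;
        }
        return parses(field) ? Kind.VALUE : Kind.BAD_RECORD;
    }

    // Hypothetical loader B: parses first and consults the NULL token
    // only when the field is not a valid integer.
    public static Kind decodeParseFirst(String field, String nullToken) {
        if (parses(field)) {
            return Kind.VALUE;
        }
        return field.equals(nullToken) ? Kind.NULL : Kind.BAD_RECORD;
    }

    // True when the text is a valid int literal.
    private static boolean parses(String s) {
        try {
            Integer.parseInt(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Token "1" is also a valid integer, so the two orders disagree.
        System.out.println(decodeTokenFirst("1", "1"));     // NULL
        System.out.println(decodeParseFirst("1", "1"));     // VALUE
        // Token \N is never a valid integer, so both orders agree.
        System.out.println(decodeTokenFirst("\\N", "\\N")); // NULL
        System.out.println(decodeParseFirst("\\N", "\\N")); // NULL
    }
}
```

Under loader B, a test that uses \N passes and a test that uses 1 fails, which matches what testNullStringExport2 reported; whether NZ load really works that way is not established in this thread.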

On the import, I saw the issue too, but it occurred with columns of any type.  I think it might also be tied to the JDBC driver version and the NZ version.  For example, when I was researching the Netezza forums, I saw reports that the NULL string had to be 0-4 characters long, ASCII only, with no special characters, but the later documentation changed this to allow UTF-8 characters.
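
If the older restriction from those forum reports does apply to a given NZ version, a pre-flight check could look roughly like the sketch below. Everything here is an assumption for illustration: the class and method names are hypothetical, the 4-character limit and the ASCII-only rule come from the forum reports rather than from official documentation, and the "no special chars" part is left unchecked because the reports do not define it.

```java
public class NullValueCheck {
    // Assumed limit taken from forum reports, not from official documentation.
    static final int MAX_LEN = 4;

    // Returns true when the candidate NULL string is 0-4 chars and pure ASCII.
    public static boolean isAcceptableNullValue(String s) {
        if (s == null || s.length() > MAX_LEN) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0x7F) {
                return false; // non-ASCII character
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAcceptableNullValue("\\N"));    // true: two ASCII chars
        System.out.println(isAcceptableNullValue(""));       // true: empty string
        System.out.println(isAcceptableNullValue("<null>")); // false: six chars
        System.out.println(isAcceptableNullValue("\u00e9")); // false: non-ASCII
    }
}
```

On a version that accepts UTF-8 NULL strings, the ASCII loop would simply be dropped.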

I agree it is not consistent.
Thanks

Venkat
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10018/#review18174
On March 21, 2013, 12:08 a.m., Venkat Ranganathan wrote: