RE: [jira] [Commented] (SQOOP-976) Incorrect SQL when incremental criteria is text column
Jarek,

Yes, there are problems with the text splitter.  I am facing a database with EBCDIC characters and character-based keys, and I would need a splitter that uses different code points to make it work.  However, if we are going to restrict its use, we ought to do it up front: do not start the map/reduce job, and put out a meaningful message instead.  We should not generate bad SQL and then die ignominiously.
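
To make the "up front" idea concrete, here is a minimal sketch of a fail-fast check, not Sqoop's actual code. The class and method names (`IncrementalColumnCheck`, `validateCheckColumn`) are hypothetical, and it assumes the JDBC type of the check column is already known from the table metadata before any job is submitted.

```java
import java.sql.Types;

public class IncrementalColumnCheck {

    // True for the JDBC character types that would end up being
    // compared as strings in the generated WHERE clause.
    static boolean isTextType(int sqlType) {
        switch (sqlType) {
            case Types.CHAR:
            case Types.VARCHAR:
            case Types.LONGVARCHAR:
            case Types.NCHAR:
            case Types.NVARCHAR:
            case Types.LONGNVARCHAR:
                return true;
            default:
                return false;
        }
    }

    // Hypothetical pre-flight validation: reject a text check column with a
    // clear message instead of generating SQL and failing later in the job.
    static void validateCheckColumn(String columnName, int sqlType) {
        if (isTextType(sqlType)) {
            throw new IllegalArgumentException(
                "Column '" + columnName + "' is a character type and cannot "
                + "be used as the incremental check column.");
        }
    }

    public static void main(String[] args) {
        validateCheckColumn("ID", Types.INTEGER);      // accepted
        try {
            validateCheckColumn("CUST_KEY", Types.VARCHAR);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());        // rejected before any job starts
        }
    }
}
```

The point of the sketch is only the ordering: the check runs on metadata, before anything is launched, so the user sees the message rather than a failed job.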

Waldyn

-----Original Message-----
From: Jarek Jarcec Cecho (JIRA) [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, April 02, 2013 1:07 PM
To: [EMAIL PROTECTED]
Subject: [jira] [Commented] (SQOOP-976) Incorrect SQL when incremental criteria is text column
    [ https://issues.apache.org/jira/browse/SQOOP-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620081#comment-13620081 ]

Jarek Jarcec Cecho commented on SQOOP-976:
------------------------------------------

Hi [~waldyn],
thank you very much for your feedback. I do understand the simplicity of the fix. My concern is that I'm not entirely convinced that we should allow this in the first place.

As you've mentioned, using a scheme like {{CUST1}} ... {{CUST8}} will fail as soon as someone accidentally inserts {{CUST11}}. The database itself won't prevent anyone from doing so, and because {{CUST11}} sorts before {{CUST8}} in a string comparison, Sqoop will never see this row as newly inserted, which in the end might lead to data corruption. Also, comparing strings is not efficient from the database perspective, and it's actually quite dangerous due to the various encodings that can be applied (and especially when they are changed). We've experienced a lot of trouble with {{TextSplitter}} in the past, so I'm afraid that by allowing string-based columns for incremental updates, we're just opening another Pandora's box.
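
The {{CUST11}} case can be illustrated with a small sketch. It uses Java's `String.compareTo` as a stand-in for the database's string comparison, which is an assumption: a real database applies its own collation and encoding. The class and method names are made up for this example.

```java
public class LexicographicIncrementDemo {

    // Stand-in for a predicate of the form: check_col > 'lastValue'.
    // Java compares strings character by character, so "CUST11" is
    // decided at the fifth character: '1' is less than '8'.
    static boolean seenAsNew(String lastValue, String candidate) {
        return candidate.compareTo(lastValue) > 0;
    }

    public static void main(String[] args) {
        String lastValue = "CUST8"; // highest key recorded by the previous import
        String[] inserted = {"CUST9", "CUST11"};
        for (String key : inserted) {
            System.out.println(key + " imported as new: " + seenAsNew(lastValue, key));
        }
        // prints:
        // CUST9 imported as new: true
        // CUST11 imported as new: false
    }
}
```

Under this ordering the {{CUST11}} row falls below the last recorded value, so an incremental import keyed on that column skips it silently.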

Jarcec
                
> Incorrect SQL when incremental criteria is text column
> ------------------------------------------------------
>
>                 Key: SQOOP-976
>                 URL: https://issues.apache.org/jira/browse/SQOOP-976
>             Project: Sqoop
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 1.4.3
>         Environment: incremental import on table using text column
>            Reporter: Waldyn Benbenek
>             Fix For: 1.4.4
>
>