Sqoop >> mail # user >> Error while importing data from oracle database when using split-by timestamp column


Re: Error while importing data from oracle database when using split-by timestamp column
Sqoop adds a WHERE clause to the SELECT query to set the lower and upper
bound of each split. It seems the format of the date value in that WHERE
clause does not match the format used in the table. You should be able to
see the queries fired by Sqoop in your Oracle logs. Can you check whether
the date format matches?
If it does not, this may be a limitation of Sqoop. You can file a JIRA for
this issue and, in the meantime, use some other column for splitting the data.
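
To illustrate what I mean, here is a rough Python sketch of how a range-based split on a timestamp column can end up as per-mapper WHERE clauses. This is not Sqoop's actual code: the function names, the bounds and the string format of the literals are all assumptions made for illustration. The point is only that the bounds are written into the query as plain strings, so Oracle has to convert them back to TIMESTAMP, and that conversion depends on the session's timestamp format.

```python
from datetime import datetime


def split_ranges(lo, hi, num_splits):
    """Cut the closed range [lo, hi] into num_splits contiguous pieces."""
    step = (hi - lo) / num_splits
    points = [lo + step * i for i in range(num_splits)] + [hi]
    return list(zip(points[:-1], points[1:]))


def where_clauses(column, lo, hi, num_splits, fmt="%Y-%m-%d %H:%M:%S"):
    """Build one WHERE condition per split.

    fmt is an assumed literal format; if the database expects a
    different one, the string-to-timestamp conversion fails.
    """
    ranges = split_ranges(lo, hi, num_splits)
    clauses = []
    for i, (start, end) in enumerate(ranges):
        # Only the last split includes its upper bound, so no row is
        # read twice and the maximum value is not dropped.
        upper_op = "<=" if i == len(ranges) - 1 else "<"
        clauses.append(
            f"{column} >= '{start.strftime(fmt)}' "
            f"AND {column} {upper_op} '{end.strftime(fmt)}'"
        )
    return clauses


if __name__ == "__main__":
    # Hypothetical bounds for one daily partition, two mappers (-m 2).
    lo = datetime(2012, 8, 26)
    hi = datetime(2012, 8, 27)
    for clause in where_clauses("HYDRO_DATETIME", lo, hi, 2):
        print(clause)
```

If the literals in the queries in your Oracle logs look like the strings this prints but the session expects another timestamp format, that would explain the error.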

Let me know if this doesn't work out.

Thanks,
Abhijeet

On 9/11/12, Sadasiva Guntupalli <[EMAIL PROTECTED]> wrote:
>  Hi Jarcec,
>
> Thank you. There is no primary key on MESSAGE_ID column. The following is
> the table structure. This table has approximately 100 million rows in each
> partition. I am trying to import one partition at a time.
> Please find the sqoop log below. There is an index defined on the column
> HYDRO_DATETIME.
> I am using split-by HYDRO_DATETIME to balance the load on all the 6 nodes.
>
>  RMS_DXC_HYDRO_CONTENT
>  ------------------------------------------------
>   MESSAGE_ID                     NUMBER                  NOT NULL
>   HYDRO_DATETIME        TIMESTAMP(3)            NOT NULL
>   PSI_10_VAL            FLOAT(126)
>
> The following is the sqoop log after the --verbose option is enabled.
>
> [mapr@lxhadoop6 rms]$ sqoop import --connect
> jdbc:oracle:thin:@//rmslt-scan:1521/RACRMSLT
> --query  'SELECT R.MESSAGE_ID, R.HYDRO_DATETIME, R.PSI_10_VAL FROM
> RMS_DXC_HYDRO_CONTENT PARTITION(DXC_HYDRO_CONTENT_P20120826) R WHERE
> $CONDITIONS'  --split-by HYDRO_DATETIME  --username RMS -P  --target-dir
> /user/hive/HYDRO_CONTENT1 --hive-table RMS_DXC_HYDRO_CONTENT --hive-import
> --hive-partition-key  HYDRO_PART_DATE  --hive-partition-value "2012-08-26"
> -m 2 --verbose
> Enter password:
> 12/09/11 09:52:26 INFO tool.BaseSqoopTool: Using Hive-specific delimiters
> for output. You can override
> 12/09/11 09:52:26 INFO tool.BaseSqoopTool: delimiters with
> --fields-terminated-by, etc.
> 12/09/11 09:52:26 INFO manager.SqlManager: Using default fetchSize of 1000
> 12/09/11 09:52:26 INFO tool.CodeGenTool: Beginning code generation
> 12/09/11 09:52:28 INFO manager.OracleManager: Time zone has been set to GMT
> 12/09/11 09:52:28 INFO manager.SqlManager: Executing SQL statement: SELECT
> R.MESSAGE_ID, R.HYDRO_DATETIME, R.PSI_10_VAL FROM RMS_DXC_HYDRO_CONTENT
> PARTITION(DXC_HYDRO_CONTENT_P20120826) R WHERE  (1 = 0)
> 12/09/11 09:52:28 INFO manager.SqlManager: Executing SQL statement: SELECT
> R.MESSAGE_ID, R.HYDRO_DATETIME, R.PSI_10_VAL FROM RMS_DXC_HYDRO_CONTENT
> PARTITION(DXC_HYDRO_CONTENT_P20120826) R WHERE  (1 = 0)
> 12/09/11 09:52:28 INFO orm.CompilationManager: HADOOP_HOME is
> /opt/mapr/hadoop/hadoop-0.20.2/bin/..
> Note:
> /tmp/sqoop-mapr/compile/3276fada0f9443f93a4b3b64ee4fb126/QueryResult.java
> uses or overrides a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 12/09/11 09:52:29 ERROR orm.CompilationManager: Could not rename
> /tmp/sqoop-mapr/compile/3276fada0f9443f93a4b3b64ee4fb126/QueryResult.java
> to /home/mapr/rms/./QueryResult.java
> java.io.IOException: Destination '/home/mapr/rms/./QueryResult.java'
> already exists
>         at org.apache.commons.io.FileUtils.moveFile(FileUtils.java:1811)
>         at
> org.apache.sqoop.orm.CompilationManager.compile(CompilationManager.java:227)
>         at
> org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:83)
>         at
> org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:367)
>         at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:453)
>         at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
>         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
>         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
>         at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
>         at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)
> 12/09/11 09:52:29 INFO orm.CompilationManager: Writing jar file: