Sqoop >> mail # user >> import from Oracle to Hive : 2 errors


Jérôme Verdier 2013-06-17, 13:46
Jarek Jarcec Cecho 2013-06-17, 14:58
Jérôme Verdier 2013-06-17, 15:31
Jarek Jarcec Cecho 2013-06-17, 16:54
Jérôme Verdier 2013-06-17, 15:59
Jarek Jarcec Cecho 2013-06-17, 16:56
Re: import from Oracle to Hive : 2 errors
Hi Jarcec,

Thanks for your explanations; they helped me understand how Sqoop works.

I'm trying to import 1000 rows from a fairly big Oracle table, which is
partitioned to keep query times reasonable.

I am using this Sqoop command, with a query that selects only the first
1000 rows:

sqoop import --connect jdbc:oracle:thin:@xx.xx.xx.xx:1521/D_BI \
  --username xx --password xx \
  --create-hive-table \
  --query 'SELECT * FROM DT_PILOTAGE.DEMARQUE_MAG_JOUR WHERE ROWNUM<1000 AND $CONDITIONS' \
  --target-dir /home/hduser \
  --split-by DEMARQUE_MAG_JOUR.CO_SOCIETE \
  --hive-table default.DEMARQUE_MAG_JOUR

The M/R job runs fine, but as the output below shows, the data is not
moved into Hive.

Warning: /usr/lib/hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
13/06/18 12:05:21 WARN tool.BaseSqoopTool: Setting your password on the
command-line is insecure. Consider using -P instead.
13/06/18 12:05:21 WARN tool.BaseSqoopTool: It seems that you've specified
at least one of following:
13/06/18 12:05:21 WARN tool.BaseSqoopTool:      --hive-home
13/06/18 12:05:21 WARN tool.BaseSqoopTool:      --hive-overwrite
13/06/18 12:05:21 WARN tool.BaseSqoopTool:      --create-hive-table
13/06/18 12:05:21 WARN tool.BaseSqoopTool:      --hive-table
13/06/18 12:05:21 WARN tool.BaseSqoopTool:      --hive-partition-key
13/06/18 12:05:21 WARN tool.BaseSqoopTool:      --hive-partition-value
13/06/18 12:05:21 WARN tool.BaseSqoopTool:      --map-column-hive
13/06/18 12:05:21 WARN tool.BaseSqoopTool: Without specifying parameter
--hive-import. Please note that
13/06/18 12:05:21 WARN tool.BaseSqoopTool: those arguments will not be used
in this session. Either
13/06/18 12:05:21 WARN tool.BaseSqoopTool: specify --hive-import to apply
them correctly or remove them
13/06/18 12:05:21 WARN tool.BaseSqoopTool: from command line to remove this
warning.
13/06/18 12:05:21 INFO manager.SqlManager: Using default fetchSize of 1000
13/06/18 12:05:21 INFO tool.CodeGenTool: Beginning code generation
13/06/18 12:05:40 INFO manager.OracleManager: Time zone has been set to GMT
13/06/18 12:05:40 INFO manager.SqlManager: Executing SQL statement: SELECT
* FROM DT_PILOTAGE.DEMARQUE_MAG_JOUR WHERE ROWNUM<1000 AND  (1 = 0)
13/06/18 12:05:40 INFO manager.SqlManager: Executing SQL statement: SELECT
* FROM DT_PILOTAGE.DEMARQUE_MAG_JOUR WHERE ROWNUM<1000 AND  (1 = 0)
13/06/18 12:05:40 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is
/usr/local/hadoop
Note:
/tmp/sqoop-hduser/compile/b2b0decece541a7abda95580d7b1f0d2/QueryResult.java
uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
13/06/18 12:05:41 INFO orm.CompilationManager: Writing jar file:
/tmp/sqoop-hduser/compile/b2b0decece541a7abda95580d7b1f0d2/QueryResult.jar
13/06/18 12:05:41 INFO mapreduce.ImportJobBase: Beginning query import.
13/06/18 12:05:42 INFO db.DataDrivenDBInputFormat: BoundingValsQuery:
SELECT MIN(t1.CO_SOCIETE), MAX(t1.CO_SOCIETE) FROM (SELECT * FROM
DT_PILOTAGE.DEMARQUE_MAG_JOUR WHERE ROWNUM<1000 AND  (1 = 1) ) t1
13/06/18 12:05:42 WARN db.BigDecimalSplitter: Set BigDecimal splitSize to
MIN_INCREMENT
13/06/18 12:05:42 INFO mapred.JobClient: Running job: job_201306180922_0005
13/06/18 12:05:43 INFO mapred.JobClient:  map 0% reduce 0%
13/06/18 12:05:50 INFO mapred.JobClient:  map 100% reduce 0%
13/06/18 12:05:51 INFO mapred.JobClient: Job complete: job_201306180922_0005
13/06/18 12:05:51 INFO mapred.JobClient: Counters: 18
13/06/18 12:05:51 INFO mapred.JobClient:   Job Counters
13/06/18 12:05:51 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=6570
13/06/18 12:05:51 INFO mapred.JobClient:     Total time spent by all
reduces waiting after reserving slots (ms)=0
13/06/18 12:05:51 INFO mapred.JobClient:     Total time spent by all maps
waiting after reserving slots (ms)=0
13/06/18 12:05:51 INFO mapred.JobClient:     Launched map tasks=1
13/06/18 12:05:51 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
13/06/18 12:05:51 INFO mapred.JobClient:   File Output Format Counters
13/06/18 12:05:52 INFO mapred.JobClient:     Bytes Written=174729
13/06/18 12:05:52 INFO mapred.JobClient:   FileSystemCounters
13/06/18 12:05:52 INFO mapred.JobClient:     HDFS_BYTES_READ=147
13/06/18 12:05:52 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=61242
13/06/18 12:05:52 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=174729
13/06/18 12:05:52 INFO mapred.JobClient:   File Input Format Counters
13/06/18 12:05:52 INFO mapred.JobClient:     Bytes Read=0
13/06/18 12:05:52 INFO mapred.JobClient:   Map-Reduce Framework
13/06/18 12:05:52 INFO mapred.JobClient:     Map input records=999
13/06/18 12:05:52 INFO mapred.JobClient:     Physical memory (bytes)
snapshot=43872256
13/06/18 12:05:52 INFO mapred.JobClient:     Spilled Records=0
13/06/18 12:05:52 INFO mapred.JobClient:     CPU time spent (ms)=830
13/06/18 12:05:52 INFO mapred.JobClient:     Total committed heap usage
(bytes)=16252928
13/06/18 12:05:52 INFO mapred.JobClient:     Virtual memory (bytes)
snapshot=373719040
13/06/18 12:05:52 INFO mapred.JobClient:     Map output records=999
13/06/18 12:05:52 INFO mapred.JobClient:     SPLIT_RAW_BYTES=147
13/06/18 12:05:52 INFO mapreduce.ImportJobBase: Transferred 170,6338 KB in
10,6221 seconds (16,0641 KB/sec)
13/06/18 12:05:52 INFO mapreduce.ImportJobBase: Retrieved 999 records.

Why doesn't Sqoop move the data into Hive? Is there a problem with the
partitioned table?
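For what it's worth, the BaseSqoopTool warnings at the top of the log say that --create-hive-table and --hive-table are ignored unless --hive-import is also given, which would explain why the data lands in HDFS but never reaches Hive. A sketch of the same command with that one flag added (host, credentials, and paths are the placeholders from the original, not verified values):

```shell
# Same import as above, plus --hive-import so the Hive-related
# flags actually take effect and the HDFS data is loaded into Hive.
sqoop import --connect jdbc:oracle:thin:@xx.xx.xx.xx:1521/D_BI \
  --username xx --password xx \
  --hive-import \
  --create-hive-table \
  --query 'SELECT * FROM DT_PILOTAGE.DEMARQUE_MAG_JOUR WHERE ROWNUM<1000 AND $CONDITIONS' \
  --target-dir /home/hduser \
  --split-by DEMARQUE_MAG_JOUR.CO_SOCIETE \
  --hive-table default.DEMARQUE_MAG_JOUR
```

With --hive-import, Sqoop runs the same MapReduce import and then issues a LOAD DATA into the Hive table as a second step, so the record counts in the log should be unchanged.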

Thanks.
2013/6/17 Jarek Jarcec Cecho <[EMAIL PROTECTED]>
*Jérôme VERDIER*
06.72.19.17.31
[EMAIL PROTECTED]
Venkat 2013-06-18, 13:38
Jérôme Verdier 2013-06-18, 14:02
Jérôme Verdier 2013-06-18, 14:14