|
Ajit Kumar Shreevastava
2013-03-13, 13:43
Jarek Jarcec Cecho
2013-03-16, 22:58
Ajit Kumar Shreevastava
2013-03-19, 08:13
Jarek Jarcec Cecho
2013-03-19, 21:25
Ajit Kumar Shreevastava
2013-03-20, 07:30
Jarek Jarcec Cecho
2013-03-21, 00:48
Ajit Kumar Shreevastava
2013-03-21, 06:34
Jarek Jarcec Cecho
2013-03-21, 15:33
Venkat
2013-03-21, 18:30
Jarek Jarcec Cecho
2013-03-22, 02:54
Venkat Ranganathan
2013-03-22, 03:24
Jarek Jarcec Cecho
2013-03-22, 03:46
Ajit Kumar Shreevastava
2013-03-22, 07:37
|
-
Exporting hive table data into oracle give date format errorAjit Kumar Shreevastava 2013-03-13, 13:43
Hi All,
Can you please let me know how can I bypass this error. I am currently using Apache SQOOP version 1.4.2. [hadoop@NHCLT-PC44-2 sqoop-oper]$ sqoop export --connect jdbc:oracle:thin:@10.99.42.11:1521/clouddb --username HDFSUSER --table BTTN_BKP_TEST --export-dir /home/hadoop/user/hive/warehouse/bttn_bkp -P -m 1 --input-fields-terminated-by '\0001' --verbose --input-null-string '\\N' --input-null-non-string '\\N' Please set $HBASE_HOME to the root of your HBase installation. 13/03/13 18:20:42 DEBUG tool.BaseSqoopTool: Enabled debug logging. Enter password: 13/03/13 18:20:47 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory 13/03/13 18:20:47 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory 13/03/13 18:20:47 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:oracle:thin:@10.99.42.11 13/03/13 18:20:47 DEBUG manager.OracleManager$ConnCache: Instantiated new connection cache. 13/03/13 18:20:47 INFO manager.SqlManager: Using default fetchSize of 1000 13/03/13 18:20:47 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.OracleManager@74b23210 13/03/13 18:20:47 INFO tool.CodeGenTool: Beginning code generation 13/03/13 18:20:47 DEBUG manager.OracleManager: Using column names query: SELECT t.* FROM BTTN_BKP_TEST t WHERE 1=0 13/03/13 18:20:47 DEBUG manager.OracleManager: Creating a new connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb, using username: HDFSUSER 13/03/13 18:20:47 DEBUG manager.OracleManager: No connection paramenters specified. Using regular API for making connection. 13/03/13 18:20:47 INFO manager.OracleManager: Time zone has been set to GMT 13/03/13 18:20:47 DEBUG manager.SqlManager: Using fetchSize for next query: 1000 13/03/13 18:20:47 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM BTTN_BKP_TEST t WHERE 1=0 13/03/13 18:20:47 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER 13/03/13 18:20:47 DEBUG orm.ClassWriter: selected columns: 13/03/13 18:20:47 DEBUG orm.ClassWriter: BTTN_ID 13/03/13 18:20:47 DEBUG orm.ClassWriter: DATA_INST_ID 13/03/13 18:20:47 DEBUG orm.ClassWriter: SCR_ID 13/03/13 18:20:47 DEBUG orm.ClassWriter: BTTN_NU 13/03/13 18:20:47 DEBUG orm.ClassWriter: CAT 13/03/13 18:20:47 DEBUG orm.ClassWriter: WDTH 13/03/13 18:20:47 DEBUG orm.ClassWriter: HGHT 13/03/13 18:20:47 DEBUG orm.ClassWriter: KEY_SCAN 13/03/13 18:20:47 DEBUG orm.ClassWriter: KEY_SHFT 13/03/13 18:20:47 DEBUG orm.ClassWriter: FRGND_CPTN_COLR 13/03/13 18:20:47 DEBUG orm.ClassWriter: FRGND_CPTN_COLR_PRSD 13/03/13 18:20:47 DEBUG orm.ClassWriter: BKGD_CPTN_COLR 13/03/13 18:20:47 DEBUG orm.ClassWriter: BKGD_CPTN_COLR_PRSD 13/03/13 18:20:47 DEBUG orm.ClassWriter: BLM_FL 13/03/13 18:20:47 DEBUG orm.ClassWriter: LCLZ_FL 13/03/13 18:20:47 DEBUG orm.ClassWriter: MENU_ITEM_NU 13/03/13 18:20:47 DEBUG orm.ClassWriter: BTTN_ASGN_LVL_ID 13/03/13 18:20:47 DEBUG orm.ClassWriter: ON_ATVT 13/03/13 18:20:47 DEBUG orm.ClassWriter: ON_CLIK 13/03/13 18:20:47 DEBUG orm.ClassWriter: ENBL_FL 13/03/13 18:20:47 DEBUG orm.ClassWriter: BLM_SET_ID 13/03/13 18:20:47 DEBUG orm.ClassWriter: BTTN_ASGN_LVL_NAME 13/03/13 18:20:47 DEBUG orm.ClassWriter: MKT_ID 13/03/13 18:20:47 DEBUG orm.ClassWriter: CRTE_TS 13/03/13 18:20:47 DEBUG orm.ClassWriter: CRTE_USER_ID 13/03/13 18:20:47 DEBUG orm.ClassWriter: UPDT_TS 13/03/13 18:20:47 DEBUG orm.ClassWriter: UPDT_USER_ID 13/03/13 18:20:47 DEBUG orm.ClassWriter: DEL_TS 13/03/13 18:20:47 DEBUG orm.ClassWriter: DEL_USER_ID 13/03/13 18:20:47 DEBUG orm.ClassWriter: DLTD_FL 13/03/13 18:20:47 DEBUG orm.ClassWriter: MENU_ITEM_NA 13/03/13 18:20:47 DEBUG orm.ClassWriter: PRD_CD 13/03/13 18:20:47 DEBUG orm.ClassWriter: BLM_SET_NA 13/03/13 18:20:47 DEBUG orm.ClassWriter: SOUND_FILE_ID 13/03/13 18:20:47 DEBUG orm.ClassWriter: IS_DYNMC_BTTN 13/03/13 18:20:47 DEBUG orm.ClassWriter: FRGND_CPTN_COLR_ID 13/03/13 18:20:47 DEBUG orm.ClassWriter: FRGND_CPTN_COLR_PRSD_ID 13/03/13 18:20:47 DEBUG orm.ClassWriter: BKGD_CPTN_COLR_ID 13/03/13 18:20:47 DEBUG orm.ClassWriter: BKGD_CPTN_COLR_PRSD_ID 13/03/13 18:20:47 DEBUG orm.ClassWriter: Writing source file: /tmp/sqoop-hadoop/compile/69b6a9d2ebb99cebced808e559528531/BTTN_BKP_TEST.java 13/03/13 18:20:47 DEBUG orm.ClassWriter: Table name: BTTN_BKP_TEST 13/03/13 18:20:47 DEBUG orm.ClassWriter: Columns: BTTN_ID:2, DATA_INST_ID:2, SCR_ID:2, BTTN_NU:2, CAT:2, WDTH:2, HGHT:2, KEY_SCAN:2, KEY_SHFT:2, FRGND_CPTN_COLR:12, FRGND_CPTN_COLR_PRSD:12, BKGD_CPTN_COLR:12, BKGD_CPTN_COLR_PRSD:12, BLM_FL:2, LCLZ_FL:2, MENU_ITEM_NU:2, BTTN_ASGN_LVL_ID:2, ON_ATVT:2, ON_CLIK:2, ENBL_FL:2, BLM_SET_ID:2, BTTN_ASGN_LVL_NAME:12, MKT_ID:2, CRTE_TS:93, CRTE_USER_ID:12, UPDT_TS:93, UPDT_USER_ID:12, DEL_TS:93, DEL_USER_ID:12, DLTD_FL:2, MENU_ITEM_NA:12, PRD_CD:2, BLM_SET_NA:12, SOUND_FILE_ID:2, IS_DYNMC_BTTN:2, FRGND_CPTN_COLR_ID:2, FRGND_CPTN_COLR_PRSD_ID:2, BKGD_CPTN_COLR_ID:2, BKGD_CPTN_COLR_PRSD_ID:2, 13/03/13 18:20:47 DEBUG orm.ClassWriter: sourceFilename is BTTN_BKP_TEST.java 13/03/13 18:20:47 DEBUG orm.CompilationManager: Found existing /tmp/sqoop-hadoop/compile/69b6a9d2ebb99cebced808e559528531/ 13/03/13 18:20:47 INFO orm.CompilationManager: HADOOP_HOME is /home/hadoop/hadoop-1.0.3/libexec/.. 13/03/13 18:20:47 DEBUG orm.CompilationManager: Adding source file: /tmp/sqoop-hadoop/compile/69b6a9d2ebb99cebced808e559528531/BTTN_BKP_TEST.java 13/03/13 18:20:47 DEBUG orm.CompilationManager: Invoking javac with args: 13/03/13 18:20:47 DEBUG orm.CompilationManager: -sourcepath 13/03/13 18:20:47 DEBUG orm.CompilationManager: /tmp/sqoop-hadoop/compile/69b6a9d2ebb99cebced808e559528531/ 13/03/13 18:20:47 DEBUG orm.CompilationManager: -d 13/03/13 18:20:47 DEBUG orm.CompilationManager: /tmp/sqoop-hadoop/compile/69b6a9d2ebb99cebced808e559528531/ 13/03/13 18:20:47 DEBUG orm.CompilationManager: -
-
Re: Exporting hive table data into oracle give date format errorJarek Jarcec Cecho 2013-03-16, 22:58
[-CC [EMAIL PROTECTED]]
Hi Ajit, would you mind upgrading to Sqoop 1.4.3? We've improved the logging for this particular exception, so it should significantly help in triangulating your issue. Jarcec On Wed, Mar 13, 2013 at 01:43:11PM +0000, Ajit Kumar Shreevastava wrote: > Hi All, > > Can you please let me know how can I bypass this error. I am currently using Apache SQOOP version 1.4.2. > > > [hadoop@NHCLT-PC44-2 sqoop-oper]$ sqoop export --connect jdbc:oracle:thin:@10.99.42.11:1521/clouddb --username HDFSUSER --table BTTN_BKP_TEST --export-dir /home/hadoop/user/hive/warehouse/bttn_bkp -P -m 1 --input-fields-terminated-by '\0001' --verbose --input-null-string '\\N' --input-null-non-string '\\N' > > Please set $HBASE_HOME to the root of your HBase installation. > 13/03/13 18:20:42 DEBUG tool.BaseSqoopTool: Enabled debug logging. > Enter password: > 13/03/13 18:20:47 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory > 13/03/13 18:20:47 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory > 13/03/13 18:20:47 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:oracle:thin:@10.99.42.11 > 13/03/13 18:20:47 DEBUG manager.OracleManager$ConnCache: Instantiated new connection cache. > 13/03/13 18:20:47 INFO manager.SqlManager: Using default fetchSize of 1000 > 13/03/13 18:20:47 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.OracleManager@74b23210 > 13/03/13 18:20:47 INFO tool.CodeGenTool: Beginning code generation > 13/03/13 18:20:47 DEBUG manager.OracleManager: Using column names query: SELECT t.* FROM BTTN_BKP_TEST t WHERE 1=0 > 13/03/13 18:20:47 DEBUG manager.OracleManager: Creating a new connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb, using username: HDFSUSER > 13/03/13 18:20:47 DEBUG manager.OracleManager: No connection paramenters specified. Using regular API for making connection. > 13/03/13 18:20:47 INFO manager.OracleManager: Time zone has been set to GMT > 13/03/13 18:20:47 DEBUG manager.SqlManager: Using fetchSize for next query: 1000 > 13/03/13 18:20:47 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM BTTN_BKP_TEST t WHERE 1=0 > 13/03/13 18:20:47 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER > 13/03/13 18:20:47 DEBUG orm.ClassWriter: selected columns: > 13/03/13 18:20:47 DEBUG orm.ClassWriter: BTTN_ID > 13/03/13 18:20:47 DEBUG orm.ClassWriter: DATA_INST_ID > 13/03/13 18:20:47 DEBUG orm.ClassWriter: SCR_ID > 13/03/13 18:20:47 DEBUG orm.ClassWriter: BTTN_NU > 13/03/13 18:20:47 DEBUG orm.ClassWriter: CAT > 13/03/13 18:20:47 DEBUG orm.ClassWriter: WDTH > 13/03/13 18:20:47 DEBUG orm.ClassWriter: HGHT > 13/03/13 18:20:47 DEBUG orm.ClassWriter: KEY_SCAN > 13/03/13 18:20:47 DEBUG orm.ClassWriter: KEY_SHFT > 13/03/13 18:20:47 DEBUG orm.ClassWriter: FRGND_CPTN_COLR > 13/03/13 18:20:47 DEBUG orm.ClassWriter: FRGND_CPTN_COLR_PRSD > 13/03/13 18:20:47 DEBUG orm.ClassWriter: BKGD_CPTN_COLR > 13/03/13 18:20:47 DEBUG orm.ClassWriter: BKGD_CPTN_COLR_PRSD > 13/03/13 18:20:47 DEBUG orm.ClassWriter: BLM_FL > 13/03/13 18:20:47 DEBUG orm.ClassWriter: LCLZ_FL > 13/03/13 18:20:47 DEBUG orm.ClassWriter: MENU_ITEM_NU > 13/03/13 18:20:47 DEBUG orm.ClassWriter: BTTN_ASGN_LVL_ID > 13/03/13 18:20:47 DEBUG orm.ClassWriter: ON_ATVT > 13/03/13 18:20:47 DEBUG orm.ClassWriter: ON_CLIK > 13/03/13 18:20:47 DEBUG orm.ClassWriter: ENBL_FL > 13/03/13 18:20:47 DEBUG orm.ClassWriter: BLM_SET_ID > 13/03/13 18:20:47 DEBUG orm.ClassWriter: BTTN_ASGN_LVL_NAME > 13/03/13 18:20:47 DEBUG orm.ClassWriter: MKT_ID > 13/03/13 18:20:47 DEBUG orm.ClassWriter: CRTE_TS > 13/03/13 18:20:47 DEBUG orm.ClassWriter: CRTE_USER_ID > 13/03/13 18:20:47 DEBUG orm.ClassWriter: UPDT_TS > 13/03/13 18:20:47 DEBUG orm.ClassWriter: UPDT_USER_ID > 13/03/13 18:20:47 DEBUG orm.ClassWriter: DEL_TS > 13/03/13 18:20:47 DEBUG orm.ClassWriter: DEL_USER_ID
-
RE: Exporting hive table data into oracle give date format errorAjit Kumar Shreevastava 2013-03-19, 08:13
Hi Jercec,
Thank you for your valuable suggestions. I have applied the below suggestion and re-do all the process again with the SQOOP1.4.3 (sqoop-1.4.3.bin__hadoop-1.0.0.tar.gz) but I have face same below error again. Please suggest me. Here I have created table in hive as suggested by you. hive> create table bttn_bkp_testing like bttn_bkp_test; hive> insert OVERWRITE table bttn_bkp_testing > select * from bttn_bkp_test; I am also attaching the error file generated by task tracker for your analysis. It fails for bttn_id = 194628 I have queried both the table and records are like hive> select * from bttn_bkp_testing > where bttn_id=194628; 194628.0 577019.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-04-19 23:25:48.78 ei009724 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 194628.0 706360.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-05-21 01:01:53.629 ei103215 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 194628.0 1620395.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-08-10 04:34:00.203 ei103215 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 194628.0 1694103.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-11-08 01:09:15.136 ei103215 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 194628.0 1831767.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-12-19 23:44:44.241 e0025129 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 And hive> select * from bttn_bkp_test_new > where bttn_id=194628; 194628.0 577019.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-04-19 23:25:48.78 ei009724 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 194628.0 706360.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-05-21 01:01:53.629 ei103215 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 194628.0 1620395.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-08-10 04:34:00.203 ei103215 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 194628.0 1694103.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-11-08 01:09:15.136 ei103215 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 194628.0 1831767.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-12-19 23:44:44.241 e0025129 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 Regards, Ajit Kumar Shreevastava From: Jarek Jarcec Cecho [mailto:[EMAIL PROTECTED]] Sent: Sunday, March 17, 2013 4:29 AM To: [EMAIL PROTECTED] Subject: Re: Exporting hive table data into oracle give date format error [-CC [EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>] Hi Ajit, would you mind upgrading to Sqoop 1.4.3? We've improved the logging for this particular exception, so it should significantly help in triangulating your issue. Jarcec On Wed, Mar 13, 2013 at 01:43:11PM +0000, Ajit Kumar Shreevastava wrote:
-
Re: Exporting hive table data into oracle give date format errorJarek Jarcec Cecho 2013-03-19, 21:25
Hi Ajit,
thank you for sharing the additional data. I've noticed in your data that some of the columns are using \N to denote the NULL value, however some other columns are using string constant "null" (that do not denote NULL in Hive). This also seems to be the case for column DEL_TS. My guess is that Sqoop is trying to decode the "null" string as the timestamp and failing on the "Timestamp format must be..." exception. I would recommend to unify the null representation tokens and run Sqoop export with appropriate one. Jarcec On Tue, Mar 19, 2013 at 08:13:01AM +0000, Ajit Kumar Shreevastava wrote: > Hi Jercec, > > > > Thank you for your valuable suggestions. > > > > I have applied the below suggestion and re-do all the process again with the SQOOP1.4.3 (sqoop-1.4.3.bin__hadoop-1.0.0.tar.gz) but I have face same below error again. Please suggest me. > > > > Here I have created table in hive as suggested by you. > > > > hive> create table bttn_bkp_testing like bttn_bkp_test; > > hive> insert OVERWRITE table bttn_bkp_testing > > > select * from bttn_bkp_test; > > > > I am also attaching the error file generated by task tracker for your analysis. > > It fails for bttn_id = 194628 > > > > I have queried both the table and records are like > > > > hive> select * from bttn_bkp_testing > > > where bttn_id=194628; > > > > 194628.0 577019.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-04-19 23:25:48.78 ei009724 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 > > 194628.0 706360.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-05-21 01:01:53.629 ei103215 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 > > 194628.0 1620395.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-08-10 04:34:00.203 ei103215 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 > > 194628.0 1694103.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-11-08 01:09:15.136 ei103215 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 > > 194628.0 1831767.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-12-19 23:44:44.241 e0025129 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 > > > > And > > hive> select * from bttn_bkp_test_new > > > where bttn_id=194628; > > > > 194628.0 577019.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-04-19 23:25:48.78 ei009724 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 > > 194628.0 706360.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-05-21 01:01:53.629 ei103215 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0
-
RE: Exporting hive table data into oracle give date format errorAjit Kumar Shreevastava 2013-03-20, 07:30
Hi Jarcec,
Thanking you for your valuable input and your suggestion seems to be valid. But I have some doubt about the SQOOP behavior :--> 1. If null create the confusion then some value similar to the below mentioned value are inserted to the oracle table with null treated as string. hive> select * from bttn_bkp_testing > where bttn_id=39126; 39126.0 32436.0 3276.0 3.0 28.0 1.0 1.0 1.0 1.0 #FFFFFF #0000FF #0000FF #FFFFFF 0.0 0.0 NULL NULL 1.0 1.0 1.0 NULL null 20.0 2010-05-04 14:31:17.0 dbmigration 2013-01-18 09:11:18.37 DP_CQ4540 2010-11-29 15:45:03.976 ei009724 1.0 null NULL null NULL 0.0 61253.0 61124.0 61124.0 61253.0 39126.0 50805.0 3276.0 3.0 28.0 1.0 1.0 1.0 1.0 #FFFFFF #0000FF #0000FF #FFFFFF 0.0 0.0 NULL NULL 1.0 1.0 1.0 NULL null 20.0 2010-05-23 23:18:54.604 ei103215 2013-01-18 09:11:18.37 DP_CQ4540 2010-11-29 15:45:03.976 ei009724 1.0 null NULL null NULL 0.0 61253.0 61124.0 61124.0 61253.0 39126.0 63196.0 3276.0 3.0 28.0 1.0 1.0 1.0 1.0 #FFFFFF #0000FF #0000FF #FFFFFF 0.0 0.0 NULL NULL 1.0 1.0 1.0 NULL null 20.0 2010-11-04 18:25:23.956 ei103215 2013-01-18 09:11:18.37 DP_CQ4540 2010-11-29 15:45:03.976 ei009724 1.0 null NULL null NULL 0.0 61253.0 61124.0 61124.0 61253.0 These values are inserted into the oracle table BTTN_BKP_TEST as follows:--> . SQL> Select * from BTTN_BKP_TEST where bttn_id=39126; 39126 32436 3276 3 28 1 1 1 1 #FFFFFF #0000FF #0000FF #FFFFFF 0 0 1 1 1 null 20 05/04/2010 2:31:17.000000 PM dbmigration 01/18/2013 9:11:18.370000 AM DP_CQ4540 11/29/2010 3:45:03.976000 PM ei009724 1 null null 0 61253 61124 61124 61253 39126 50805 3276 3 28 1 1 1 1 #FFFFFF #0000FF #0000FF #FFFFFF 0 0 1 1 1 null 20 05/23/2010 11:18:54.604000 PM ei103215 01/18/2013 9:11:18.370000 AM DP_CQ4540 11/29/2010 3:45:03.976000 PM ei009724 1 null null 0 61253 61124 61124 61253 39126 63196 3276 3 28 1 1 1 1 #FFFFFF #0000FF #0000FF #FFFFFF 0 0 1 1 1 null 20 11/04/2010 6:25:23.956000 PM ei103215 01/18/2013 9:11:18.370000 AM DP_CQ4540 11/29/2010 3:45:03.976000 PM ei009724 1 null null 0 61253 61124 61124 61253 But the raised exception for below value:--> hive> select * from bttn_bkp_testing > where bttn_id= 194628.0; 194628.0 577019.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-04-19 23:25:48.78 ei009724 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 194628.0 706360.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-05-21 01:01:53.629 ei103215 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 194628.0 1620395.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-08-10 04:34:00.203 ei103215 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 194628.0 1694103.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-11-08 01:09:15.136 ei103215 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 194628.0 1831767.0 8910.0 19.0 1.0 1.0 1.0 0.0 0.0 #FFFFFF #FF0000 #FF0000 #FFFFFF 0.0 0.0 1646.0 NULL NULL NULL 1.0 NULL null 20.0 2012-12-19 23:44:44.241 e0025129 2013-01-18 09:11:30.245 DP_CQ4540 null null 0.0 BLUEBERRY MUFFIN 7836.0 null NULL 0.0 61259.0 61230.0 61230.0 61259.0 2. For your information I have mentioned two interesting fact here for you regarding SQOOP behavior. First, I have imported Bttn table from Oracle into Hive bttn_bkp_test_new table using following command [hadoop@NHCLT-PC44-2 ~]$ sqoop import --connect jdbc:oracle:thin:@10.99.42.
-
Re: Exporting hive table data into oracle give date format errorJarek Jarcec Cecho 2013-03-21, 00:48
Hi Ajit,
thank you for your nice summary. You seems to be missing Sqoop arguments --null-string '\\N', --null-non-string '\\N' on import job and --input-null-string '\\N', --input-null-non-string '\\N' on export. Would you mind adding them and rerunning your work flow? Jarcec On Wed, Mar 20, 2013 at 07:30:29AM +0000, Ajit Kumar Shreevastava wrote: > Hi Jarcec, > > > > Thanking you for your valuable input and your suggestion seems to be valid. But I have some doubt about the SQOOP behavior :--> > > 1. If null create the confusion then some value similar to the below mentioned value are inserted to the oracle table with null treated as string. > > hive> select * from bttn_bkp_testing > > > where bttn_id=39126; > > > > 39126.0 32436.0 3276.0 3.0 28.0 1.0 1.0 1.0 1.0 #FFFFFF #0000FF #0000FF #FFFFFF 0.0 0.0 NULL NULL 1.0 1.0 1.0 NULL null 20.0 2010-05-04 14:31:17.0 dbmigration 2013-01-18 09:11:18.37 DP_CQ4540 2010-11-29 15:45:03.976 ei009724 1.0 null NULL null NULL 0.0 61253.0 61124.0 61124.0 61253.0 > > 39126.0 50805.0 3276.0 3.0 28.0 1.0 1.0 1.0 1.0 #FFFFFF #0000FF #0000FF #FFFFFF 0.0 0.0 NULL NULL 1.0 1.0 1.0 NULL null 20.0 2010-05-23 23:18:54.604 ei103215 2013-01-18 09:11:18.37 DP_CQ4540 2010-11-29 15:45:03.976 ei009724 1.0 null NULL null NULL 0.0 61253.0 61124.0 61124.0 61253.0 > > 39126.0 63196.0 3276.0 3.0 28.0 1.0 1.0 1.0 1.0 #FFFFFF #0000FF #0000FF #FFFFFF 0.0 0.0 NULL NULL 1.0 1.0 1.0 NULL null 20.0 2010-11-04 18:25:23.956 ei103215 2013-01-18 09:11:18.37 DP_CQ4540 2010-11-29 15:45:03.976 ei009724 1.0 null NULL null NULL 0.0 61253.0 61124.0 61124.0 61253.0 > > > > These values are inserted into the oracle table BTTN_BKP_TEST as follows:--> . > > SQL> Select * from BTTN_BKP_TEST where bttn_id=39126; > > > > 39126 32436 3276 3 28 1 1 1 1 #FFFFFF #0000FF #0000FF #FFFFFF 0 0 1 1 1 null 20 05/04/2010 2:31:17.000000 PM dbmigration 01/18/2013 9:11:18.370000 AM DP_CQ4540 11/29/2010 3:45:03.976000 PM ei009724 1 null null 0 61253 61124 61124 61253 > > 39126 50805 3276 3 28 1 1 1 1 #FFFFFF #0000FF #0000FF #FFFFFF 0 0 1 1 1 null 20 05/23/2010 11:18:54.604000 PM ei103215 01/18/2013 9:11:18.370000 AM DP_CQ4540 11/29/2010 3:45:03.976000 PM ei009724 1 null null 0 61253 61124 61124 61253 > > 39126 63196 3276 3 28 1 1 1 1 #FFFFFF #0000FF #0000FF #FFFFFF 0 0 1 1 1 null 20 11/04/2010 6:25:23.956000 PM ei103215 01/18/2013 9:11:18.370000 AM DP_CQ4540 11/29/2010 3:45:03.976000 PM ei009724 1 null null 0 61253 61124 61124 61253
-
RE: Exporting hive table data into oracle give date format errorAjit Kumar Shreevastava 2013-03-21, 06:34
Hi Jarek,
Thanks a lot. Its working fine. In both the case in the chain mail. I also want to know the reason for SQOOP behavior for the data its imported and exporting from oracle without the arguments --null-string '\\N', --null-non-string '\\N' on import job. When I import the data without the arguments --null-string '\\N', --null-non-string '\\N' on import job to a Hive table. Then I am able to export successfully back to the Oracle table without any error and data mismatch. But copied the same table structure and data to another hive table I am not able to do so. Is there any reason for that? Is SQOOP store its data definition or data-formatting for the same? If you explained the internal behavior of SQOOP to clear my concept for importing and exporting the table from a relation database to Hive and vice versa . I have explained the scenario in my chained mail for your reference and also highlighted the facts below. I am looking for your valuable comments on the below highlighted scenario. Thanking You, Regards' Ajit -----Original Message----- From: Jarek Jarcec Cecho [mailto:[EMAIL PROTECTED]] Sent: Thursday, March 21, 2013 6:19 AM To: [EMAIL PROTECTED] Subject: Re: Exporting hive table data into oracle give date format error Hi Ajit, thank you for your nice summary. You seems to be missing Sqoop arguments --null-string '\\N', --null-non-string '\\N' on import job and --input-null-string '\\N', --input-null-non-string '\\N' on export. Would you mind adding them and rerunning your work flow? Jarcec On Wed, Mar 20, 2013 at 07:30:29AM +0000, Ajit Kumar Shreevastava wrote: > Hi Jarcec, > > > > Thanking you for your valuable input and your suggestion seems to be > valid. But I have some doubt about the SQOOP behavior :--> > > 1. If null create the confusion then some value similar to the below mentioned value are inserted to the oracle table with null treated as string. > > hive> select * from bttn_bkp_testing > > > where bttn_id=39126; > > > > 39126.0 32436.0 3276.0 3.0 28.0 1.0 1.0 1.0 1.0 #FFFFFF #0000FF #0000FF #FFFFFF 0.0 0.0 NULL NULL 1.0 1.0 1.0 NULL null 20.0 2010-05-04 14:31:17.0 dbmigration 2013-01-18 09:11:18.37 DP_CQ4540 2010-11-29 15:45:03.976 ei009724 1.0 null NULL null NULL 0.0 61253.0 61124.0 61124.0 61253.0 > > 39126.0 50805.0 3276.0 3.0 28.0 1.0 1.0 1.0 1.0 #FFFFFF #0000FF #0000FF #FFFFFF 0.0 0.0 NULL NULL 1.0 1.0 1.0 NULL null 20.0 2010-05-23 23:18:54.604 ei103215 2013-01-18 09:11:18.37 DP_CQ4540 2010-11-29 15:45:03.976 ei009724 1.0 null NULL null NULL 0.0 61253.0 61124.0 61124.0 61253.0 > > 39126.0 63196.0 3276.0 3.0 28.0 1.0 1.0 1.0 1.0 #FFFFFF #0000FF #0000FF #FFFFFF 0.0 0.0 NULL NULL 1.0 1.0 1.0 NULL null 20.0 2010-11-04 18:25:23.956 ei103215 2013-01-18 09:11:18.37 DP_CQ4540 2010-11-29 15:45:03.976 ei009724 1.0 null NULL null NULL 0.0 61253.0 61124.0 61124.0 61253.0 > > > > These values are inserted into the oracle table BTTN_BKP_TEST as follows:--> . > > SQL> Select * from BTTN_BKP_TEST where bttn_id=39126; > > > > 39126 32436 3276 3 28 1 1 1 1 #FFFFFF #0000FF #0000FF #FFFFFF 0 0 1 1 1 null 20 05/04/2010 2:31:17.000000 PM dbmigration 01/18/2013 9:11:18.370000 AM DP_CQ4540 11/29/2010 3:45:03.976000 PM ei009724 1 null null 0 61253 61124 61124 61253
-
Re: Exporting hive table data into oracle give date format errorJarek Jarcec Cecho 2013-03-21, 15:33
Hi Ajit,
let me try to explain what I think is happening in your use case. There are multiple moving pieces, so let me firstly summarize couple of behaviour characteristics of the components: 1) Sqoop by default will use string "null" (lower case) to encode NULL values from database. This can be changed via --(input-)null-(non-)string arguments. 2) Hive by default uses \N for encoding NULL value. 3) When parsing input file, Hive will use NULL in case that it fails to read some value rather than throwing exception and killing your query. Now let's specifically focus on your work flow. To make the explanation a bit simpler, let's consider table "create table example(i int, t varchar(50));" with one single row where each column is NULL. a) Sqooping in this table without custom --null-(non-)string argument will lead to HDFS file with exactly line (one input row) where both columns will be encoded as 'null' (default substitution string for NULL values). Result: null,null b) Executing simple "select * from example" in Hive will lead to following output row: NULL null Let me explain what is happening here a bit more. Hive will read input file and split it into columns. The first column is of type "integer" and contains value "null". As string constant "null" is not a valid number for integer column, this value is converted into NULL. Second column is of type string, constant "null" is fully valid string and thus this string is returned - there is no conversion to NULL value! c) Exporting table "example" will work correctly as the file on HDFS still contains expected "null,null". d) Now let's explore what will happen during creation of second table with query "CREATE TABLE example2 AS SELECT * FROM example". As a part of the query Hive will read all input rows and parse their values as was described in b). Output will be serialized into output table example2. First column was parsed as NULL, so it will be written out as \N (default NULL substitution character for Hive). Second column was however parsed as a valid string value and thus it will be serialized "as is". Resulting in file with one single line "\N,null". Please notice that this select statement has changed the on disk data! e) Exporting table "example2" can't obviously lead to consistent state as the input file has been changed. Please do not hesitate to contact me if you still have any open questions! Jarcec On Thu, Mar 21, 2013 at 06:34:35AM +0000, Ajit Kumar Shreevastava wrote: > Hi Jarek, > > > > Thanks a lot. Its working fine. > > In both the case in the chain mail. > > > > I also want to know the reason for SQOOP behavior for the data its imported and exporting from oracle without the arguments --null-string '\\N', --null-non-string '\\N' on import job. > > > > When I import the data without the arguments --null-string '\\N', --null-non-string '\\N' on import job to a Hive table. Then I am able to export successfully back to the Oracle table without any error and data mismatch. But copied the same table structure and data to another hive table I am not able to do so. Is there any reason for that? Is SQOOP store its data definition or data-formatting for the same? If you explained the internal behavior of SQOOP to clear my concept for importing and exporting the table from a relation database to Hive and vice versa . I have explained the scenario in my chained mail for your reference and also highlighted the facts below. > > > > I am looking for your valuable comments on the below highlighted scenario. > > > > Thanking You, > > Regards' > > Ajit > > > > > > -----Original Message----- > From: Jarek Jarcec Cecho [mailto:[EMAIL PROTECTED]] > Sent: Thursday, March 21, 2013 6:19 AM > To: [EMAIL PROTECTED] > Subject: Re: Exporting hive table data into oracle give date format error > > > > Hi Ajit, > > thank you for your nice summary. You seems to be missing Sqoop arguments --null-string '\\N', --null-non-string '\\N' on import job and --input-null-string '\\N', --input-null-non-string '\\N' on export. Would you mind adding them and rerunning your work flow?
-
Re: Exporting hive table data into oracle give date format errorVenkat 2013-03-21, 18:30
Hi Ajit/Jarcec
I think the whole null string handling seems to need a special section in the documentation (witness the discussionso n Netezza null hnalding for direct loads that we had) We may have to come up with a specific recommendations on consistently handling NULL in all cases (String, non-string cases). From the database perspectives, different DBs have different behavior on what they support and not support for null string (it differs by versions also). And the hive handling is also another potential issue that you have explained. Venkat On Thu, Mar 21, 2013 at 8:33 AM, Jarek Jarcec Cecho <[EMAIL PROTECTED]>wrote: > Hi Ajit, > let me try to explain what I think is happening in your use case. There > are multiple moving pieces, so let me firstly summarize couple of behaviour > characteristics of the components: > > 1) Sqoop by default will use string "null" (lower case) to encode NULL > values from database. This can be changed via --(input-)null-(non-)string > arguments. > > 2) Hive by default uses \N for encoding NULL value. > > 3) When parsing input file, Hive will use NULL in case that it fails to > read some value rather than throwing exception and killing your query. > > Now let's specifically focus on your work flow. To make the explanation a > bit simpler, let's consider table "create table example(i int, t > varchar(50));" with one single row where each column is NULL. > > a) Sqooping in this table without custom --null-(non-)string argument will > lead to HDFS file with exactly line (one input row) where both columns will > be encoded as 'null' (default substitution string for NULL values). Result: > > null,null > > b) Executing simple "select * from example" in Hive will lead to following > output row: > > NULL null > > Let me explain what is happening here a bit more. Hive will read input > file and split it into columns. The first column is of type "integer" and > contains value "null". As string constant "null" is not a valid number for > integer column, this value is converted into NULL. Second column is of type > string, constant "null" is fully valid string and thus this string is > returned - there is no conversion to NULL value! > > c) Exporting table "example" will work correctly as the file on HDFS still > contains expected "null,null". > > d) Now let's explore what will happen during creation of second table with > query "CREATE TABLE example2 AS SELECT * FROM example". As a part of the > query Hive will read all input rows and parse their values as was described > in b). Output will be serialized into output table example2. First column > was parsed as NULL, so it will be written out as \N (default NULL > substitution character for Hive). Second column was however parsed as a > valid string value and thus it will be serialized "as is". Resulting in > file with one single line "\N,null". Please notice that this select > statement has changed the on disk data! > > e) Exporting table "example2" can't obviously lead to consistent state as > the input file has been changed. > > Please do not hesitate to contact me if you still have any open questions! > > Jarcec > > On Thu, Mar 21, 2013 at 06:34:35AM +0000, Ajit Kumar Shreevastava wrote: > > Hi Jarek, > > > > > > > > Thanks a lot. Its working fine. > > > > In both the case in the chain mail. > > > > > > > > I also want to know the reason for SQOOP behavior for the data its > imported and exporting from oracle without the arguments --null-string > '\\N', --null-non-string '\\N' on import job. > > > > > > > > When I import the data without the arguments --null-string '\\N', > --null-non-string '\\N' on import job to a Hive table. Then I am able to > export successfully back to the Oracle table without any error and data > mismatch. But copied the same table structure and data to another hive > table I am not able to do so. Is there any reason for that? Is SQOOP store > its data definition or data-formatting for the same? If you explained the Regards Venkat
-
Re: Exporting hive table data into oracle give date format errorJarek Jarcec Cecho 2013-03-22, 02:54
Yup agreed, the documentation with regards to the escape strings and various edge conditions can be definitely improved. Would you mind creating a JIRA?
Jarcec On Thu, Mar 21, 2013 at 11:30:22AM -0700, Venkat wrote: > Hi Ajit/Jarcec > > I think the whole null string handling seems to need a special section in > the documentation (witness the discussionso n Netezza null hnalding for > direct loads that we had) > > We may have to come up with a specific recommendations on consistently > handling NULL in all cases (String, non-string cases). From the database > perspectives, different DBs have different behavior on what they support > and not support for null string (it differs by versions also). > > And the hive handling is also another potential issue that you have > explained. > > Venkat > > > On Thu, Mar 21, 2013 at 8:33 AM, Jarek Jarcec Cecho <[EMAIL PROTECTED]>wrote: > > > Hi Ajit, > > let me try to explain what I think is happening in your use case. There > > are multiple moving pieces, so let me firstly summarize couple of behaviour > > characteristics of the components: > > > > 1) Sqoop by default will use string "null" (lower case) to encode NULL > > values from database. This can be changed via --(input-)null-(non-)string > > arguments. > > > > 2) Hive by default uses \N for encoding NULL value. > > > > 3) When parsing input file, Hive will use NULL in case that it fails to > > read some value rather than throwing exception and killing your query. > > > > Now let's specifically focus on your work flow. To make the explanation a > > bit simpler, let's consider table "create table example(i int, t > > varchar(50));" with one single row where each column is NULL. > > > > a) Sqooping in this table without custom --null-(non-)string argument will > > lead to HDFS file with exactly line (one input row) where both columns will > > be encoded as 'null' (default substitution string for NULL values). Result: > > > > null,null > > > > b) Executing simple "select * from example" in Hive will lead to following > > output row: > > > > NULL null > > > > Let me explain what is happening here a bit more. Hive will read input > > file and split it into columns. The first column is of type "integer" and > > contains value "null". As string constant "null" is not a valid number for > > integer column, this value is converted into NULL. Second column is of type > > string, constant "null" is fully valid string and thus this string is > > returned - there is no conversion to NULL value! > > > > c) Exporting table "example" will work correctly as the file on HDFS still > > contains expected "null,null". > > > > d) Now let's explore what will happen during creation of second table with > > query "CREATE TABLE example2 AS SELECT * FROM example". As a part of the > > query Hive will read all input rows and parse their values as was described > > in b). Output will be serialized into output table example2. First column > > was parsed as NULL, so it will be written out as \N (default NULL > > substitution character for Hive). Second column was however parsed as a > > valid string value and thus it will be serialized "as is". Resulting in > > file with one single line "\N,null". Please notice that this select > > statement has changed the on disk data! > > > > e) Exporting table "example2" can't obviously lead to consistent state as > > the input file has been changed. > > > > Please do not hesitate to contact me if you still have any open questions! > > > > Jarcec > > > > On Thu, Mar 21, 2013 at 06:34:35AM +0000, Ajit Kumar Shreevastava wrote: > > > Hi Jarek, > > > > > > > > > > > > Thanks a lot. Its working fine. > > > > > > In both the case in the chain mail. > > > > > > > > > > > > I also want to know the reason for SQOOP behavior for the data its > > imported and exporting from oracle without the arguments --null-string > > '\\N', --null-non-string '\\N' on import job. > > > > > > > > > > > > When I import the data without the arguments --null-string '\\N',
-
Re: Exporting hive table data into oracle give date format errorVenkat Ranganathan 2013-03-22, 03:24
I created SQOOP-963 earlier for this
Venkat On Thu, Mar 21, 2013 at 7:54 PM, Jarek Jarcec Cecho <[EMAIL PROTECTED]> wrote: > Yup agreed, the documentation with regards to the escape strings and various edge conditions can be definitely improved. Would you mind creating a JIRA? > > Jarcec > > On Thu, Mar 21, 2013 at 11:30:22AM -0700, Venkat wrote: >> Hi Ajit/Jarcec >> >> I think the whole null string handling seems to need a special section in >> the documentation (witness the discussionso n Netezza null hnalding for >> direct loads that we had) >> >> We may have to come up with a specific recommendations on consistently >> handling NULL in all cases (String, non-string cases). From the database >> perspectives, different DBs have different behavior on what they support >> and not support for null string (it differs by versions also). >> >> And the hive handling is also another potential issue that you have >> explained. >> >> Venkat >> >> >> On Thu, Mar 21, 2013 at 8:33 AM, Jarek Jarcec Cecho <[EMAIL PROTECTED]>wrote: >> >> > Hi Ajit, >> > let me try to explain what I think is happening in your use case. There >> > are multiple moving pieces, so let me firstly summarize couple of behaviour >> > characteristics of the components: >> > >> > 1) Sqoop by default will use string "null" (lower case) to encode NULL >> > values from database. This can be changed via --(input-)null-(non-)string >> > arguments. >> > >> > 2) Hive by default uses \N for encoding NULL value. >> > >> > 3) When parsing input file, Hive will use NULL in case that it fails to >> > read some value rather than throwing exception and killing your query. >> > >> > Now let's specifically focus on your work flow. To make the explanation a >> > bit simpler, let's consider table "create table example(i int, t >> > varchar(50));" with one single row where each column is NULL. >> > >> > a) Sqooping in this table without custom --null-(non-)string argument will >> > lead to HDFS file with exactly line (one input row) where both columns will >> > be encoded as 'null' (default substitution string for NULL values). Result: >> > >> > null,null >> > >> > b) Executing simple "select * from example" in Hive will lead to following >> > output row: >> > >> > NULL null >> > >> > Let me explain what is happening here a bit more. Hive will read input >> > file and split it into columns. The first column is of type "integer" and >> > contains value "null". As string constant "null" is not a valid number for >> > integer column, this value is converted into NULL. Second column is of type >> > string, constant "null" is fully valid string and thus this string is >> > returned - there is no conversion to NULL value! >> > >> > c) Exporting table "example" will work correctly as the file on HDFS still >> > contains expected "null,null". >> > >> > d) Now let's explore what will happen during creation of second table with >> > query "CREATE TABLE example2 AS SELECT * FROM example". As a part of the >> > query Hive will read all input rows and parse their values as was described >> > in b). Output will be serialized into output table example2. First column >> > was parsed as NULL, so it will be written out as \N (default NULL >> > substitution character for Hive). Second column was however parsed as a >> > valid string value and thus it will be serialized "as is". Resulting in >> > file with one single line "\N,null". Please notice that this select >> > statement has changed the on disk data! >> > >> > e) Exporting table "example2" can't obviously lead to consistent state as >> > the input file has been changed. >> > >> > Please do not hesitate to contact me if you still have any open questions! >> > >> > Jarcec >> > >> > On Thu, Mar 21, 2013 at 06:34:35AM +0000, Ajit Kumar Shreevastava wrote: >> > > Hi Jarek, >> > > >> > > >> > > >> > > Thanks a lot. Its working fine. >> > > >> > > In both the case in the chain mail. >> > > >> > > >> > > >> > > I also want to know the reason for SQOOP behavior for the data its
-
Re: Exporting hive table data into oracle give date format errorJarek Jarcec Cecho 2013-03-22, 03:46
Thank you!
Jarcec On Thu, Mar 21, 2013 at 08:24:48PM -0700, Venkat Ranganathan wrote: > I created SQOOP-963 earlier for this > > Venkat > > On Thu, Mar 21, 2013 at 7:54 PM, Jarek Jarcec Cecho <[EMAIL PROTECTED]> wrote: > > Yup agreed, the documentation with regards to the escape strings and various edge conditions can be definitely improved. Would you mind creating a JIRA? > > > > Jarcec > > > > On Thu, Mar 21, 2013 at 11:30:22AM -0700, Venkat wrote: > >> Hi Ajit/Jarcec > >> > >> I think the whole null string handling seems to need a special section in > >> the documentation (witness the discussionso n Netezza null hnalding for > >> direct loads that we had) > >> > >> We may have to come up with a specific recommendations on consistently > >> handling NULL in all cases (String, non-string cases). From the database > >> perspectives, different DBs have different behavior on what they support > >> and not support for null string (it differs by versions also). > >> > >> And the hive handling is also another potential issue that you have > >> explained. > >> > >> Venkat > >> > >> > >> On Thu, Mar 21, 2013 at 8:33 AM, Jarek Jarcec Cecho <[EMAIL PROTECTED]>wrote: > >> > >> > Hi Ajit, > >> > let me try to explain what I think is happening in your use case. There > >> > are multiple moving pieces, so let me firstly summarize couple of behaviour > >> > characteristics of the components: > >> > > >> > 1) Sqoop by default will use string "null" (lower case) to encode NULL > >> > values from database. This can be changed via --(input-)null-(non-)string > >> > arguments. > >> > > >> > 2) Hive by default uses \N for encoding NULL value. > >> > > >> > 3) When parsing input file, Hive will use NULL in case that it fails to > >> > read some value rather than throwing exception and killing your query. > >> > > >> > Now let's specifically focus on your work flow. To make the explanation a > >> > bit simpler, let's consider table "create table example(i int, t > >> > varchar(50));" with one single row where each column is NULL. > >> > > >> > a) Sqooping in this table without custom --null-(non-)string argument will > >> > lead to HDFS file with exactly line (one input row) where both columns will > >> > be encoded as 'null' (default substitution string for NULL values). Result: > >> > > >> > null,null > >> > > >> > b) Executing simple "select * from example" in Hive will lead to following > >> > output row: > >> > > >> > NULL null > >> > > >> > Let me explain what is happening here a bit more. Hive will read input > >> > file and split it into columns. The first column is of type "integer" and > >> > contains value "null". As string constant "null" is not a valid number for > >> > integer column, this value is converted into NULL. Second column is of type > >> > string, constant "null" is fully valid string and thus this string is > >> > returned - there is no conversion to NULL value! > >> > > >> > c) Exporting table "example" will work correctly as the file on HDFS still > >> > contains expected "null,null". > >> > > >> > d) Now let's explore what will happen during creation of second table with > >> > query "CREATE TABLE example2 AS SELECT * FROM example". As a part of the > >> > query Hive will read all input rows and parse their values as was described > >> > in b). Output will be serialized into output table example2. First column > >> > was parsed as NULL, so it will be written out as \N (default NULL > >> > substitution character for Hive). Second column was however parsed as a > >> > valid string value and thus it will be serialized "as is". Resulting in > >> > file with one single line "\N,null". Please notice that this select > >> > statement has changed the on disk data! > >> > > >> > e) Exporting table "example2" can't obviously lead to consistent state as > >> > the input file has been changed. > >> > > >> > Please do not hesitate to contact me if you still have any open questions! > >> > > >> > Jarcec > >> > > >> > On Thu, Mar 21, 2013 at 06:34:35AM +0000, Ajit Kumar Shreevastava wrote:
-
RE: Exporting hive table data into oracle give date format errorAjit Kumar Shreevastava 2013-03-22, 07:37
Thanks Jarek for your explanations .
With Regards, Ajit -----Original Message----- From: Jarek Jarcec Cecho [mailto:[EMAIL PROTECTED]] Sent: Thursday, March 21, 2013 9:03 PM To: [EMAIL PROTECTED] Subject: Re: Exporting hive table data into oracle give date format error Hi Ajit, let me try to explain what I think is happening in your use case. There are multiple moving pieces, so let me firstly summarize couple of behaviour characteristics of the components: 1) Sqoop by default will use string "null" (lower case) to encode NULL values from database. This can be changed via --(input-)null-(non-)string arguments. 2) Hive by default uses \N for encoding NULL value. 3) When parsing input file, Hive will use NULL in case that it fails to read some value rather than throwing exception and killing your query. Now let's specifically focus on your work flow. To make the explanation a bit simpler, let's consider table "create table example(i int, t varchar(50));" with one single row where each column is NULL. a) Sqooping in this table without custom --null-(non-)string argument will lead to HDFS file with exactly line (one input row) where both columns will be encoded as 'null' (default substitution string for NULL values). Result: null,null b) Executing simple "select * from example" in Hive will lead to following output row: NULL null Let me explain what is happening here a bit more. Hive will read input file and split it into columns. The first column is of type "integer" and contains value "null". As string constant "null" is not a valid number for integer column, this value is converted into NULL. Second column is of type string, constant "null" is fully valid string and thus this string is returned - there is no conversion to NULL value! c) Exporting table "example" will work correctly as the file on HDFS still contains expected "null,null". d) Now let's explore what will happen during creation of second table with query "CREATE TABLE example2 AS SELECT * FROM example". As a part of the query Hive will read all input rows and parse their values as was described in b). Output will be serialized into output table example2. First column was parsed as NULL, so it will be written out as \N (default NULL substitution character for Hive). Second column was however parsed as a valid string value and thus it will be serialized "as is". Resulting in file with one single line "\N,null". Please notice that this select statement has changed the on disk data! e) Exporting table "example2" can't obviously lead to consistent state as the input file has been changed. Please do not hesitate to contact me if you still have any open questions! Jarcec On Thu, Mar 21, 2013 at 06:34:35AM +0000, Ajit Kumar Shreevastava wrote: > Hi Jarek, > > > > Thanks a lot. Its working fine. > > In both the case in the chain mail. > > > > I also want to know the reason for SQOOP behavior for the data its imported and exporting from oracle without the arguments --null-string '\\N', --null-non-string '\\N' on import job. > > > > When I import the data without the arguments --null-string '\\N', --null-non-string '\\N' on import job to a Hive table. Then I am able to export successfully back to the Oracle table without any error and data mismatch. But copied the same table structure and data to another hive table I am not able to do so. Is there any reason for that? Is SQOOP store its data definition or data-formatting for the same? If you explained the internal behavior of SQOOP to clear my concept for importing and exporting the table from a relation database to Hive and vice versa . I have explained the scenario in my chained mail for your reference and also highlighted the facts below. > > > > I am looking for your valuable comments on the below highlighted scenario. > > > > Thanking You, > > Regards' > > Ajit > > > > > > -----Original Message----- > From: Jarek Jarcec Cecho [mailto:[EMAIL PROTECTED]] > Sent: Thursday, March 21, 2013 6:19 AM |