Sqoop, mail # user - Extracting Updated records using Sqoop


Extracting Updated records using Sqoop
burberry blues 2013-11-08, 08:11
Hi Harsh

I am trying to use Sqoop incremental imports to extract modified records,
in addition to newly added ones, from an Oracle database into a Hive table.

However, I am getting duplicate entries when I run the import with a
particular --last-value.

Below is my Sqoop command:

sqoop import --connect jdbc:oracle:thin:xxx:xxx:xxx --username xxx \
  --password xxx --hive-import --table xxx --target-dir xxx --hive-table xxx \
  --incremental append --check-column COLUMN_3 --split-by COLUMN_2 \
  --columns COLUMN_1,COLUMN_2,COLUMN_3 --last-value "2013-11-05 00:00:00"

My output is as follows

Column_1     Column_2  Column_3
new change1  1.0       2013-11-07 11:05:55.0
change3      3.0       2013-11-07 11:19:25.0
change1      1.0       2013-11-05 11:15:50.0
new change1  2.0       2013-11-07 11:18:55.0
NULL         4.0       2013-11-07 12:13:00.0
change2      2.0       2013-11-05 11:15:55.0

The highlighted record is inserted again as a new row instead of updating
the existing record.
Is there a command for this?
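
To make the expected result concrete, here is a minimal Python sketch of the
outcome I am after. It is only an illustration of the desired table contents,
not something Sqoop does. It assumes COLUMN_2 is the record key and COLUMN_3
is the last-modified timestamp; the function name latest_per_key is made up
for this example.

```python
from datetime import datetime

# Rows as they currently appear in the Hive table (NULL shown as None).
rows = [
    ("new change1", 1.0, "2013-11-07 11:05:55.0"),
    ("change3",     3.0, "2013-11-07 11:19:25.0"),
    ("change1",     1.0, "2013-11-05 11:15:50.0"),
    ("new change1", 2.0, "2013-11-07 11:18:55.0"),
    (None,          4.0, "2013-11-07 12:13:00.0"),
    ("change2",     2.0, "2013-11-05 11:15:55.0"),
]

def latest_per_key(rows):
    """Keep one row per COLUMN_2 value: the one with the newest COLUMN_3.

    Assumption: COLUMN_2 identifies a record and COLUMN_3 is its
    last-modified timestamp.
    """
    latest = {}
    for col1, key, ts in rows:
        stamp = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f")
        # Replace the stored row only if this one is more recent.
        if key not in latest or stamp > latest[key][0]:
            latest[key] = (stamp, (col1, key, ts))
    # Return the surviving rows ordered by key.
    return [latest[k][1] for k in sorted(latest)]

deduped = latest_per_key(rows)
for row in deduped:
    print(row)
```

With the rows above, this leaves four records (keys 1.0 to 4.0), and the older
"change1" and "change2" rows are gone.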

Thanks,

Burberry