Sqoop >> mail # user >> Extracting Updated records using Sqoop


Extracting Updated records using Sqoop
Hi Harsh

I am trying to use Sqoop to extract modified records, in addition to
newly inserted ones, from an Oracle database into a Hive table.

However, I am getting duplicate entries when I extract based on a
particular last-value attribute.

Below is my Sqoop command:

sqoop import --connect jdbc:oracle:thin:xxx:xxx:xxx \
  --username xxx --password xxx \
  --hive-import --table xxx --target-dir xxx --hive-table xxx \
  --incremental append --check-column COLUMN_3 \
  --split-by COLUMN_2 --columns COLUMN_1,COLUMN_2,COLUMN_3 \
  --last-value "2013-11-05 00:00:00"

My output is as follows:

Column_1     Column_2  Column_3
new change1  1.0       2013-11-07 11:05:55.0
change3      3.0       2013-11-07 11:19:25.0
change1      1.0       2013-11-05 11:15:50.0
new change1  2.0       2013-11-07 11:18:55.0
NULL         4.0       2013-11-07 12:13:00.0
change2      2.0       2013-11-05 11:15:55.0

The highlighted record is being inserted again as a new row instead of
updating the existing record.
Is there a Sqoop option or command that handles this?

Thanks,

Burberry
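
A note on why the duplicates appear: per the Sqoop user guide, --incremental append imports the rows whose check column is greater than --last-value and appends them to the existing data, so an updated source row arrives as an additional row rather than replacing the old one. The guide describes --incremental lastmodified and the sqoop merge tool (with --merge-key) for tables whose rows get updated; whether those can be combined with --hive-import depends on the Sqoop version, so that should be checked before relying on it. The sketch below is not a Sqoop command. It only illustrates the "keep the newest row per key" result on the six rows shown above, assuming Column_2 is the record key (the post does not say so) and using a hypothetical helper named latest_per_key.

```shell
#!/bin/sh
# Sample rows from the output above, tab-separated: Column_1, Column_2, Column_3.
# ASSUMPTION: Column_2 identifies a record; the original post does not state the key.
rows=$(printf '%s\t%s\t%s\n' \
  'new change1' '1.0' '2013-11-07 11:05:55.0' \
  'change3'     '3.0' '2013-11-07 11:19:25.0' \
  'change1'     '1.0' '2013-11-05 11:15:50.0' \
  'new change1' '2.0' '2013-11-07 11:18:55.0' \
  'NULL'        '4.0' '2013-11-07 12:13:00.0' \
  'change2'     '2.0' '2013-11-05 11:15:55.0')

# Hypothetical helper: keep, for each key in field 2, the row with the greatest
# timestamp in field 3. The timestamps are fixed-width, so a plain string
# comparison orders them chronologically.
latest_per_key() {
  awk -F '\t' '
    !($2 in ts) || $3 > ts[$2] { ts[$2] = $3; row[$2] = $0 }
    END { for (k in row) print row[k] }
  ' | sort -t "$(printf '\t')" -k2,2
}

merged=$(printf '%s\n' "$rows" | latest_per_key)
printf '%s\n' "$merged"
```

On this sample the two 2013-11-05 rows are dropped and one row remains for each of the keys 1.0 through 4.0, which is the result an update-in-place would give. The same idea can be applied on the Hive side by selecting the newest row per key from the appended data.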