Sqoop, mail # user - Data mismatch when importing data from Oracle to Hive through Sqoop without an error


Re: Data mismatch when importing data from Oracle to Hive through Sqoop without an error
Venkat Ranganathan 2013-03-06, 16:06
Hi Ajit

Do you know if the rest of the columns are also null in the rows where the
three non-null columns are null?

Venkat
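
A quick way to check this in the Hive shell is a sketch like the following; it
only assumes the table and column names that appear in the queries quoted
further down in this thread:

    -- Look at a handful of the rows whose key columns came through as NULL,
    -- to see whether the remaining columns are NULL as well.
    SELECT *
    FROM bttn
    WHERE bttn_id IS NULL OR data_inst_id IS NULL OR scr_id IS NULL
    LIMIT 10;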

On Wed, Mar 6, 2013 at 12:35 AM, Ajit Kumar Shreevastava
<[EMAIL PROTECTED]> wrote:
> Hi Abhijeet,
>
>
>
> Thanks for your response.
>
> If it were the case that values that don't fit in a double are inserted as
> NULL, then the counts should not mismatch between the two tables.
>
> Here, the inserted NULL values are extra rows on top of the values that are
> already present in both the Oracle table and the Hive table.
>
>
>
> Correct me if my interpretation is wrong.
>
>
>
> Thanks and Regards,
>
> Ajit Kumar Shreevastava
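
A hedged way to test this interpretation is sketched below; it is only
illustrative and reuses the table and columns from the queries further down in
the thread. The first query runs in SQL*Plus, the other two in the Hive shell:

    -- Oracle: total row count in the source table.
    SELECT COUNT(1) FROM bttn;

    -- Hive: total row count, and count of rows with NULL key columns.
    -- If the Hive total equals the Oracle total plus the NULL-key count,
    -- the NULL rows were added on top of the correctly imported rows.
    SELECT COUNT(1) FROM bttn;
    SELECT COUNT(1) FROM bttn
    WHERE bttn_id IS NULL OR data_inst_id IS NULL OR scr_id IS NULL;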
>
>
>
> From: abhijeet gaikwad [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, March 06, 2013 1:46 PM
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: Data mismatch when importing data from Oracle to Hive through
> Sqoop without an error
>
>
>
> Sqoop maps numeric and decimal types (RDBMS) to double (Hive). I think the
> values that don't fit in double must be getting inserted as NULL.
> You can see this warning in your logs.
>
> Thanks,
> Abhijeet
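
If the DOUBLE mapping is indeed the cause, one possible workaround is to
override the default type mapping with Sqoop's --map-column-hive option. The
sketch below is only illustrative: the connection string, credentials, and the
choice of STRING as the target Hive type are assumptions, not details taken
from this thread.

    # Hypothetical import command: force the Oracle NUMBER key columns to
    # STRING in Hive instead of the default DOUBLE, so large values are not
    # silently lost. -P prompts for the database password.
    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username scott -P \
      --table BTTN \
      --hive-import --hive-table bttn \
      --map-column-hive BTTN_ID=STRING,DATA_INST_ID=STRING,SCR_ID=STRING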
>
> On Wed, Mar 6, 2013 at 1:32 PM, Ajit Kumar Shreevastava
> <[EMAIL PROTECTED]> wrote:
>
> Hi all,
>
> I have noticed one interesting thing in the result sets below.
>
> I ran the same query in both the Oracle and the Hive shells and got the
> following results:
>
>
>
> SQL> select count(1) from bttn
>
>   2  where bttn_id is null or data_inst_id is null or scr_id is null;
>
>
>
>   COUNT(1)
>
> ----------
>
>          0
>
> hive> select count(1) from bttn
>
>     > where bttn_id is null or data_inst_id is null or scr_id is null;
>
> Total MapReduce jobs = 1
>
> Launching Job 1 out of 1
>
> Number of reduce tasks determined at compile time: 1
>
> In order to change the average load for a reducer (in bytes):
>
>   set hive.exec.reducers.bytes.per.reducer=<number>
>
> In order to limit the maximum number of reducers:
>
>   set hive.exec.reducers.max=<number>
>
> In order to set a constant number of reducers:
>
>   set mapred.reduce.tasks=<number>
>
> Starting Job = job_201303051835_0020, Tracking URL =
> http://NHCLT-PC44-2:50030/jobdetails.jsp?jobid=job_201303051835_0020
>
> Kill Command = /home/hadoop/hadoop-1.0.3/bin/hadoop job  -kill
> job_201303051835_0020
>
> Hadoop job information for Stage-1: number of mappers: 1; number of
> reducers: 1
>
> 2013-03-06 13:22:56,908 Stage-1 map = 0%,  reduce = 0%
>
> 2013-03-06 13:23:05,928 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
> sec
>
> 2013-03-06 13:23:06,931 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
> sec
>
> 2013-03-06 13:23:07,934 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
> sec
>
> 2013-03-06 13:23:08,938 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
> sec
>
> 2013-03-06 13:23:09,941 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
> sec
>
> 2013-03-06 13:23:10,944 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
> sec
>
> 2013-03-06 13:23:11,947 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
> sec
>
> 2013-03-06 13:23:12,956 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
> sec
>
> 2013-03-06 13:23:13,959 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
> sec
>
> 2013-03-06 13:23:14,962 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU
> 5.2 sec
>
> 2013-03-06 13:23:15,965 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU
> 5.2 sec
>
> 2013-03-06 13:23:16,969 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU
> 5.2 sec
>
> 2013-03-06 13:23:17,974 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 6.95 sec
>
> 2013-03-06 13:23:18,977 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 6.95 sec
>
> 2013-03-06 13:23:19,981 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 6.95 sec
>
> 2013-03-06 13:23:20,985 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 6.95 sec
>
> 2013-03-06 13:23:21,988 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 6.95 sec
>
> 2013-03-06 13:23:22,995 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU