Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Hive >> mail # user >> Hive JOINs not working as expected (returns 0 rows)


+
John Morrison 2012-11-20, 15:43
Copy link to this message
-
Re: Hive JOINs not working as expected (returns 0 rows)
Did you install v0.9.0 on your MapR cluster? I suspect there's a bug in the
interaction between the two. I'm not sure which version of Hive is the
latest shipped with MapR.

I just tried your example in an older MapR M5 virtual machine instance that
came with Hive v0.7.1. It worked correctly. I then repeated the experiment
in Hive v0.9.0 on a generic Apache Hadoop setup I have in another VM. It
also worked correctly.  (I don't have Hive v0.9.0 available in the MapR VM.)

Hope this helps.

dean

On Tue, Nov 20, 2012 at 9:43 AM, John Morrison <
[EMAIL PROTECTED]> wrote:

>  Hi,****
>
> ** **
>
> New to hive in last few weeks and could use some help with JOINs.****
>
> ** **
>
> Using MapR (version 0.9.0)    /usr/bin/hive ->
> /opt/mapr/hive/hive-0.9.0/bin/hive****
>
> ** **
>
> I have 2 tables I am wanting to join by date (order_t and date_t).  DDL at
> bottom.****
>
> ** **
>
> I have reduced this to 1 column and 1 row and still can’t get things to
> work.****
>
> ** **
>
> Any help will be appreciated.****
>
> ** **
>
> -John****
>
> ** **
>
> Details:****
>
> ** **
>
> # data in each table****
>
> hive> select * from ext_order_1 ;****
>
> OK****
>
> 20081203****
>
> Time taken: 0.076 seconds****
>
> hive> select * from ext_date_1 ;****
>
> OK****
>
> 20081203****
>
> Time taken: 0.068 seconds****
>
> ** **
>
> # Trying to join results in 0 rows (see yellow below)****
>
> #****
>
> hive> select count(*) from ext_order_1 a join ext_date_1 b on (a.cal_dt > b.cal_dt) ;****
>
> Total MapReduce jobs = 2****
>
> Launching Job 1 out of 2****
>
> Number of reduce tasks not specified. Estimated from input data size: 1***
> *
>
> In order to change the average load for a reducer (in bytes):****
>
>   set hive.exec.reducers.bytes.per.reducer=<number>****
>
> In order to limit the maximum number of reducers:****
>
>   set hive.exec.reducers.max=<number>****
>
> In order to set a constant number of reducers:****
>
>   set mapred.reduce.tasks=<number>****
>
> Starting Job = job_201211050921_0232, Tracking URL =  xxxx****
>
> Kill Command = /opt/mapr/hadoop/hadoop-0.20.2/bin/../bin/hadoop job
> -Dmapred.job.tracker=maprfs:/// -kill job_201211050921_0232****
>
> Hadoop job information for Stage-1: number of mappers: 1; number of
> reducers: 1****
>
> 2012-11-20 10:08:48,797 Stage-1 map = 0%,  reduce = 0%****
>
> 2012-11-20 10:08:53,823 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 1.25 sec****
>
> 2012-11-20 10:08:54,829 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 1.25 sec****
>
> 2012-11-20 10:08:55,835 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 1.25 sec****
>
> 2012-11-20 10:08:56,842 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 1.25 sec****
>
> 2012-11-20 10:08:57,848 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 1.25 sec****
>
> 2012-11-20 10:08:58,854 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 1.25 sec****
>
> 2012-11-20 10:08:59,860 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 1.25 sec****
>
> 2012-11-20 10:09:00,866 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 3.23 sec****
>
> MapReduce Total cumulative CPU time: 3 seconds 230 msec****
>
> Ended Job = job_201211050921_0232****
>
> Launching Job 2 out of 2****
>
> Number of reduce tasks determined at compile time: 1****
>
> In order to change the average load for a reducer (in bytes):****
>
>   set hive.exec.reducers.bytes.per.reducer=<number>****
>
> In order to limit the maximum number of reducers:****
>
>   set hive.exec.reducers.max=<number>****
>
> In order to set a constant number of reducers:****
>
>   set mapred.reduce.tasks=<number>****
>
> Starting Job = job_201211050921_0233, Tracking URL =  xxxx****
>
> Kill Command = /opt/mapr/hadoop/hadoop-0.20.2/bin/../bin/hadoop job
> -Dmapred.job.tracker=maprfs:/// -kill job_201211050921_0233****
>
> Hadoop job information for Stage-2: number of mappers: 1; number of
> reducers: 1****
>
> 2012-11-20 10:09:02,058 Stage-2 map = 0%,  reduce = 0%****
>
> 2012-11-20 10:09
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330
+
Edward Capriolo 2012-11-20, 16:33
+
John Morrison 2012-11-20, 18:14