Hive >> mail # user >> Hive JOINs not working as expected (returns 0 rows)


Hive JOINs not working as expected (returns 0 rows)
Hi,

I'm new to Hive (started in the last few weeks) and could use some help with JOINs.

Using MapR (version 0.9.0)    /usr/bin/hive -> /opt/mapr/hive/hive-0.9.0/bin/hive

I have two tables that I want to join by date (order_t and date_t). DDL is at the bottom.

I have reduced the problem to one column and one row, and I still can't get the join to return a match.

Any help will be appreciated.

-John

Details:

# data in each table
hive> select * from ext_order_1 ;
OK
20081203
Time taken: 0.076 seconds
hive> select * from ext_date_1 ;
OK
20081203
Time taken: 0.068 seconds
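
Both tables display the same value, but `SELECT *` on a single-column TEXTFILE table can mask hidden characters: a stray carriage return or trailing '~' makes the INT parse fail silently to NULL, and NULL never matches an equality join. A couple of hypothetical sanity checks along these lines (not run as part of this session; they use the table names above):

```sql
-- Hypothetical checks, not part of the original session.
-- Any nonzero count here means the INT parse silently produced NULLs.
SELECT COUNT(*) FROM ext_order_1 WHERE cal_dt IS NULL;
SELECT COUNT(*) FROM ext_date_1  WHERE cal_dt IS NULL;

-- A length other than 8 would suggest trailing junk in the raw field.
SELECT LENGTH(CAST(cal_dt AS STRING)) FROM ext_order_1;
```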

# Trying to join results in 0 rows (note the final count of 0 at the end of the output)
#
hive> select count(*) from ext_order_1 a join ext_date_1 b on (a.cal_dt = b.cal_dt) ;
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201211050921_0232, Tracking URL =  xxxx
Kill Command = /opt/mapr/hadoop/hadoop-0.20.2/bin/../bin/hadoop job  -Dmapred.job.tracker=maprfs:/// -kill job_201211050921_0232
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2012-11-20 10:08:48,797 Stage-1 map = 0%,  reduce = 0%
2012-11-20 10:08:53,823 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.25 sec
2012-11-20 10:08:54,829 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.25 sec
2012-11-20 10:08:55,835 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.25 sec
2012-11-20 10:08:56,842 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.25 sec
2012-11-20 10:08:57,848 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.25 sec
2012-11-20 10:08:58,854 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.25 sec
2012-11-20 10:08:59,860 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.25 sec
2012-11-20 10:09:00,866 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.23 sec
MapReduce Total cumulative CPU time: 3 seconds 230 msec
Ended Job = job_201211050921_0232
Launching Job 2 out of 2
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201211050921_0233, Tracking URL =  xxxx
Kill Command = /opt/mapr/hadoop/hadoop-0.20.2/bin/../bin/hadoop job  -Dmapred.job.tracker=maprfs:/// -kill job_201211050921_0233
Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1
2012-11-20 10:09:02,058 Stage-2 map = 0%,  reduce = 0%
2012-11-20 10:09:07,084 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.89 sec
2012-11-20 10:09:08,091 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.89 sec
2012-11-20 10:09:09,101 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.89 sec
2012-11-20 10:09:10,106 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.89 sec
2012-11-20 10:09:11,112 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.89 sec
2012-11-20 10:09:12,119 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.89 sec
2012-11-20 10:09:13,125 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 1.5 sec
MapReduce Total cumulative CPU time: 1 seconds 500 msec
Ended Job = job_201211050921_0233
MapReduce Jobs Launched:
Job 0: Map: 1  Reduce: 1   Cumulative CPU: 3.23 sec   MAPRFS Read: 396 MAPRFS Write: 222 SUCCESS
Job 1: Map: 1  Reduce: 1   Cumulative CPU: 1.5 sec   MAPRFS Read: 628 MAPRFS Write: 72 SUCCESS
Total MapReduce CPU Time Spent: 4 seconds 730 msec
OK
0
Time taken: 25.727 seconds
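
To see which side of the join is dropping out, one option (a hypothetical debugging query, not part of the original session) is a LEFT OUTER JOIN: it keeps every row from ext_order_1 and shows NULL wherever ext_date_1 failed to match.

```sql
-- Hypothetical: a NULL in date_dt pinpoints the non-matching side.
SELECT a.cal_dt AS order_dt, b.cal_dt AS date_dt
FROM ext_order_1 a
LEFT OUTER JOIN ext_date_1 b ON (a.cal_dt = b.cal_dt);
```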

# Hive DDL
#
$> cat date_t_1_row.ddl

CREATE EXTERNAL TABLE ext_date_1
(
    cal_dt    INT
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '~'
  LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/Work/Hive/tables/date_1'
;
$>  cat order_1.ddl

CREATE EXTERNAL TABLE ext_order_1
(
    cal_dt    INT
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '~'
  LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/Work/Hive/tables/order_1'
;
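
Since both DDLs point at external TEXTFILE locations, inspecting the raw backing files can reveal DOS line endings or a trailing delimiter that the INT column would silently turn into NULLs. From inside the Hive CLI (paths taken from the LOCATION clauses above; output not shown, so this is only a suggested check):

```sql
-- Dump the raw backing files from inside the Hive CLI.
dfs -cat /Work/Hive/tables/date_1/*;
dfs -cat /Work/Hive/tables/order_1/*;
```

Piping the same files through something like `od -c` at the OS shell would make an invisible `\r` explicit.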