Re: some queries actually fail in hive
Igor Tatarinov 2012-06-25, 16:43
1) You have to SELECT every column you ORDER BY (id in this case)
2) the query is the same as in 1) - I assume you actually ran a different query

igor
decide.com

On Mon, Jun 25, 2012 at 4:16 AM, Soham Sardar <[EMAIL PROTECTED]> wrote:

> 1)hive> desc users_info;
> OK
> id int
> name string
> age int
> country string
> gender string
> bday string
>
> hive> desc users_audit;
> OK
> id int
> userid int
> logtime string
> Time taken: 0.079 seconds
>
> so both of my tables are fine and has data now the first query which
> is failing is
>
> hive> select users_info.name from users_info inner join users_audit
>     > on users_audit.userid=users_info.id
>     > where month(users_audit.logtime)>10
>     > order by users_info.id;
> FAILED: Error in semantic analysis: Line 4:20 Invalid column reference 'id'
>
> now my question is why it should fail in id .(id is a primary key for
> users_info table)
>
>
> 2) hive> select users_info.name from users_info inner join users_audit
>     > on users_audit.userid=users_info.id
>     > where month(users_audit.logtime)>10
>     > order by users_info.id;
>
> for the same above table when i put the following query it fails at
> half way down of mapping .
>
> Total MapReduce jobs = 2
> Launching Job 1 out of 2
> Number of reduce tasks not specified. Estimated from input data size: 1
> In order to change the average load for a reducer (in bytes):
> set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
> set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
> set mapred.reduce.tasks=<number>
> 12/06/25 16:45:08 WARN conf.Configuration: mapred.job.name is
> deprecated. Instead, use mapreduce.job.name
> 12/06/25 16:45:08 WARN conf.Configuration: mapred.system.dir is
> deprecated. Instead, use mapreduce.jobtracker.system.dir
> 12/06/25 16:45:08 WARN conf.Configuration: mapred.local.dir is
> deprecated. Instead, use mapreduce.cluster.local.dir
> 12/06/25 16:45:08 WARN conf.HiveConf: hive-site.xml not found on CLASSPATH
> WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated.
> Please use org.apache.hadoop.log.metrics.EventCounter in all the
> log4j.properties files.
> Execution log at:
> /tmp/hduser/hduser_20120625164545_3c0a9948-f43f-428e-9d8f-ff89fe2f4937.log
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/home/hduser/cloudera/hadoop-2.0.0-cdh4.0.0/share/hadoop/common/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/home/hduser/cloudera/hive-0.8.1-cdh4.0.0/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> Job running in-process (local Hadoop)
> Hadoop job information for null: number of mappers: 2; number of reducers:
> 1
> 2012-06-25 16:45:18,112 null map = 0%, reduce = 0%
> 2012-06-25 16:45:25,886 null map = 50%, reduce = 0%, Cumulative CPU 4.1
> sec
> 2012-06-25 16:45:26,951 null map = 50%, reduce = 0%, Cumulative CPU 4.1
> sec
> 2012-06-25 16:45:28,007 null map = 50%, reduce = 0%, Cumulative CPU 4.1
> sec
> 2012-06-25 16:45:29,069 null map = 83%, reduce = 0%, Cumulative CPU 10.92
> sec
> 2012-06-25 16:45:30,118 null map = 83%, reduce = 0%, Cumulative CPU 10.92
> sec
> 2012-06-25 16:45:31,192 null map = 100%, reduce = 17%, Cumulative CPU
> 14.64 sec
> 2012-06-25 16:45:32,251 null map = 100%, reduce = 17%, Cumulative CPU
> 14.64 sec
> 2012-06-25 16:45:33,300 null map = 100%, reduce = 17%, Cumulative CPU
> 14.64 sec
> 2012-06-25 16:45:34,369 null map = 100%, reduce = 100%, Cumulative
> CPU 19.42 sec
> MapReduce Total cumulative CPU time: 19 seconds 420 msec
> Ended Job = job_1340607580565_0023
> Execution completed successfully
> Mapred Local Task Succeeded . Convert the Join into MapJoin
> Launching Job 2 out of 2
> Number of reduce tasks determined at compile time: 1
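
For point 1 of the reply, here is a minimal sketch of how the failing query could be rewritten so that the ORDER BY column is also in the SELECT list. It assumes the users_info and users_audit schemas shown in the quoted desc output. The alias user_id is a hypothetical name introduced only for illustration; ordering by a select-list alias is used here on the assumption that it avoids the "Invalid column reference" error, and whether the qualified name users_info.id would also be accepted in ORDER BY may depend on the Hive version.

```sql
-- Sketch only, not tested against the poster's cluster.
-- users_info.id is added to the SELECT list (aliased as user_id, a
-- hypothetical name) so that ORDER BY refers to a selected column.
SELECT users_info.id AS user_id,
       users_info.name
FROM users_info
INNER JOIN users_audit
  ON users_audit.userid = users_info.id
WHERE month(users_audit.logtime) > 10
ORDER BY user_id;
```

If only the name is wanted in the final output, the extra column can be dropped by wrapping this query in an outer SELECT; that trade-off is a design choice and not something the original reply prescribes.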