Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Basic statement problems


Copy link to this message
-
RE: Basic statement problems
The LOCATION clause has to specify the directory that contains (only) your data files.

-----Original Message-----
From: Keith Wiley [mailto:[EMAIL PROTECTED]]
Sent: Friday, March 09, 2012 3:44 PM
To: [EMAIL PROTECTED]
Subject: Basic statement problems

I successfully installed and used Hive to create basic tables (on one of my two machines; another discussion describes the problems I'm having with the other machine).  However, basic queries aren't working.  I brought a typical CSV file into a Hive table and it seemed fine.  Here's how I did it:

CREATE EXTERNAL TABLE stringmap (ObjectTypeCode INT, AttributeName STRING, AttributeValue INT, LangId INT, Value STRING, DisplayOrder INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/crm/records/all/hiveTest/StringMap.csv';

"show tables" and "describe stringmap" return correct results.  However, if I run a really simple query, it returns incorrect results.  For example, a row count query returns a 0.  Observe:

hive> select count(*) from stringmap;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201202221500_0103, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201202221500_0103
Kill Command = /media/sdb1/kwiley/hadoop/hadoop-0.20.2-cdh3u3/bin/hadoop job  -Dmapred.job.tracker=localhost:9001 -kill job_201202221500_0103
2012-03-09 16:20:26,243 Stage-1 map = 0%,  reduce = 0%
2012-03-09 16:20:29,258 Stage-1 map = 0%,  reduce = 100%
2012-03-09 16:20:32,278 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201202221500_0103
OK
0
Time taken: 15.969 seconds

The Hadoop job runs without error, but it returns a 0.  The job tracker indicates that Hive create a job with 0 mappers and 1 reducer.  I don't see any useful output in the reducer task log however.  The following is the from the hive log:

2012-03-09 16:20:17,057 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
2012-03-09 16:20:17,057 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
2012-03-09 16:20:17,059 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
2012-03-09 16:20:17,059 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
2012-03-09 16:20:17,060 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.
2012-03-09 16:20:17,060 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.
2012-03-09 16:20:22,781 WARN  mapred.JobClient (JobClient.java:copyAndConfigureFiles(649)) - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2012-03-09 16:20:22,946 WARN  snappy.LoadSnappy (LoadSnappy.java:<clinit>(36)) - Snappy native library is available

I admit, that does look rather erroneous, but I'm not sure what to make of it.  I looked those errors up online but didn't find much that seemed to suggest a cause or solution.

Any ideas?

________________________________________________________________________________
Keith Wiley     [EMAIL PROTECTED]     keithwiley.com    music.keithwiley.com

"You can scratch an itch, but you can't itch a scratch. Furthermore, an itch can
itch but a scratch can't scratch. Finally, a scratch can itch, but an itch can't
scratch. All together this implies: He scratched the itch from the scratch that
itched but would never itch the scratch from the itch that scratched."
                                           --  Keith Wiley
________________________________________________________________________________
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB