Setting up stats database
hi,

I'm trying to use PostgreSQL as the stats database, and made the following
settings in hive-site.xml:
<property>
  <name>hive.stats.dbclass</name>
  <value>jdbc:postgresql</value>
  <description>The default database that stores temporary hive
statistics.</description>
</property>

<property>
  <name>hive.stats.autogather</name>
  <value>true</value>
  <description>A flag to gather statistics automatically during the
INSERT OVERWRITE command.</description>
</property>

<property>
  <name>hive.stats.jdbcdriver</name>
  <value>org.postgresql.Driver</value>
  <description>The JDBC driver for the database that stores temporary
hive statistics.</description>
</property>

<property>
  <name>hive.stats.dbconnectionstring</name>
  <value>jdbc:postgresql://localhost/hive_statsdb?createDatabaseIfNotExist=true;user=hive;password=pwd</value>
  <description>The default connection string for the database that
stores temporary hive statistics.</description>
</property>
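
A side note on the connection string above: the PostgreSQL JDBC driver documents its URL parameters as a query string introduced by '?' and separated by '&', not ';'. As far as I know, createDatabaseIfNotExist is a MySQL Connector/J option, so it should not be relied on to create hive_statsdb here. The sketch below only shows the shape of such a URL; the class and method names (StatsDbUrl, build, xmlEscape) are made up for illustration and are not part of Hive, and the values are not URL-encoded.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class StatsDbUrl {
    // Builds a URL of the form jdbc:postgresql://host/database?k1=v1&k2=v2.
    // Hypothetical helper for illustration; values are not URL-encoded.
    static String build(String host, String database, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(e.getKey() + "=" + e.getValue());
        }
        String base = "jdbc:postgresql://" + host + "/" + database;
        return params.isEmpty() ? base : base + "?" + query;
    }

    // Escapes '&' so the URL can be placed inside a <value> element of an XML file.
    static String xmlEscape(String url) {
        return url.replace("&", "&amp;");
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("user", "hive");
        params.put("password", "pwd");
        String url = build("localhost", "hive_statsdb", params);
        System.out.println(url);            // form passed to the JDBC driver
        System.out.println(xmlEscape(url)); // form written into hive-site.xml
    }
}
```

Because hive-site.xml is XML, the '&' has to be written as '&amp;' inside the <value> element, which is what xmlEscape illustrates.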

I already use PostgreSQL for the Hive metastore database, so there is a
postgresql-9.0-801.jdbc4.jar file in lib.

After running 'analyze table t1 partition(dt) compute statistics;' in the
Hive CLI, some stats info is printed in the CLI, but nothing is written to
the db. I found the following errors in the log:

2011-08-15 14:54:54,767 INFO
org.apache.hadoop.hive.ql.exec.TableScanOperator: Stats Gathering
found a new partition spec = dt=20110805
2011-08-15 14:54:54,767 INFO
org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 1 rows
2011-08-15 14:54:54,767 INFO ExecMapper: ExecMapper: processing 1
rows: used memory = 39953640
2011-08-15 14:54:54,768 INFO
org.apache.hadoop.hive.ql.exec.MapOperator: 1 finished. closing...
2011-08-15 14:54:54,768 INFO
org.apache.hadoop.hive.ql.exec.MapOperator: 1 forwarded 2 rows
2011-08-15 14:54:54,768 INFO
org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0
2011-08-15 14:54:54,768 INFO
org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 finished.
closing...
2011-08-15 14:54:54,768 INFO
org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarded 2 rows
2011-08-15 14:54:54,772 ERROR
org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher: Error during
JDBC connection to
jdbc:postgresql://localhost/hive_statsdb?createDatabaseIfNotExist=true;user=hive;password=pwd.
java.lang.ClassNotFoundException: org.postgresql.Driver
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:169)
at org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher.connect(JDBCStatsPublisher.java:55)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.publishStats(TableScanOperator.java:202)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.closeOp(TableScanOperator.java:164)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
2011-08-15 14:54:54,774 INFO
org.apache.hadoop.hive.ql.exec.TableScanOperator: StatsPublishing
error: cannot connect to database.
2011-08-15 14:54:54,774 INFO
org.apache.hadoop.hive.ql.exec.MapOperator: 1 Close done
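
Reading the trace: the ClassNotFoundException is thrown from Class.forName inside JDBCStatsPublisher.connect, and the bottom frame is org.apache.hadoop.mapred.Child.main, so the lookup happens in the map task JVM rather than in the CLI process. That suggests the driver jar in Hive's lib is not on the task's classpath, although I have not confirmed this. A minimal sketch of the same lookup, which can be run with whatever classpath you want to test (DriverCheck and isLoadable are names invented for this example):

```java
public class DriverCheck {
    // Returns true if the named class can be loaded from the current classpath.
    // This mirrors the Class.forName call that fails in the stack trace.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the driver class configured in hive.stats.jdbcdriver.
        String driver = args.length > 0 ? args[0] : "org.postgresql.Driver";
        System.out.println(driver + " loadable: " + isLoadable(driver));
    }
}
```

Running it once with postgresql-9.0-801.jdbc4.jar on the classpath and once without should show the difference between the two environments.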