Hive >> mail # user >> Setting up stats database


Re: Setting up stats database
Oh, I found that Hive only supports MySQL and HBase for the stats database. I'll try HBase.

On Mon, Aug 15, 2011 at 3:09 PM, wd <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'm trying to use Postgres as the stats database, and made the following
> settings in hive-site.xml:
>
>
> <property>
>  <name>hive.stats.dbclass</name>
>  <value>jdbc:postgresql</value>
>  <description>The default database that stores temporary hive
> statistics.</description>
> </property>
>
> <property>
>  <name>hive.stats.autogather</name>
>  <value>true</value>
>  <description>A flag to gather statistics automatically during the
> INSERT OVERWRITE command.</description>
> </property>
>
> <property>
>  <name>hive.stats.jdbcdriver</name>
>  <value>org.postgresql.Driver</value>
>  <description>The JDBC driver for the database that stores temporary
> hive statistics.</description>
> </property>
>
> <property>
>  <name>hive.stats.dbconnectionstring</name>
>  <value>jdbc:postgresql://localhost/hive_statsdb?createDatabaseIfNotExist=true;user=hive;password=pwd</value>
>  <description>The default connection string for the database that
> stores temporary hive statistics.</description>
> </property>
>
> I use Postgres as the Hive metastore database, so there is a
> postgresql-9.0-801.jdbc4.jar file in lib.
>
> After running 'analyze table t1 partition(dt) compute statistics;' in the
> Hive CLI, it outputs some stats info in the CLI, but nothing in the db. I
> found the following errors:
>
> 2011-08-15 14:54:54,767 INFO
> org.apache.hadoop.hive.ql.exec.TableScanOperator: Stats Gathering
> found a new partition spec = dt=20110805
> 2011-08-15 14:54:54,767 INFO
> org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 1 rows
> 2011-08-15 14:54:54,767 INFO ExecMapper: ExecMapper: processing 1
> rows: used memory = 39953640
> 2011-08-15 14:54:54,768 INFO
> org.apache.hadoop.hive.ql.exec.MapOperator: 1 finished. closing...
> 2011-08-15 14:54:54,768 INFO
> org.apache.hadoop.hive.ql.exec.MapOperator: 1 forwarded 2 rows
> 2011-08-15 14:54:54,768 INFO
> org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0
> 2011-08-15 14:54:54,768 INFO
> org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 finished.
> closing...
> 2011-08-15 14:54:54,768 INFO
> org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarded 2 rows
> 2011-08-15 14:54:54,772 ERROR
> org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher: Error during
> JDBC connection to
> jdbc:postgresql://localhost/hive_statsdb?createDatabaseIfNotExist=true;user=hive;password=pwd.
> java.lang.ClassNotFoundException: org.postgresql.Driver
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>        at java.lang.Class.forName0(Native Method)
>        at java.lang.Class.forName(Class.java:169)
>        at org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher.connect(JDBCStatsPublisher.java:55)
>        at org.apache.hadoop.hive.ql.exec.TableScanOperator.publishStats(TableScanOperator.java:202)
>        at org.apache.hadoop.hive.ql.exec.TableScanOperator.closeOp(TableScanOperator.java:164)
>        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
>        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
>        at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
>        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
>        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> 2011-08-15 14:54:54,774 INFO
> org.apache.hadoop.hive.ql.exec.TableScanOperator: StatsPublishing
> error: cannot connect to database.
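
The stack trace above shows the ClassNotFoundException coming from Class.forName inside JDBCStatsPublisher.connect, and the bottom frame is org.apache.hadoop.mapred.Child.main, so the lookup runs in the map task JVM rather than in the Hive CLI process. A jar that sits in Hive's lib directory is therefore not necessarily visible where the driver is loaded. A minimal sketch of the same lookup follows; DriverCheck and isLoadable are hypothetical names used only for illustration and are not part of Hive or Hadoop.

```java
// Hypothetical helper (not part of Hive): repeats the Class.forName lookup
// that fails in the stack trace, so a given classpath can be checked directly.
public class DriverCheck {

    /** Returns true if the named class can be loaded from the current classpath. */
    public static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Same failure mode as in the log: the class is not visible
            // to this JVM's class loader (or it failed to link).
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the driver class named in hive.stats.jdbcdriver above.
        String driver = args.length > 0 ? args[0] : "org.postgresql.Driver";
        String verdict = isLoadable(driver) ? " is" : " is NOT";
        System.out.println(driver + verdict + " on the classpath");
    }
}
```

Running this with the classpath the task JVM actually uses, and again with the classpath of the Hive CLI, would show whether the driver jar is visible in one place but not the other.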