Hive, mail # user - Setting up stats database


Re: Setting up stats database
wd 2011-08-15, 07:17
Oh, I found that Hive only supports MySQL and HBase. I'll try HBase.

On Mon, Aug 15, 2011 at 3:09 PM, wd <[EMAIL PROTECTED]> wrote:
> hi,
>
> I'm trying to use Postgres as the stats database, and made the following
> settings in hive-site.xml:
>
>
> <property>
>  <name>hive.stats.dbclass</name>
>  <value>jdbc:postgresql</value>
>  <description>The default database that stores temporary hive
> statistics.</description>
> </property>
>
> <property>
>  <name>hive.stats.autogather</name>
>  <value>true</value>
>  <description>A flag to gather statistics automatically during the
> INSERT OVERWRITE command.</description>
> </property>
>
> <property>
>  <name>hive.stats.jdbcdriver</name>
>  <value>org.postgresql.Driver</value>
>  <description>The JDBC driver for the database that stores temporary
> hive statistics.</description>
> </property>
>
> <property>
>  <name>hive.stats.dbconnectionstring</name>
>  <value>jdbc:postgresql://localhost/hive_statsdb?createDatabaseIfNotExist=true;user=hive;password=pwd</value>
>  <description>The default connection string for the database that
> stores temporary hive statistics.</description>
> </property>
>
> I use Postgres as the Hive meta database, so there is a
> postgresql-9.0-801.jdbc4.jar file in lib.
>
> After running 'analyze table t1 partition(dt) compute statistics;' in the
> Hive CLI, it outputs some stats info in the CLI, but nothing in the db. And I
> can find the following errors:
>
> 1-08-15 14:54:54,767 INFO
> org.apache.hadoop.hive.ql.exec.TableScanOperator: Stats Gathering
> found a new partition spec = dt=20110805
> 2011-08-15 14:54:54,767 INFO
> org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 1 rows
> 2011-08-15 14:54:54,767 INFO ExecMapper: ExecMapper: processing 1
> rows: used memory = 39953640
> 2011-08-15 14:54:54,768 INFO
> org.apache.hadoop.hive.ql.exec.MapOperator: 1 finished. closing...
> 2011-08-15 14:54:54,768 INFO
> org.apache.hadoop.hive.ql.exec.MapOperator: 1 forwarded 2 rows
> 2011-08-15 14:54:54,768 INFO
> org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0
> 2011-08-15 14:54:54,768 INFO
> org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 finished.
> closing...
> 2011-08-15 14:54:54,768 INFO
> org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarded 2 rows
> 2011-08-15 14:54:54,772 ERROR
> org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher: Error during
> JDBC connection to
> jdbc:postgresql://localhost/hive_statsdb?createDatabaseIfNotExist=true;user=hive;password=pwd.
> java.lang.ClassNotFoundException: org.postgresql.Driver
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>        at java.lang.Class.forName0(Native Method)
>        at java.lang.Class.forName(Class.java:169)
>        at org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher.connect(JDBCStatsPublisher.java:55)
>        at org.apache.hadoop.hive.ql.exec.TableScanOperator.publishStats(TableScanOperator.java:202)
>        at org.apache.hadoop.hive.ql.exec.TableScanOperator.closeOp(TableScanOperator.java:164)
>        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
>        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
>        at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
>        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
>        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> 2011-08-15 14:54:54,774 INFO
> org.apache.hadoop.hive.ql.exec.TableScanOperator: StatsPublishing
> error: cannot connect to database.
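
The trace above shows Class.forName failing inside the map task (org.apache.hadoop.mapred.Child), which suggests the driver jar may be visible to the Hive client but not on the task's classpath. As a rough sketch only, the same lookup can be reproduced outside Hive to check whether a given classpath resolves the driver. The class name DriverCheck and its method isLoadable are hypothetical helpers for illustration, not part of Hive or Hadoop.

```java
// Hypothetical helper, not part of Hive: reports whether a class name can be
// resolved on the current classpath. This mirrors the Class.forName call that
// appears under JDBCStatsPublisher.connect in the stack trace.
public class DriverCheck {

    // Returns true if the named class can be loaded, false if it is missing.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the driver configured in hive.stats.jdbcdriver.
        String driver = args.length > 0 ? args[0] : "org.postgresql.Driver";
        System.out.println(driver + " loadable: " + isLoadable(driver));
    }
}
```

Running it with only the JDK on the classpath, and then again with postgresql-9.0-801.jdbc4.jar added via -cp, should show whether the jar is what makes the class resolvable. It does not say anything about whether Hive supports PostgreSQL as a stats backend.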