Hive, mail # user - Setting up stats database


Re: Setting up stats database
wd 2011-08-15, 08:06
With HBase I get "HBase Publisher/Aggregator classes cannot be loaded."

The publisher/aggregator apparently needs to be configured for HBase, so
the only remaining option is to use MySQL.

Will the stats database help optimize Hive queries? I am trying to decide
whether it is worth setting up MySQL for this.

On Mon, Aug 15, 2011 at 3:17 PM, wd <[EMAIL PROTECTED]> wrote:
> Oh, I found that Hive only supports MySQL and HBase. I'll try HBase.
>
> On Mon, Aug 15, 2011 at 3:09 PM, wd <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>> I'm trying to use PostgreSQL as the stats database, and I made the
>> following settings in hive-site.xml:
>>
>>
>> <property>
>>  <name>hive.stats.dbclass</name>
>>  <value>jdbc:postgresql</value>
>>  <description>The default database that stores temporary hive
>> statistics.</description>
>> </property>
>>
>> <property>
>>  <name>hive.stats.autogather</name>
>>  <value>true</value>
>>  <description>A flag to gather statistics automatically during the
>> INSERT OVERWRITE command.</description>
>> </property>
>>
>> <property>
>>  <name>hive.stats.jdbcdriver</name>
>>  <value>org.postgresql.Driver</value>
>>  <description>The JDBC driver for the database that stores temporary
>> hive statistics.</description>
>> </property>
>>
>> <property>
>>  <name>hive.stats.dbconnectionstring</name>
>>  <value>jdbc:postgresql://localhost/hive_statsdb?createDatabaseIfNotExist=true;user=hive;password=pwd</value>
>>  <description>The default connection string for the database that
>> stores temporary hive statistics.</description>
>> </property>
>>
>> I use PostgreSQL as the Hive metadata database, so there is a
>> postgresql-9.0-801.jdbc4.jar file in lib.
>>
>> After running 'analyze table t1 partition(dt) compute statistics;' in the
>> Hive CLI, some stats info is printed in the CLI, but nothing is written
>> to the db. I also found the following errors:
>>
>> 2011-08-15 14:54:54,767 INFO
>> org.apache.hadoop.hive.ql.exec.TableScanOperator: Stats Gathering
>> found a new partition spec = dt=20110805
>> 2011-08-15 14:54:54,767 INFO
>> org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 1 rows
>> 2011-08-15 14:54:54,767 INFO ExecMapper: ExecMapper: processing 1
>> rows: used memory = 39953640
>> 2011-08-15 14:54:54,768 INFO
>> org.apache.hadoop.hive.ql.exec.MapOperator: 1 finished. closing...
>> 2011-08-15 14:54:54,768 INFO
>> org.apache.hadoop.hive.ql.exec.MapOperator: 1 forwarded 2 rows
>> 2011-08-15 14:54:54,768 INFO
>> org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0
>> 2011-08-15 14:54:54,768 INFO
>> org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 finished.
>> closing...
>> 2011-08-15 14:54:54,768 INFO
>> org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarded 2 rows
>> 2011-08-15 14:54:54,772 ERROR
>> org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher: Error during
>> JDBC connection to
>> jdbc:postgresql://localhost/hive_statsdb?createDatabaseIfNotExist=true;user=hive;password=pwd.
>> java.lang.ClassNotFoundException: org.postgresql.Driver
>>        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>        at java.security.AccessController.doPrivileged(Native Method)
>>        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>>        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>>        at java.lang.Class.forName0(Native Method)
>>        at java.lang.Class.forName(Class.java:169)
>>        at org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher.connect(JDBCStatsPublisher.java:55)
>>        at org.apache.hadoop.hive.ql.exec.TableScanOperator.publishStats(TableScanOperator.java:202)
>>        at org.apache.hadoop.hive.ql.exec.TableScanOperator.closeOp(TableScanOperator.java:164)
>>        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
>>        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
>>        at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
>>        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
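
The trace above shows Class.forName failing under ExecMapper/MapRunner, which suggests the lookup happens inside the map task's JVM rather than in the Hive CLI, so a jar that sits in Hive's lib directory may still be missing from the task's classpath. A minimal sketch for checking whether a driver class is loadable in a given JVM follows. The class name DriverCheck and the method isLoadable are hypothetical helpers for illustration and are not part of Hive.

```java
// Hypothetical helper, not part of Hive: reports whether a class
// (for example a JDBC driver) can be loaded in the current JVM.
class DriverCheck {

    // Returns true if the named class can be found and initialized
    // by the class loader that loaded DriverCheck.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // ClassNotFoundException: no jar on the classpath provides the class.
            // LinkageError: the class was found but could not be linked or initialized.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the driver named in hive.stats.jdbcdriver above.
        String driver = args.length > 0 ? args[0] : "org.postgresql.Driver";
        System.out.println(driver
                + (isLoadable(driver) ? " is loadable" : " is NOT on the classpath"));
    }
}
```

Running this with the same classpath that a map task sees, rather than the CLI's classpath, would show whether the PostgreSQL jar is actually visible where JDBCStatsPublisher tries to connect.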