HBase >> mail # user >> Hive Hbase 0.94 ClassNotFoundException com.google.protobuf.Message


Re: Hive Hbase 0.94 ClassNotFoundException com.google.protobuf.Message
On Thu, Oct 25, 2012 at 8:31 AM, Nick maillard
<[EMAIL PROTECTED]> wrote:
> Hi Jean-Daniel
>
> OK, I'll set it in the env, thanks for the advice.
> Are there other libs I might need to add?

The usual client libs. It doesn't seem like we documented them
anywhere, but it's pretty much what you have now.

> Could I just tell Hive to use its lib directory or HBase's lib directory in its
> classpath in some way?

That's a question for the hive ML.

> I could just set it in the bashrc but that's not very elegant.

What I really meant is that you should set HIVE_AUX_JARS_PATH in hive-env.sh.
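
As a rough illustration of that suggestion, here is a minimal sketch of what a hive-env.sh fragment could look like. The HBASE_HOME default, the join_jars helper, and the jar patterns are all assumptions made for the example; only HIVE_AUX_JARS_PATH and the missing protobuf class come from this thread. Check which jars your own install actually ships.

```shell
# Sketch for conf/hive-env.sh; paths and jar names below are assumptions.
HBASE_HOME="${HBASE_HOME:-/usr/lib/hbase}"   # hypothetical install location

# join_jars DIR PATTERN...: print a comma-separated list of the jars in DIR
# that match any of the given glob patterns (hypothetical helper).
join_jars() {
  dir="$1"; shift
  list=""
  for pattern in "$@"; do
    for jar in "$dir"/$pattern; do
      [ -f "$jar" ] || continue          # skip patterns that matched nothing
      list="${list:+$list,}$jar"
    done
  done
  printf '%s\n' "$list"
}

# protobuf-java provides com.google.protobuf.Message; the other patterns are guesses.
export HIVE_AUX_JARS_PATH="$(join_jars "$HBASE_HOME/lib" 'protobuf-java-*.jar' 'zookeeper-*.jar' 'guava-*.jar')"
```

Running `echo "$HIVE_AUX_JARS_PATH"` after sourcing the file shows which jars were picked up. An empty value means none of the patterns matched under the assumed directory.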

>
> Another thing: I am testing my 3-machine Hadoop cluster.
> I have queried 'select * from myTestTable', which has 1719428 entries.
> The 7 map tasks and 1 reducer took almost 5 minutes to compute. Am I right to
> think that is a little slow?

You have 1-2 minutes of overhead in there because you are using
MapReduce. Beyond that, one should usually set
hbase.client.scanner.caching to a better value than 1. It's a
client-side setting, so Hive needs to have it. But everything will seem
slow when using MR on such a small dataset; a single client running a
scan would be faster in this case.
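
To make the caching point concrete, here is a back-of-the-envelope sketch of how many scanner round trips a full scan of the table in question would need at different caching values. It assumes each scanner call returns at most `caching` rows and ignores per-region and open/close calls, so treat the numbers as rough. The round_trips helper and the values 100 and 1000 are illustrative only, not recommendations.

```shell
# Rough estimate only: counts scanner round trips as ceil(rows / caching).
rows=1719428            # row count quoted in the question

round_trips() {         # round_trips ROWS CACHING (hypothetical helper)
  echo $(( ($1 + $2 - 1) / $2 ))
}

for caching in 1 100 1000; do   # 100 and 1000 are made-up example values
  echo "caching=$caching -> $(round_trips "$rows" "$caching") round trips"
done
```

With a caching value of 1, the estimate is one round trip per row. One way to try a different value for a single session might be `hive --hiveconf hbase.client.scanner.caching=1000`, assuming your Hive version passes that property through to the HBase client.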

> How could I make this go faster: more map tasks, more nodes?

Is select count(*) really the use case you want to optimize? Have you
read this? http://hbase.apache.org/book.html#performance

>
> True, I would not usually scan a whole table, but I could easily have queries
> that run MR over a set of this size.
>