HBase >> mail # user >> About performance issue of Hive/HBase vs Hive/HDFS


Re: About performance issue of Hive/HBase vs Hive/HDFS
Ok... Just my random thoughts...
HBase definitely adds overhead that doesn't exist when Hive reads its own tables directly from HDFS. But 4 to 5 times slower? I'd question how you tuned your HBase.

Having said that, I would imagine there are still some potential improvements that could be made in Hive to work better with HBase.
Also, why LZO and not Snappy?
Sent from a remote device. Please excuse any typos...

Mike Segel

On Dec 21, 2011, at 1:14 AM, Bruce Bian <[EMAIL PROTECTED]> wrote:

> Hi there,
> After I read these two posts on the mailing list
> http://search-hadoop.com/m/nVaw59rFlY1/Performance+between+Hive+queries+vs.+Hive+over+HBase+queries&subj=Performance+between+Hive+queries+vs+Hive+over+HBase+queries
> http://search-hadoop.com/m/X1rzQ1QDSaf2/Hive%252BHBase+performance+is+much+poorer+than+Hive%252BHDFS&subj=Hive+HBase+performance+is+much+poorer+than+Hive+HDFS
> It seems a 4~5X performance degradation of Hive/HBase vs Hive/HDFS is
> expected because HBase adds another layer on top of HDFS. If that is the
> issue here, is it possible to bypass the HBase layer and read the HFiles
> stored on HDFS directly?
> Another possible cause is that, for the same table, the storage is
> much larger in HBase (around 5X in my test case, both uncompressed) than in
> Hive, as HBase stores a separate KV pair for each column, which causes the
> key to be repeated several times. But after I compressed the HBase table
> using LZO (now nearly the same size as the uncompressed Hive table), there
> was no performance gain for queries like select count(*) from xtable;
> Is anyone working on this? I'm not sure whether I should post this to
> Hive's mailing list, but there seems to be no progress on issues like
> https://issues.apache.org/jira/browse/HIVE-1231
>
> Regards,
> Bruce
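
The roughly 5X storage difference Bruce describes can be illustrated with some back-of-the-envelope arithmetic. The sketch below is only an estimate, assuming the classic HBase KeyValue layout (key length, value length, row length, row key, family length, family, qualifier, timestamp, key type, value) and ignoring HFile block indexes and any encoding or compression. The function names and the sample row are hypothetical, made up purely for illustration.

```python
# Fixed bytes stored with every cell in the classic KeyValue layout:
# key length (4) + value length (4) + row length (2) + family length (1)
# + timestamp (8) + key type (1). Row key, family, and qualifier are
# repeated in full for every cell on top of this.
KV_FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def hbase_row_bytes(row_key: bytes, family: bytes, columns: dict) -> int:
    """Estimate bytes used by one logical row stored as one KeyValue per column."""
    total = 0
    for qualifier, value in columns.items():
        total += (KV_FIXED_OVERHEAD + len(row_key) + len(family)
                  + len(qualifier) + len(value))
    return total


def delimited_row_bytes(row_key: bytes, columns: dict) -> int:
    """Estimate bytes used by the same row as one delimited text line."""
    fields = [row_key] + list(columns.values())
    # One separator between fields plus a trailing newline: len(fields) bytes.
    return sum(len(f) for f in fields) + len(fields)


if __name__ == "__main__":
    # Hypothetical sample row, not taken from the thread.
    row_key = b"user00000001"
    family = b"cf"
    columns = {b"name": b"alice", b"age": b"30", b"city": b"paris"}

    hbase = hbase_row_bytes(row_key, family, columns)
    text = delimited_row_bytes(row_key, columns)
    print(f"HBase cells: {hbase} bytes, delimited text: {text} bytes, "
          f"ratio: {hbase / text:.1f}x")
```

With short values and several columns per row, the repeated row key, family, qualifier, and timestamp dominate the cell size, which is consistent with a multi-fold blow-up before compression. Shorter row keys, family names, and qualifiers shrink that ratio.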