HBase >> mail # user >> HBase as a transformation engine


Re: HBase as a transformation engine
Hi,

We have done this kind of thing using HBase 0.92.1 + Pig, but we
finally had to limit the size of the tables and move the biggest
data to HDFS. Loading data directly from HBase is much slower than
loading it from HDFS, and doing it with M/R overloads the HBase
region servers, because several map tasks scan table regions at the
same time. So the bigger your tables are, the higher the load is
(Pig usually creates 1 map per region; I don't know about Hive).
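
To make the load argument concrete, here is a rough back-of-the-envelope
sketch in Python. It is a simplified model, not a measurement: it assumes
one map task per region (as Pig usually does) and that each task scans
exactly one region. The function name and the inputs `regions_per_server`
and `map_slots` are hypothetical and are not HBase or Pig settings.

```python
def peak_scans_per_server(regions_per_server, map_slots):
    """Upper bound on simultaneous region scans hitting each region server.

    Simplified model: one map task per region, each task scans exactly one
    region, and at most `map_slots` tasks run at the same time cluster-wide.
    """
    return {
        server: min(region_count, map_slots)
        for server, region_count in regions_per_server.items()
    }


# Hypothetical cluster: the same table before and after it grows and splits.
small_table = {"rs1": 3, "rs2": 2}
big_table = {"rs1": 30, "rs2": 25}

print(peak_scans_per_server(small_table, map_slots=20))
print(peak_scans_per_server(big_table, map_slots=20))
```

In this model the scan load per server grows with the region count until
the number of map slots caps it, which matches what we saw: bigger tables
mean more concurrent scans on the same region servers.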

This may not be an issue if your HBase cluster is dedicated to this
kind of job, but if you also have to ensure a good random read
latency at the same time, forget it.

Regards,

On 11/11/2013 13:10, JC wrote:
> We are looking to use HBase as a transformation engine. In other words, take
> data already loaded into HBase, run some large calculation/aggregation on
> that data and then load it back into an RDBMS for our BI analytic tools to
> use. I was curious about what the community's experience is with this and if
> there are some best practices. One idea we are kicking around is using
> MapReduce 2 and YARN and writing files to HDFS to be loaded into the RDBMS.
> Not sure what all the pieces needed for the complete application are, though.
>
> Thanks in advance for your help,
> JC
>
>
>
> --
> View this message in context: http://apache-hbase.679495.n3.nabble.com/HBase-as-a-transformation-engine-tp4052670.html
> Sent from the HBase User mailing list archive at Nabble.com.
>