HBase >> mail # dev >> Scan performance on a big table as combination of multiple logic tables


Re: Scan performance on a big table as combination of multiple logic tables
Hi Thomas,

The issue with combining multiple tables into different CFs of one
table is that the tables get tied together for flush/compact
operations. If the workloads differ significantly, you can make
one or the other badly inefficient. See HBASE-3149.
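
To make the flush coupling concrete, here is a toy model in plain Java.
It is not HBase code: the class, the method names, the family names and
the byte sizes are all made up for illustration. It only assumes that a
flush is triggered by the combined in-memory size of all families and
then writes out every family that holds data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: not HBase code. All names and sizes are hypothetical.
final class SharedFlushModel {
    private final long flushThresholdBytes;
    private final Map<String, Long> memstoreBytes = new LinkedHashMap<>();
    private final Map<String, Integer> fileCount = new LinkedHashMap<>();
    private final Map<String, Long> flushedBytes = new LinkedHashMap<>();

    SharedFlushModel(long flushThresholdBytes, String... families) {
        this.flushThresholdBytes = flushThresholdBytes;
        for (String family : families) {
            memstoreBytes.put(family, 0L);
            fileCount.put(family, 0);
            flushedBytes.put(family, 0L);
        }
    }

    // Buffer a write, then flush every family once the combined size
    // of all buffers reaches the shared threshold.
    void write(String family, long bytes) {
        memstoreBytes.merge(family, bytes, Long::sum);
        long total = 0;
        for (long size : memstoreBytes.values()) {
            total += size;
        }
        if (total >= flushThresholdBytes) {
            flushAll();
        }
    }

    // Assumed coupling: each family with buffered data gets its own
    // file on every flush, no matter how little data it holds.
    private void flushAll() {
        for (Map.Entry<String, Long> entry : memstoreBytes.entrySet()) {
            long size = entry.getValue();
            if (size == 0) {
                continue;
            }
            fileCount.merge(entry.getKey(), 1, Integer::sum);
            flushedBytes.merge(entry.getKey(), size, Long::sum);
            entry.setValue(0L);
        }
    }

    int fileCount(String family) {
        return fileCount.get(family);
    }

    long averageFileBytes(String family) {
        int files = fileCount.get(family);
        return files == 0 ? 0 : flushedBytes.get(family) / files;
    }

    // A write-heavy family and a rarely written one sharing one trigger.
    static SharedFlushModel runExample() {
        SharedFlushModel region = new SharedFlushModel(64_000, "busy", "quiet");
        for (int i = 0; i < 1000; i++) {
            region.write("busy", 1_000);
            if (i % 100 == 0) {
                region.write("quiet", 10);
            }
        }
        return region;
    }

    public static void main(String[] args) {
        SharedFlushModel region = runExample();
        for (String family : new String[] {"busy", "quiet"}) {
            System.out.println(family + ": files=" + region.fileCount(family)
                    + ", avg bytes/file=" + region.averageFileBytes(family));
        }
    }
}
```

In this run the "busy" family ends up with 15 files of 64,000 bytes
each, while the "quiet" family is dragged along into 10 files of only
10 bytes each. Many tiny files like that are what later compactions
have to clean up, which is the kind of inefficiency meant above.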

-Todd

On Wed, Feb 15, 2012 at 1:57 PM, Pan, Thomas <[EMAIL PROTECTED]> wrote:
>
> Since HBase is tailored to handle a single table very well, we are thinking of putting multiple tables into one big table, each on its own set of column families. Our use case is a full table scan with single column value filters. Since records from different "logical tables" live in different column families, could we speed up the scan by first checking the column family referenced by these single column value filters, before going through all the underlying key-value pairs? It would be great if the HBase code already worked that way.
>
>
> $0.02,
> Thomas
>

--
Todd Lipcon
Software Engineer, Cloudera