Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
HBase >> mail # user >> Question on the number of column families


Copy link to this message
-
RE: Question on the number of column families
Hi Ted,

Now I finished reading the filtering section and the source code of TestJoinedScanners(0.94).

Facts learned:

- While scanning, an entire row will be read even for a rowkey filtering. (Since a rowkey is not a physically separate entity and stored in KeyValue object, it's natural. Am I right?)
- The key API for the essential column family support is setLoadColumnFamiliesOnDemand().

So, now I have questions:

On rowkey filtering, which column family's KeyValue object is read?
If HBase just reads a KeyValue from a randomly selected (or just the first) column family, how is setLoadColumnFamiliesOnDemand() affected? Can HBase select a smaller column family intelligently?

If setLoadColumnFamiliesOnDemand() can be applied to a rowkey filtering, a 'dummy' column family can be used to minimize the scan cost.

Thank you.
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB