HBase >> mail # user >> Question on the number of column families

RE: Question on the number of column families
Hi Ted,

I have now finished reading the filtering section and the source code of TestJoinedScanners (0.94).

Facts learned:

- While scanning, the entire row is read even when the filter looks only at the rowkey. (A rowkey is not a physically separate entity; it is stored inside each KeyValue object, so this seems natural. Am I right?)
- The key API for the essential column family support is setLoadColumnFamiliesOnDemand().

So, now I have questions:

When filtering on the rowkey, which column family's KeyValue objects are read?
If HBase simply reads a KeyValue from an arbitrary (or just the first) column family, how does that interact with setLoadColumnFamiliesOnDemand()? Can HBase intelligently pick a smaller column family?

If setLoadColumnFamiliesOnDemand() can be applied to rowkey filtering, a small 'dummy' column family could be used to minimize the scan cost.
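
To make the idea concrete, here is a toy model in plain Java. It is not HBase code and uses no HBase classes; the class name OnDemandLoadingSketch, the 'dummy' and 'data' families, and the cell counts are all made up for illustration. It assumes the scanner reads only the "essential" families for every row and fetches the remaining families only for rows that pass the filter. Whether HBase actually treats a rowkey filter this way is exactly what I am asking. As far as I understand, each filter reports its essential families through isFamilyEssential(), so the answer probably depends on the filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.Predicate;

// Toy model only: NOT HBase code. All names here are hypothetical.
class OnDemandLoadingSketch {

    // family name -> (rowkey -> number of cells stored for that row in that family)
    final Map<String, Map<String, Integer>> families = new TreeMap<>();

    // Number of cells touched by the most recent scan() call.
    long cellsRead = 0;

    void put(String family, String rowKey, int cells) {
        families.computeIfAbsent(family, f -> new TreeMap<>())
                .merge(rowKey, cells, Integer::sum);
    }

    private void readFamily(String family, String rowKey) {
        cellsRead += families.get(family).getOrDefault(rowKey, 0);
    }

    // Returns the matching rowkeys and records the read cost in cellsRead.
    // onDemand == false: every family is read for every row.
    // onDemand == true : essential families are read for every row, the
    //                    other families only for rows that pass the filter.
    List<String> scan(Predicate<String> rowFilter, Set<String> essential, boolean onDemand) {
        cellsRead = 0;
        Set<String> rowKeys = new TreeSet<>();
        for (Map<String, Integer> store : families.values()) {
            rowKeys.addAll(store.keySet());
        }
        List<String> matched = new ArrayList<>();
        for (String rowKey : rowKeys) {
            for (String family : families.keySet()) {
                if (!onDemand || essential.contains(family)) {
                    readFamily(family, rowKey);
                }
            }
            if (!rowFilter.test(rowKey)) {
                continue; // row rejected: non-essential families are never touched
            }
            if (onDemand) {
                for (String family : families.keySet()) {
                    if (!essential.contains(family)) {
                        readFamily(family, rowKey);
                    }
                }
            }
            matched.add(rowKey);
        }
        return matched;
    }

    public static void main(String[] args) {
        OnDemandLoadingSketch table = new OnDemandLoadingSketch();
        for (int i = 0; i < 10; i++) {
            table.put("dummy", "row" + i, 1);   // tiny family, one cell per row
            table.put("data", "row" + i, 100);  // large family, 100 cells per row
        }
        Predicate<String> filter = key -> key.equals("row3");

        table.scan(filter, Set.of("dummy"), false);
        System.out.println("cells read without on-demand loading: " + table.cellsRead);

        table.scan(filter, Set.of("dummy"), true);
        System.out.println("cells read with on-demand loading:    " + table.cellsRead);
    }
}
```

In this model, ten rows with one matching row cost 1010 cell reads when every family is loaded eagerly and 110 when only the 'dummy' family is read up front. That is the saving I am hoping for, if the real scanner can be made to behave like this for a rowkey filter.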

Thank you.