Thank you all.
- Having 130 column families is too many. Don't do that.
- While scanning, the entire row is read for filtering unless the HBASE-5416 technique is applied, which loads only the relevant column family. (It seems one still cannot load just the single column needed while scanning.)
- A large row size is probably not a good idea.
For now it seems appropriate to follow the one-column solution that Alok Singh suggested, partly because there is currently no reasonable grouping of the fields.
Here is my current thinking:
- One column family, one column. The field name will be included in the rowkey.
- Eliminate filtering altogether (in most cases) by properly ordering the rowkey components.
- If filtering is absolutely needed, add a 'dummy' column family and apply the HBASE-5416 technique to minimize disk reads, since a field value can be large (~5 MB). (This dummy column idea may not be right; I'm not sure, since I have not yet read the filtering section of the book I'm reading.)
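
To make the rowkey idea concrete, here is a minimal sketch in plain Java, with no HBase client involved. A TreeMap stands in for HBase's sorted rowkey order, which is only a fair stand-in for ASCII keys, since HBase compares raw bytes. The class name RowKeySketch, the "entity id first, field name second" ordering, and the NUL separator are all my own assumptions for illustration, not something settled in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class RowKeySketch {
    // Assumed separator: must never occur inside an entity id or a field name.
    static final char SEP = '\u0000';

    // Composite rowkey: entity id first, then field name (an assumed ordering).
    static String rowKey(String entityId, String fieldName) {
        return entityId + SEP + fieldName;
    }

    // Reads all fields of one entity as a key range [start, stop), with no filter.
    static List<String> fieldsOf(TreeMap<String, byte[]> table, String entityId) {
        String start = entityId + SEP;
        String stop = entityId + (char) (SEP + 1); // first key past this entity's prefix
        List<String> fields = new ArrayList<>();
        for (String key : table.subMap(start, true, stop, false).keySet()) {
            fields.add(key.substring(start.length()));
        }
        return fields;
    }

    public static void main(String[] args) {
        TreeMap<String, byte[]> table = new TreeMap<>();
        table.put(rowKey("doc1", "title"), new byte[0]);
        table.put(rowKey("doc1", "body"), new byte[0]);
        table.put(rowKey("doc10", "body"), new byte[0]);
        table.put(rowKey("doc2", "title"), new byte[0]);
        System.out.println(fieldsOf(table, "doc1")); // prints [body, title]
    }
}
```

With this ordering, "give me every field of doc1" is a start/stop range rather than a filter, and the separator keeps doc10 out of doc1's range. If the more common query were "this field across all entities", the field name would have to come first instead.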
Hope that I am not missing or misunderstanding something...
(I'm a total newbie. I only started reading an HBase book last week...)