Custom versioning best practices
David Koch 2012-11-22, 13:55
I was thinking of using versions with custom timestamps to store the
evolution of a column value - as opposed to creating several (time_t,
value_at_time_t) qualifier-value pairs. The value to be stored is a single
integer. Fast ad-hoc retrieval of multiple versions based on a row key +
filter (i.e. through a web service) is important; the number of row keys
will be between 10^6 and 10^9.
a) If the number of versions (timestamps) is moderate, can I expect
read/filtering performance to be better than when using multiple
qualifier-value pairs?
b) For a larger number of versions, say 365, what precautions, if any,
should I take with respect to the HBase/table setup?
I looked around a bit and found the following:
The documentation mentions that the maximum number of versions should
not be too high ("in the hundreds"). The O'Reilly HBase book (1st edition,
page 384), on the other hand, mentions that Facebook use(d) versions to
store inbox messages in order. Clearly, the number of messages may grow
quite large (>> 100). Is the documentation's advice still valid with more
recent versions of HBase?
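
To make the versioned layout concrete, here is a minimal sketch of how one
cell with custom timestamps behaves. It is a toy in-memory model, not HBase
client code: the class name VersionedCell and its methods are hypothetical.
It assumes the semantics HBase documents for versions, namely that versions
come back newest first, that a time range is half-open [min, max), and that
versions beyond the configured maximum are dropped oldest first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

/** Toy in-memory model of one versioned cell (hypothetical, not the HBase API). */
public class VersionedCell {
    private final int maxVersions;
    // Keys sorted newest timestamp first, mirroring the order versions are returned in.
    private final TreeMap<Long, Integer> versions = new TreeMap<>(Comparator.reverseOrder());

    public VersionedCell(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    /** Store a value under a caller-supplied timestamp; evict the oldest versions beyond the cap. */
    public void put(long timestamp, int value) {
        versions.put(timestamp, value);
        while (versions.size() > maxVersions) {
            versions.pollLastEntry(); // last entry in reverse order = oldest timestamp
        }
    }

    /** Values with minStamp <= timestamp < maxStamp, newest first, at most limit entries. */
    public List<Integer> get(long minStamp, long maxStamp, int limit) {
        List<Integer> out = new ArrayList<>();
        // With a reversed comparator the larger timestamp is the "from" bound.
        for (Integer v : versions.subMap(maxStamp, false, minStamp, true).values()) {
            if (out.size() == limit) {
                break;
            }
            out.add(v);
        }
        return out;
    }

    public int size() {
        return versions.size();
    }

    public static void main(String[] args) {
        VersionedCell cell = new VersionedCell(365);
        for (long day = 1; day <= 400; day++) {
            cell.put(day, (int) (day * 10)); // one integer value per day
        }
        System.out.println(cell.size());           // 365: days 1..35 were evicted
        System.out.println(cell.get(390, 395, 3)); // [3940, 3930, 3920]
    }
}
```

With the multiple-qualifier layout the same history would instead be one
(time_t, value_at_time_t) pair per column, and the time filtering would be
done on qualifiers rather than on cell timestamps.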
Michael Segel 2012-11-22, 16:11
David Koch 2012-11-22, 20:47
anil gupta 2012-11-22, 21:12