Re: Custom versioning best practices
Michael Segel 2012-11-22, 16:11
IMHO, the best practice is not to do this.
It's an abuse of versioning. If you really want to store temporal data, make it part of the column name.
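To make "part of the column name" concrete, here is a minimal sketch of one way to encode a timestamp into a qualifier. It is plain Python with no HBase client involved, and the helper names (ts_qualifier, qualifier_ts) are made up for illustration. It relies on HBase keeping the columns of a row sorted by qualifier bytes, so a reversed, big-endian timestamp puts the newest value first. The dict below only stands in for a row; it is not a real table.

```python
import struct

# Largest signed 64-bit value; assumed upper bound for millisecond timestamps.
MAX_TS = 2**63 - 1


def ts_qualifier(ts_millis: int, newest_first: bool = True) -> bytes:
    """Encode a timestamp as a fixed-width, 8-byte big-endian qualifier.

    With newest_first=True the value is reversed (MAX_TS - ts) so that
    byte-wise ascending order corresponds to descending time.
    """
    if not 0 <= ts_millis <= MAX_TS:
        raise ValueError("timestamp out of range")
    key = MAX_TS - ts_millis if newest_first else ts_millis
    return struct.pack(">Q", key)


def qualifier_ts(qualifier: bytes, newest_first: bool = True) -> int:
    """Decode a qualifier produced by ts_qualifier back to a timestamp."""
    (key,) = struct.unpack(">Q", qualifier)
    return MAX_TS - key if newest_first else key


# Stand-in for one row: qualifier -> integer value (hypothetical data).
row = {ts_qualifier(t): v for t, v in [(1000, 7), (3000, 9), (2000, 8)]}

# Sorting the qualifiers byte-wise mimics the order in which the columns
# of a row are stored and scanned.
ordered = [(qualifier_ts(q), row[q]) for q in sorted(row)]
```

Because the qualifiers are fixed-width and ordered by time, a time range maps to a contiguous qualifier range within the row, so a range-style column filter can pick out the values you want.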
On Nov 22, 2012, at 7:55 AM, David Koch <[EMAIL PROTECTED]> wrote:
> I was thinking of using versions with custom timestamps to store the
> evolution of a column value - as opposed to creating several (time_t,
> value_at_time_t) qualifier-value pairs. The value to be stored is a single
> integer. Fast ad-hoc retrieval of multiple versions based on a row key +
> filter (i.e. through a web service) is important; the number of row keys
> will be between 10^6 and 10^9.
> a) If the number of versions (timestamps) is moderate, can I expect
> read/filtering performance to be better than when using multiple
> qualifier/value pairs?
> b) For a larger number of versions, say 365, what precautions, if any,
> should I take with respect to the HBase/table setup?
> I looked around a bit and found the following:
> The documentation [1] mentions that the maximum number of versions should
> not be too high ("in the hundreds"). The O'Reilly HBase book [2], on the
> other hand, mentions that Facebook use(d) versions to store inbox messages
> in order. Clearly, the number of messages may grow quite large (>> 100). Is
> [1] still valid with more recent versions of HBase?
> Thank you,
> [1] http://hbase.apache.org/book/schema.versions.html
> [2] 1st edition, page 384