RE: Disable timestamp in HBase Table a.k.a Disable Versioning in HBase Table
As HBASE-4676 is not available as of now, maybe you can check the other encoders, DiffKeyDeltaEncoder or FastDiffDeltaEncoder.
Pls go through the javadoc of these and see what they do apart from compressing the timestamp parts. They do other nice stuff too, which will make your data smaller on disk.
When HBASE-4676 comes you can try using that, as I think it would be closer to your need.
Also pls make sure to set the timestamp as 0L in all your Puts. If you don't do that, HBase will set the current time in millis as the timestamp for each Put.
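A toy illustration of why a constant timestamp helps the diff-style encoders. This is plain Java, not HBase code: the cost model (first timestamp stored in full, each later one stored as a variable-length delta from the previous cell) is an assumption made only for illustration, and the class and method names are hypothetical. The javadoc of the encoders describes what they really do.

```java
// Toy cost model for timestamp storage inside one block.
// NOT the HBase implementation; names and the encoding scheme are assumptions.
class TimestampDeltaSketch {

    // Minimal number of bytes needed to hold a non-negative value (0 needs none).
    static int bytesNeeded(long v) {
        return (64 - Long.numberOfLeadingZeros(v) + 7) / 8;
    }

    // First timestamp is stored in full (8 bytes); every later timestamp is
    // stored as the absolute delta from the previous cell's timestamp.
    static int encodedSize(long[] timestamps) {
        if (timestamps.length == 0) {
            return 0;
        }
        int size = 8;
        for (int i = 1; i < timestamps.length; i++) {
            size += bytesNeeded(Math.abs(timestamps[i] - timestamps[i - 1]));
        }
        return size;
    }

    // Unencoded cost: 8 bytes of timestamp in every KeyValue.
    static int rawSize(long[] timestamps) {
        return 8 * timestamps.length;
    }

    public static void main(String[] args) {
        // All Puts written with an explicit timestamp of 0L.
        long[] constant = {0L, 0L, 0L, 0L};
        // Puts left to pick up the current time in millis.
        long[] varying = {1338350000000L, 1338350001500L, 1338350001503L, 1338350071503L};

        System.out.println("raw:      " + rawSize(constant));
        System.out.println("constant: " + encodedSize(constant));
        System.out.println("varying:  " + encodedSize(varying));
    }
}
```

Under this model, four cells that all carry 0L cost 8 bytes of timestamp in total, against 32 bytes unencoded; the same four cells with differing current-time timestamps cost 14 bytes. The saving from a constant timestamp grows with the number of cells in the block.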
From: Matt Corgan [[EMAIL PROTECTED]]
Sent: Wednesday, May 30, 2012 5:16 AM
To: [EMAIL PROTECTED]
Subject: Re: Disable timestamp in HBase Table a.k.a Disable Versioning in HBase Table
> Is this feature going to be part of any future release of HBase?
I couldn't get it finished in time for 0.94, but I think it's very likely
to be in 0.96, possibly with a backport to 0.94. Scan speed should improve
if I have time to optimize the cell comparators and collators.
On Tue, May 29, 2012 at 4:29 PM, anil gupta <[EMAIL PROTECTED]> wrote:
> Hi All,
> Sorry for the late reply; I got stuck on another task at work on Friday,
> and skimming through HBASE-4676 took me a while.
> HBASE-6093 seems to be very close to my suggestion. The only difference is
> that Matt mentioned in the description that it can only be used when all
> inserts are type=Put. Is the aforementioned restriction due to HFileV2? I
> think deleting an entire row wouldn't be a problem, right? I have very
> little knowledge about HFileV2. I will try to read about HFileV2 soon.
> HBASE-4676 seems really cool. IMHO, the current issue is that writes and
> scans are slow (scans slower by ~2x compared to NONE, if we assume that
> Trie compresses by ~2-3x) and, as per the JIRA, if the value/key ratio is
> big then the trie won't have any impact. Is this feature going to be part
> of any future release of HBase? Awesome stuff, Matt.
> @Anoop: You meant that I should use the feature in HBASE-4676 and pass the
> timestamp as 0L in each Put. Right?
> Thanks all for your valuable time and inputs.
> On Thu, May 24, 2012 at 11:22 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
> > Hi Anil,
> > I created HBASE-6093
> > <https://issues.apache.org/jira/browse/HBASE-6093> with an idea that
> > could solve this problem. It could be a simple implementation for
> > simple workloads, but gets harder to support with TTLs,
> > maxVersion > 1, Deletes, etc. Maybe it can only be enabled
> > if the other ColumnFamily settings are compatible.
> > Matt
> > On Thu, May 24, 2012 at 9:37 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> > > What Anoop said is in 0.94.0
> > >
> > > For trunk, HBASE-4676 provides trie data block encoding.
> > > It suits write-once read-many use case very well.
> > >
> > > Cheers
> > >
> > > On Thu, May 24, 2012 at 5:57 PM, Anoop Sam John <[EMAIL PROTECTED]>
> > > wrote:
> > >
> > > > Hi Anil,
> > > > There is no way you can avoid the timestamp with KVs. In that
> > > > case, can you think of using data block encoding? You can see
> > > > FastDiffDeltaEncoder and DiffKeyDeltaEncoder. These include a way of
> > > > avoiding writing the 8 bytes into each KV for the timestamp. Still,
> > > > some bytes will be written, and this will be done at the block
> > > > level. Also pls note that these encoders do much more than the
> > > > timestamp space optimization. Also you need to make sure to pass
> > > > some timestamp in Puts. Maybe better to make it 0L. Else on the RS
> > > > side HBase will assign the current time as the timestamp. Hope it
> > > > will be clearer when you read the javadoc for these classes.
> > > >
> > > > The one you are talking about, having a feature to fully avoid the
> > > > timestamp, is a topic to discuss.
> > > >
> > > > Hope I make it clear to you