anil gupta 2012-05-24, 21:51
Anoop Sam John 2012-05-25, 00:57
Ted Yu 2012-05-25, 04:37
Matt Corgan 2012-05-25, 06:22
Re: Disable timestamp in HBase Table a.k.a Disable Versioning in HBase Table
anil gupta 2012-05-29, 23:29
Sorry for the late reply; I got stuck on another task at work on Friday, and
skimming through HBASE-4676 took me a while.
HBASE-6093 seems to be very close to my suggestion. The only difference is
that Matt mentioned in the description that it can only be used when all
inserts are type=Put. Is that restriction due to HFileV2? I think deleting
an entire row wouldn't be a problem, right? I have very little knowledge of
HFileV2; I will try to read about it soon.
HBASE-4676 seems really cool. IMHO, the issue currently is that writes and
scans are slow (scans are ~2x slower than with NONE, assuming the trie
compresses by ~2-3x) and, as per the JIRA, if the value/key ratio is large
then the trie won't have any impact. Is this feature going to be part of a
future release of HBase? Awesome stuff, Matt.
@Anoop: You meant that I should use the feature in HBASE-4676 and pass the
timestamp as 0L in each Put, right?
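To check my understanding of why a constant timestamp helps, here is a rough sketch of delta-encoding timestamps inside a block. This is only an illustration, not the actual on-disk format of FastDiffDeltaEncoder or DiffKeyDeltaEncoder; the class name, the zig-zag varint scheme and the 1000-cell block are all my own assumptions.

```java
public class TimestampDeltaSketch {

    // Bytes needed to store a signed long as a zig-zag varint (7 payload bits per byte).
    public static int varintSize(long value) {
        long zigzag = (value << 1) ^ (value >> 63);
        int size = 1;
        while ((zigzag & ~0x7FL) != 0) {
            zigzag >>>= 7;
            size++;
        }
        return size;
    }

    // First timestamp stored in full (8 bytes); every later one stored as a
    // varint delta from its predecessor. Hypothetical scheme, for illustration only.
    public static long encodedSize(long[] timestamps) {
        if (timestamps.length == 0) {
            return 0;
        }
        long total = Long.BYTES;
        for (int i = 1; i < timestamps.length; i++) {
            total += varintSize(timestamps[i] - timestamps[i - 1]);
        }
        return total;
    }

    public static void main(String[] args) {
        int cells = 1000;

        // Every Put carries the same explicit timestamp (0L).
        long[] constant = new long[cells];

        // Server-assigned timestamps, here assumed to be one second apart.
        long[] wallClock = new long[cells];
        for (int i = 0; i < cells; i++) {
            wallClock[i] = 1337900000000L + i * 1000L;
        }

        System.out.println("raw:        " + (long) cells * Long.BYTES + " bytes");
        System.out.println("constant:   " + encodedSize(constant) + " bytes");
        System.out.println("wall clock: " + encodedSize(wallClock) + " bytes");
    }
}
```

With a constant timestamp every delta is zero, so the per-cell cost in this toy scheme drops from 8 bytes to 1, which matches the point that some bytes are still written but far fewer.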
Thanks all for your valuable time and inputs.
On Thu, May 24, 2012 at 11:22 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
> Hi Anil,
> I created HBASE-6093
> <https://issues.apache.org/jira/browse/HBASE-6093> with an idea that
> could solve this problem. It could be a simple
> implementation for simple workloads, but gets harder to support for tables
> with TTL's, maxVersion > 1, Deletes, etc... Maybe it can only be enabled
> if the other ColumnFamily settings are compatible.
> On Thu, May 24, 2012 at 9:37 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> > What Anoop said is in 0.94.0
> > For trunk, HBASE-4676 provides trie data block encoding.
> > It suits write-once read-many use case very well.
> > Cheers
> > On Thu, May 24, 2012 at 5:57 PM, Anoop Sam John <[EMAIL PROTECTED]>
> > wrote:
> > > Hi Anil,
> > > There is no way you can avoid the timestamp with KVs. In your
> > > case, can you think of using data block encoding? You can see
> > > FastDiffDeltaEncoder and DiffKeyDeltaEncoder. These include a way of
> > > avoiding writing the 8 bytes into each KV for the timestamp. Some
> > > bytes will still be written, though, and this will be done at the
> > > block level. Also please note that these encoders do much more than
> > > the timestamp space optimization. You also need to make sure to pass
> > > some timestamp in your Puts; it may be better to make it 0L.
> > > Otherwise, on the RS side, HBase will assign the current time as the
> > > timestamp. Hopefully it will be clearer when you read the javadoc for
> > > these encoder classes.
> > >
> > > What you are describing, a feature to fully avoid the timestamp,
> > > is a topic to discuss.
> > >
> > > Hope I make it clear to you
> > >
> > > -Anoop-
> > > ________________________________________
> > > From: anil gupta [[EMAIL PROTECTED]]
> > > Sent: Friday, May 25, 2012 3:21 AM
> > > To: [EMAIL PROTECTED]
> > > Subject: Disable timestamp in HBase Table a.k.a Disable Versioning in
> > > HBase Table
> > >
> > > Hi All,
> > >
> > > We are planning to store data in HBase. Currently, in one of our use
> > > cases, once a row is written into an HBase table we won't be modifying
> > > the data of that row. Since a timestamp (long) is stored for every
> > > cell (right?) in HBase, this takes up an extra 8 bytes per cell. I was
> > > wondering whether there is a way to disable the timestamp on an HBase
> > > table when versioning is not required. I went through the
> > > documentation and searched the mailing list for the same but could not
> > > find anything relevant. Since we are talking about billions of cells,
> > > this would add up to a significant amount of space (around 7.45
> > > gigabytes per billion cells). Does this sound like a feature HBase is
> > > missing?
> > >
> > > Please share your thoughts.
> > >
> > > --
> > > Thanks & Regards,
> > > Anil Gupta
> > >
Thanks & Regards,
Matt Corgan 2012-05-29, 23:46
Anoop Sam John 2012-05-30, 04:26
anil gupta 2012-05-30, 19:57