HBase >> mail # user >> Disable timestamp in HBase Table a.k.a Disable Versioning in HBase Table


Re: Disable timestamp in HBase Table a.k.a Disable Versioning in HBase Table
Hi All,

Sorry for the late reply; I got stuck on another task at work on Friday, and
skimming through HBASE-4676 took me a while.

HBASE-6093 seems to be very close to my suggestion. The only difference is
that Matt mentioned in the description that it can only be used when all
inserts are type=Put. Is that restriction due to HFileV2? I think
deleting an entire row wouldn't be a problem, right? I have very little
knowledge of HFileV2, so I will try to read up on it soon.

HBASE-4676 seems really cool. IMHO, the current issue is that writes and
scans are slow (scans are ~2x slower than with NONE, assuming that the trie
compresses by ~2-3x), and as per the JIRA, if the value/key size ratio is
large then the trie won't have any impact. Is this feature going to be part
of any future release of HBase? Awesome stuff, Matt.

@Anoop: You meant that I should use the feature in HBASE-4676 and pass the
timestamp as 0L in each Put, right?
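
To illustrate why passing the same timestamp (for example 0L) in every Put could help a delta-style block encoder, here is a toy sketch in plain Java. It is only a model under stated assumptions: the class `TimestampDeltaSketch`, its methods, and the zig-zag variable-length format are hypothetical and are not the actual on-disk format of FastDiffDeltaEncoder or DiffKeyDeltaEncoder. It stores the first timestamp of a block in full and every later one as a variable-length delta against its predecessor.

```java
// Toy model only: NOT HBase's real encoder format. All names are hypothetical.
final class TimestampDeltaSketch {

    // Bytes needed to store a signed delta as a zig-zag, 7-bits-per-byte varint.
    static int varLongSize(long delta) {
        long zigzag = (delta << 1) ^ (delta >> 63); // map small negatives to small positives
        int size = 1;
        while ((zigzag & ~0x7FL) != 0) {            // more than 7 significant bits remain
            zigzag >>>= 7;
            size++;
        }
        return size;
    }

    // Unencoded cost: one 8-byte long per cell.
    static long rawSize(long[] timestamps) {
        return (long) timestamps.length * Long.BYTES;
    }

    // Toy encoded cost: first timestamp in full, then one varint delta per cell.
    static long deltaSize(long[] timestamps) {
        if (timestamps.length == 0) {
            return 0;
        }
        long total = Long.BYTES;
        for (int i = 1; i < timestamps.length; i++) {
            total += varLongSize(timestamps[i] - timestamps[i - 1]);
        }
        return total;
    }

    public static void main(String[] args) {
        long[] constant = new long[1000];           // every Put passes 0L
        long[] assigned = new long[1000];           // made-up server-assigned times
        for (int i = 0; i < assigned.length; i++) {
            assigned[i] = 1337900000000L + i * 250L;
        }
        System.out.println("raw:      " + rawSize(constant) + " bytes");
        System.out.println("constant: " + deltaSize(constant) + " bytes");
        System.out.println("assigned: " + deltaSize(assigned) + " bytes");
    }
}
```

In this toy format a block of 1000 cells with a constant timestamp costs 8 + 999 = 1007 bytes instead of 8000, which matches the point that some bytes are still written per cell even when the timestamp never changes.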

Thanks all for your valuable time and inputs.
-Anil
On Thu, May 24, 2012 at 11:22 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:

> Hi Anil,
>
> I created HBASE-6093
> <https://issues.apache.org/jira/browse/HBASE-6093> with an idea that
> could solve this problem.  It could be a simple
> implementation for simple workloads, but gets harder to support for tables
> with TTLs, maxVersion > 1, Deletes, etc.  Maybe it can only be enabled
> if the other ColumnFamily settings are compatible.
>
> Matt
>
>
> On Thu, May 24, 2012 at 9:37 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
> > What Anoop said is in 0.94.0
> >
> > For trunk, HBASE-4676 provides trie data block encoding.
> > It suits the write-once, read-many use case very well.
> >
> > Cheers
> >
> > On Thu, May 24, 2012 at 5:57 PM, Anoop Sam John <[EMAIL PROTECTED]>
> > wrote:
> >
> > > Hi Anil,
> > > There is no way to avoid the timestamp with KVs. In your case, have
> > > you considered data block encoding? See FastDiffDeltaEncoder and
> > > DiffKeyDeltaEncoder. These include a way of avoiding writing the
> > > 8 bytes for the timestamp into each KV. Some bytes will still be
> > > written, though, and this is done at the block level. Also please note
> > > that these encoders do much more than the timestamp space
> > > optimization. You also need to make sure to pass some timestamp in your
> > > Puts; it may be better to use 0L. Otherwise, on the RS side HBase will
> > > assign the current time as the timestamp. I hope it will be clearer
> > > once you read the javadoc for these encoder classes.
> > >
> > > The feature you are describing, a way to fully avoid the timestamp,
> > > is a topic to discuss.
> > >
> > > Hope I made it clear to you.
> > >
> > > -Anoop-
> > > ________________________________________
> > > From: anil gupta [[EMAIL PROTECTED]]
> > > Sent: Friday, May 25, 2012 3:21 AM
> > > To: [EMAIL PROTECTED]
> > > Subject: Disable timestamp in HBase Table a.k.a Disable Versioning in
> > > HBase Table
> > >
> > > Hi All,
> > >
> > > We are planning to store data in HBase. Currently, in one of our use
> > > cases, once a row is written into an HBase table we won't be modifying
> > > the data of that row. Since a timestamp (long value) is stored for
> > > every cell (right?) in HBase, this takes up an extra 8 bytes per cell.
> > > Is there a way to disable the timestamp on an HBase table when
> > > versioning is not required? I went through the documentation and
> > > searched the mailing list but could not find anything relevant. Since
> > > we are talking about billions of cells, this would add up to a
> > > significant amount of space (around 7.45 GiB for 1 billion cells).
> > > Does this sound like a feature HBase is missing?
> > >
> > > Please share your thoughts.
> > >
> > > --
> > > Thanks & Regards,
> > > Anil Gupta
> > >
> >
>

--
Thanks & Regards,
Anil Gupta
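
The space estimate in the original question (8 bytes of timestamp per cell, 1 billion cells) can be checked with a few lines of Java. This is a back-of-the-envelope sketch only: the class `TimestampOverhead` and its methods are hypothetical names, and the figure covers the raw timestamp bytes alone, before any replication, compression, or block encoding. It works out to about 7.45 GiB (binary gigabytes).

```java
// Back-of-the-envelope estimate of raw timestamp overhead. Names are hypothetical.
final class TimestampOverhead {

    // Each cell carries one long timestamp: 8 bytes.
    static final int TIMESTAMP_BYTES = Long.BYTES;

    // Total raw bytes spent on timestamps; throws ArithmeticException on overflow.
    static long totalBytes(long cells) {
        return Math.multiplyExact(cells, (long) TIMESTAMP_BYTES);
    }

    // Convert bytes to GiB (1 GiB = 2^30 bytes).
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long cells = 1_000_000_000L;
        long bytes = totalBytes(cells);
        System.out.println(cells + " cells -> " + bytes + " bytes, "
                + toGiB(bytes) + " GiB");
    }
}
```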