HBase, mail # user - Re: HBase - Secondary Index


Re: HBase - Secondary Index
Adrien Mogenet 2013-01-06, 20:40
Are you talking about data block encoding of K/V?
https://issues.apache.org/jira/browse/HBASE-4218
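
For context, a minimal sketch of the general idea behind prefix encoding of sorted keys: each key stores only the number of leading bytes it shares with the previous key, plus the remaining suffix. This only illustrates the concept; it is not HBase's actual on-disk format, and the function names below are hypothetical.

```python
def shared_prefix_len(a: bytes, b: bytes) -> int:
    """Return the number of leading bytes that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_encode(sorted_keys):
    """Encode each key as (shared_len, suffix) relative to the previous key."""
    encoded = []
    prev = b""
    for key in sorted_keys:
        shared = shared_prefix_len(prev, key)
        encoded.append((shared, key[shared:]))
        prev = key
    return encoded


def prefix_decode(encoded):
    """Rebuild the full keys from (shared_len, suffix) pairs."""
    keys = []
    prev = b""
    for shared, suffix in encoded:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys


# Hypothetical row/column keys, already sorted as they would be in a block.
keys = [b"row0001/cf:a", b"row0001/cf:b", b"row0002/cf:a"]
encoded = prefix_encode(keys)
decoded = prefix_decode(encoded)
```

Because neighbouring keys in a sorted block tend to share long prefixes, most entries shrink to a small integer and a short suffix.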
On Sun, Jan 6, 2013 at 9:36 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:

> Does anyone have any links or information on the new prefix encoding feature
> in HBase that's being referred to in this mail?
>
> On Sun, Jan 6, 2013 at 12:30 PM, Adrien Mogenet <[EMAIL PROTECTED]
> >wrote:
>
> > Nice topic, perhaps one of the most important for 2013 :-)
> > I still don't get how you're ensuring consistency between index table and
> > main table, without an external component (such as bookkeeper/zookeeper).
> > What's the exact write path in your situation when inserting data ?
> > (WAL/RegionObserver, pre/post put/WALedit...)
> >
> > The underlying question is how you're ensuring that the WALEdits in the
> > index and main tables are perfectly in sync, and how you're able to roll
> > back in case of an issue in both WALs?
> >
> >
> > On Fri, Dec 28, 2012 at 11:55 AM, Shengjie Min <[EMAIL PROTECTED]>
> > wrote:
> >
> > > > > Yes, as you say, when the number of rows to be returned grows, the
> > > > > latency grows too. Seeks within an HFile block are a somewhat
> > > > > expensive op now. (Not much, but still.) The new prefix trie encoding
> > > > > will be a huge bonus here. There the seeks will be flying.. [Ted also
> > > > > presented this at Hadoop China] Thanks to Matt... :) I am trying to
> > > > > measure the scan performance with this new encoding. Trying to
> > > > > backport a simple patch for the 94 version just for testing... Yes,
> > > > > when the number of results to be returned grows, any index will
> > > > > perform worse, as per my study :)
> > >
> > > yes, you are right, I guess it's just a drawback of any index approach.
> > > Thanks for the explanation.
> > >
> > > Shengjie
> > >
> > > On 28 December 2012 04:14, Anoop Sam John <[EMAIL PROTECTED]> wrote:
> > >
> > > > > Do you have link to that presentation?
> > > >
> > > > http://hbtc2012.hadooper.cn/subject/track4TedYu4.pdf
> > > >
> > > > -Anoop-
> > > >
> > > > ________________________________________
> > > > From: Mohit Anchlia [[EMAIL PROTECTED]]
> > > > Sent: Friday, December 28, 2012 9:12 AM
> > > > To: [EMAIL PROTECTED]
> > > > Subject: Re: HBase - Secondary Index
> > > >
> > > > On Thu, Dec 27, 2012 at 7:33 PM, Anoop Sam John <[EMAIL PROTECTED]>
> > > > wrote:
> > > >
> > > > > > Yes, as you say, when the number of rows to be returned grows, the
> > > > > > latency grows too. Seeks within an HFile block are a somewhat
> > > > > > expensive op now. (Not much, but still.) The new prefix trie
> > > > > > encoding will be a huge bonus here. There the seeks will be
> > > > > > flying.. [Ted also presented this at Hadoop China] Thanks to
> > > > > > Matt... :) I am trying to measure the scan performance with this
> > > > > > new encoding. Trying to backport a simple patch for the 94 version
> > > > > > just for testing... Yes, when the number of results to be returned
> > > > > > grows, any index will perform worse, as per my study :)
> > > > >
> > > > > Do you have link to that presentation?
> > > >
> > > >
> > > > > > > btw, quick question: in your presentation, is the scale there
> > > > > > > seconds or milliseconds? :)
> > > > > >
> > > > > > It is seconds. Don't consider the exact values. What matters is the
> > > > > > % increase in latency :) Those were not high-end machines.
> > > > >
> > > > > -Anoop-
> > > > > ________________________________________
> > > > > From: Shengjie Min [[EMAIL PROTECTED]]
> > > > > Sent: Thursday, December 27, 2012 9:59 PM
> > > > > To: [EMAIL PROTECTED]
> > > > > Subject: Re: HBase - Secondary Index
> > > > >
> > > > > > > Didn't follow you completely here. There won't be any get()
> > > > > > > happening.. As the exact rowkey in a region we get from the index
> > > > > > > table, we can seek to

Adrien Mogenet
06.59.16.64.22
http://www.mogenet.me