HBase >> mail # user >> Re: HBase - Secondary Index


Re: HBase - Secondary Index
+1 on Lars's comment.

Either the client gets the rowkey from the secondary table and then fetches
the real data from the primary table, ** OR ** it sends the request to every
RS (or region) hosting a region of the primary table.

Anoop is using the latter mechanism. Both mechanisms have their pros and
cons. IMO, there is no outright winner.
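
To make the two mechanisms concrete, here is a minimal in-memory sketch. It is not HBase code: `Region`, `Cluster`, the hash-based routing, and the request counts are all assumptions made for illustration, and the sample rows are made up.

```python
from collections import defaultdict


class Region:
    """Hypothetical stand-in for one region of the primary table."""

    def __init__(self):
        self.rows = {}                       # rowkey -> {column: value}
        self.local_index = defaultdict(set)  # (column, value) -> rowkeys in this region

    def put(self, rowkey, row):
        self.rows[rowkey] = row
        for col, val in row.items():
            self.local_index[(col, val)].add(rowkey)

    def query_local(self, col, val):
        # Region-level index: only answers for rows stored in this region.
        return {rk: self.rows[rk] for rk in self.local_index.get((col, val), ())}


class Cluster:
    """Toy model holding both index styles so they can be compared."""

    def __init__(self, num_regions):
        self.regions = [Region() for _ in range(num_regions)]
        self.global_index = defaultdict(set)  # models a separate secondary table

    def region_for(self, rowkey):
        # Toy routing (assumption); real regions own sorted rowkey ranges.
        return self.regions[sum(rowkey.encode()) % len(self.regions)]

    def put(self, rowkey, row):
        self.region_for(rowkey).put(rowkey, row)
        for col, val in row.items():
            self.global_index[(col, val)].add(rowkey)

    def query_via_global_index(self, col, val):
        # Mechanism 1: read the secondary table, then get each row by key.
        rowkeys = self.global_index.get((col, val), set())
        rows = {rk: self.region_for(rk).rows[rk] for rk in rowkeys}
        return rows, 1 + len(rowkeys)  # 1 index read + 1 get per match (unbatched)

    def query_via_region_fanout(self, col, val):
        # Mechanism 2: ask every region to consult its own local index.
        rows = {}
        for region in self.regions:
            rows.update(region.query_local(col, val))
        return rows, len(self.regions)  # one request per region


cluster = Cluster(num_regions=4)
cluster.put("acme-001", {"model": "S80", "damage": "front"})
cluster.put("acme-002", {"model": "V70", "damage": "rear"})
cluster.put("zeta-107", {"model": "S80", "damage": "rear"})

rows_a, requests_a = cluster.query_via_global_index("model", "S80")
rows_b, requests_b = cluster.query_via_region_fanout("model", "S80")
```

Both calls return the same rows. In this toy model the first costs one request per match plus the index read, and the second costs one request per region regardless of how many rows match, which is roughly the tradeoff being discussed.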

~Anil Gupta

On Tue, Jan 8, 2013 at 4:30 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> Different use cases.
>
>
> For global point queries you want exactly what you said below.
> For range scans across many rows you want Anoop's design. As usual, it
> depends.
>
>
> The tradeoff is bringing a lot of unnecessary data to the client vs having
> to contact each region (or at least each region server).
>
>
> -- Lars
>
>
>
> ________________________________
>  From: Michael Segel <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Tuesday, January 8, 2013 6:33 AM
> Subject: Re: HBase - Secondary Index
>
> So if you're using an inverted table / index why on earth are you doing it
> at the region level?
>
> I tried to explain this to others over 6 months ago, and it's not really
> a good idea.
>
> You're overcomplicating this and you will end up creating performance
> bottlenecks when your secondary index is completely orthogonal to your row
> key.
>
> To give you an example...
>
> Suppose you're CCCIS and you have a large database of auto insurance
> claims that you've acquired over the years from your Pathways product.
>
> Your primary key would be a combination of the Insurance Company's ID and
> their internal claim ID for the individual claim.
> Your row would be all of the data associated with that claim.
>
> So now let's say you want to find the average cost to repair a front end
> collision of an S80 Volvo.
> The make and model of the car would be orthogonal to the initial key. This
> means that the result set containing insurance records for Front End
> collisions of S80 Volvos would be most likely evenly distributed across the
> cluster's regions.
>
> If you used a series of inverted tables, you would be able to use a series
> of get()s to get the result set from each index and then find their
> intersections. (Note that you could also put them in sort order so that the
> intersections would be fairly straightforward to find.)
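
The intersection step described above can be sketched as follows. This is only an illustration: the two inverted tables are modeled as plain dicts, and the row keys, index names, and repair costs are invented for the example.

```python
def intersect_sorted(a, b):
    """Merge-style intersection of two ascending lists of row keys."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


# Hypothetical inverted tables: indexed value -> sorted row keys of matching claims.
# Each lookup stands in for one get() against an index table.
by_model = {"Volvo S80": ["acme|0007", "acme|0042", "zeta|0003", "zeta|0019"]}
by_damage = {"front end": ["acme|0042", "beta|0011", "zeta|0019"]}

# Made-up repair costs, standing in for gets against the primary table.
repair_cost = {"acme|0042": 4200.0, "zeta|0019": 3800.0}

matches = intersect_sorted(by_model["Volvo S80"], by_damage["front end"])
average = sum(repair_cost[key] for key in matches) / len(matches)
```

Because both lists are already sorted, the intersection is a single linear pass, and only the matching row keys are then fetched from the primary table.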
>
> Doing this at the region level isn't so simple.
>
> So I have to ask again: why go through and overcomplicate things?
>
> Just saying...
>
> On Jan 7, 2013, at 7:49 AM, Anoop Sam John <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> > It is an inverted index based on column value(s).
> > The indexing will be region-wise. It can work whether or not someone
> > knows the rowkey range.
> >
> > -Anoop-
> > ________________________________________
> > From: Mohit Anchlia [[EMAIL PROTECTED]]
> > Sent: Monday, January 07, 2013 9:47 AM
> > To: [EMAIL PROTECTED]
> > Subject: Re: HBase - Secondary Index
> >
> > Hi Anoop,
> >
> > Am I correct in understanding that this indexing mechanism is only
> > applicable when you know the row key? It's not an inverted index truly
> > based on the column value.
> >
> > Mohit
> > On Sun, Jan 6, 2013 at 7:48 PM, Anoop Sam John <[EMAIL PROTECTED]> wrote:
> >
> >> Hi Adrien
> >> We maintain consistency between the main table and the index table,
> >> including the rollback mentioned below, using the CP (coprocessor)
> >> hooks. The current hooks were not enough for that, though. I am in the
> >> process of trying to contribute those new hooks, core changes, etc. now.
> >> Once all of that is done I will be able to explain in detail.
> >>
> >> -Anoop-
> >> ________________________________________
> >> From: Adrien Mogenet [[EMAIL PROTECTED]]
> >> Sent: Monday, January 07, 2013 2:00 AM
> >> To: [EMAIL PROTECTED]
> >> Subject: Re: HBase - Secondary Index
> >>
> >> Nice topic, perhaps one of the most important for 2013 :-)
> >> I still don't get how you're ensuring consistency between index table
> >> and main table, without an external component (such as
Thanks & Regards,
Anil Gupta