HBase >> mail # user >> Re: HBase - Secondary Index

RE: HBase - Secondary Index
Hi Mike
>My question is that since you don't have any formal SQL syntax, how are you doing this all server side?
I think that question is for Anil. In his case the index data scan does not happen at the server side: he scans the index table, brings the results back to the client, and then issues gets from the client to fetch the main table data. Correct, Anil?
Just making it clear... :)
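
To make the client-side flow described above concrete, here is a minimal sketch in plain Python. It does not use the HBase client API; `index_rows`, `main_table`, and the function names are hypothetical in-memory stand-ins, and the index key layout (indexed value followed by main-table row key) is an assumption, not something stated in the thread.

```python
import bisect

# Hypothetical data: each index row is (indexed value, main-table row key),
# kept sorted the way a table sorted by row key would be.
index_rows = sorted([
    ("cust42", "r003"),
    ("cust42", "r017"),
    ("cust77", "r009"),
])
main_table = {
    "r003": {"amount": 10},
    "r009": {"amount": 25},
    "r017": {"amount": 40},
}

def scan_index(value):
    """Step 1: range-scan the index for one value and collect main row keys."""
    start = bisect.bisect_left(index_rows, (value, ""))
    keys = []
    for indexed_value, row_key in index_rows[start:]:
        if indexed_value != value:
            break  # left the range for this value
        keys.append(row_key)
    return keys

def fetch_rows(row_keys):
    """Step 2: one point lookup (a 'get') per row key against the main table."""
    return [main_table[k] for k in row_keys if k in main_table]

def query_by_value(value):
    """Client-side flow: index scan first, then gets on the main table."""
    return fetch_rows(scan_index(value))
```

Both steps run on the client here, so every matching index entry crosses the network before the main-table reads start. That round trip is what a server-side approach would avoid.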

-Anoop-
________________________________________
From: Michel Segel [[EMAIL PROTECTED]]
Sent: Tuesday, December 18, 2012 2:32 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: HBase - Secondary Index

Just a couple of questions...

First, since you don't have any natural secondary indices, you have a couple of choices for creating one. Keeping it simple, you choose an inverted table as your index.

In doing so, you have one column containing all of the row ids for a given value.
This means that the lookup is a simple get().
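
As an illustration of the inverted-table idea, the sketch below builds such an index in plain Python. The table contents, the `city` column, and the helper names are hypothetical; a dictionary read stands in for the single get().

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the main table: row id -> columns.
main_table = {
    "row1": {"city": "Pune"},
    "row2": {"city": "Delhi"},
    "row3": {"city": "Pune"},
}

def build_inverted_index(table, column):
    """Map each value of `column` to the sorted list of row ids that hold it."""
    index = defaultdict(list)
    for row_id, columns in table.items():
        if column in columns:
            index[columns[column]].append(row_id)
    return {value: sorted(ids) for value, ids in index.items()}

def lookup(index, value):
    """A single keyed read on the index, analogous to one get()."""
    return index.get(value, [])

city_index = build_inverted_index(main_table, "city")
```

The trade-off in this layout is that every write to the main table also has to update the list of row ids stored under the indexed value.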

My question is that since you don't have any formal SQL syntax, how are you doing this all server side?
Sent from a remote device. Please excuse any typos...

Mike Segel

On Dec 18, 2012, at 2:28 AM, anil gupta <[EMAIL PROTECTED]> wrote:

> Hi Anoop,
>
> Please find my reply inline.
>
> Thanks,
> Anil Gupta
>
> On Sun, Dec 16, 2012 at 8:02 PM, Anoop Sam John <[EMAIL PROTECTED]> wrote:
>
>> Hi Anil
>> During the scan, there is no need to fetch any index data to the client
>> side, so there is no need to create a scanner on the index table at the
>> client side. This happens at the server side.
>>
>> For a scan on the main table with conditions on timestamp and customer
>> id, a scanner is created with Filters, just as it would be without a
>> secondary index. So this scan from the client will go through all the
>> regions in the main table.
>
>
> Anil: Do you mean that if the table is spread across 50 region servers in
> a 60-node cluster, we need to send a scan request to all 50 RS? Doesn't
> that sound expensive? IMHO you were not doing this in your solution.
> Your solution looked cleaner than this, since you knew exactly which
> node to query when using the secondary index, thanks to the co-location
> of the primary table region and the secondary index table region (made
> possible by the static begin part of the secondary table rowkey). My
> problem is a little more complicated because of one constraint: I cannot
> have a "static begin part" in the rowkey of my secondary table.
>
>> When it scans one particular region, say [x, y), on the main table, using
>> the CP (coprocessor) we can get the index table region object
>> corresponding to this main table region from the RS. There is no issue in
>> creating the static part of the rowkey: you know 'x' is the region start
>> key. Then at the server side we create a scanner on the index region
>> directly, and here we can specify the startkey:
>> 'x' + <timestamp value> + <customer id>. Using the results from the index
>> scan, we reseek on the main region to the exact rows that hold the data we
>> are interested in. So no full region data scan happens.
>
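
The server-side flow described above can be sketched as a toy model in plain Python. This is not coprocessor code and uses no HBase API: the `Region` class, the `|` separator, and the index key layout (region start key, timestamp, customer id, then main row key) are assumptions for illustration, and the timestamps are assumed to be fixed-width so that string order matches time order.

```python
import bisect

SEP = "|"  # hypothetical separator between key parts

class Region:
    """Toy stand-in for one main-table region plus its co-located index region."""

    def __init__(self, start_key, rows):
        self.start_key = start_key
        self.rows = dict(rows)  # main row key -> (timestamp, customer_id)
        # Assumed index key: region start key + timestamp + customer id + row key.
        self.index = sorted(
            (SEP.join([start_key, ts, cust, row_key]), row_key)
            for row_key, (ts, cust) in self.rows.items()
        )

    def query(self, timestamp, customer_id):
        """Scan the index from the computed start key, then read only matching rows."""
        prefix = SEP.join([self.start_key, timestamp, customer_id]) + SEP
        pos = bisect.bisect_left(self.index, (prefix, ""))
        hits = []
        while pos < len(self.index) and self.index[pos][0].startswith(prefix):
            row_key = self.index[pos][1]
            # Direct lookup on the main region, standing in for the reseek.
            hits.append((row_key, self.rows[row_key]))
            pos += 1
        return hits

region = Region("x", {
    "x01": ("20121214", "c1"),
    "x02": ("20121214", "c2"),
    "x03": ("20121215", "c1"),
})
```

Because the index key begins with the region start key, the start key for the index scan can be computed inside the region without any client input beyond the filter values.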
>> In cases where only the timestamp is given and no customer id, it
>> is simple again. Create a scanner on the main table with only one
>> filter. On the CP side, the scanner on the index region is created
>> with startkey 'x' + <timestamp value>. When you create the Scan
>> object and set startRow on it, the value need not be the full rowkey.
>> It can be a part of the rowkey, like a prefix.
>>
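
The point about a partial start row can be illustrated with a small sketch over sorted keys in plain Python. The keys and the `prefix_scan` helper are hypothetical. In HBase itself a start row only positions the scanner, so the prefix check below stands in for whatever stop row or filter ends the scan.

```python
import bisect

# Hypothetical index row keys: region start key 'x', timestamp, customer id.
index_keys = sorted([
    "x|20121213|c9",
    "x|20121214|c1",
    "x|20121214|c2",
    "x|20121215|c1",
])

def prefix_scan(sorted_keys, start_row):
    """Seek to the first key >= start_row, then yield keys that share the prefix."""
    pos = bisect.bisect_left(sorted_keys, start_row)
    while pos < len(sorted_keys) and sorted_keys[pos].startswith(start_row):
        yield sorted_keys[pos]
        pos += 1
```

Passing only `"x|20121214"` returns every customer for that timestamp, while the longer `"x|20121214|c2"` narrows the scan to a single entry.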
>> Hope you got it now :)
> Anil: I hope we are on the same page now. Thanks a lot for taking the
> time to discuss this.
>
>>
>> -Anoop-
>> ________________________________________
>> From: anil gupta [[EMAIL PROTECTED]]
>> Sent: Friday, December 14, 2012 11:31 PM
>> To: [EMAIL PROTECTED]
>> Subject: Re: HBase - Secondary Index
>>
>> On Fri, Dec 14, 2012 at 12:54 AM, Anoop Sam John <[EMAIL PROTECTED]>
>> wrote:
>>
>>> Hi Anil,
>>>
>>>> 1. In your presentation you mentioned that region of Primary Table and