HBase >> mail # user >> observer coprocessor question regarding puts


Earlier messages in this thread:
rob mancuso 2013-06-08, 17:54
Anoop John 2013-06-09, 04:47
rob mancuso 2013-06-09, 17:16
Michael Segel 2013-06-10, 19:42
rob mancuso 2013-06-14, 02:34

Re: observer coprocessor question regarding puts
Ok...

But then you are duplicating the data, so you will have to reconcile the two sets and there is a possibility that the data sets are out of sync.

I don't know your entire Schema, but if the row key is larger than the value, you may want to think about changing the Schema.
Sent from a remote device. Please excuse any typos...

Mike Segel
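
To make the reconciliation concern concrete, here is a minimal sketch of a one-way consistency check between a primary table and an inverted-key index table. It is plain Java and does not use the HBase API: each table is modeled as an in-memory map, and the names `SyncCheck`, `invert` and `findOutOfSync` are hypothetical. It only checks that every primary row has a matching index row; index rows with no primary row would need a second pass in the other direction.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Plain-Java model only; a table is modeled as rowKey -> (qualifier -> value).
class SyncCheck {

    // Reverse the key components: [A, B, C] -> [C, B, A].
    static List<String> invert(List<String> rowKey) {
        List<String> inverted = new ArrayList<>(rowKey);
        Collections.reverse(inverted);
        return inverted;
    }

    // Return the primary row keys whose inverted row is missing from the
    // index, or whose qualifier/value pairs differ from the index copy.
    static List<List<String>> findOutOfSync(
            Map<List<String>, Map<String, String>> primary,
            Map<List<String>, Map<String, String>> index) {
        List<List<String>> outOfSync = new ArrayList<>();
        for (Map.Entry<List<String>, Map<String, String>> row : primary.entrySet()) {
            Map<String, String> indexed = index.get(invert(row.getKey()));
            // Map.equals(null) is false, so a missing index row is reported too.
            if (!row.getValue().equals(indexed)) {
                outOfSync.add(row.getKey());
            }
        }
        return outOfSync;
    }
}
```

In a real cluster the same comparison would have to be run as a scan over both tables, which is part of the cost of duplicating the data.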

On Jun 13, 2013, at 9:34 PM, rob mancuso <[EMAIL PROTECTED]> wrote:

> Thx Mike, for the most part.
>
> My key is substantially larger than my value, so I was thinking of leaving
> the cq->value stuff as is and just inverting the rowkey.
>
> So the original table would have
>
> [A, B, C] cf1:cq1 val1
>
> And the secondary table would have
>
> [C, B, A] cf1:cq1 val1
> On Jun 10, 2013 3:42 PM, "Michael Segel" <[EMAIL PROTECTED]> wrote:
>
>>
>> If I understand you ...
>>
>> You have the row key = [A,B,C]
>> You want to create an inverted mapping of  Key [C] => {[A,B,C]}
>>
>> That is to say that your inverted index would be all of the rows where the
>> value of C = x, where x is some value.
>>
>> You shouldn't have to worry about column qualifiers, just the values of A, B
>> and C.
>>
>> In this case, the columns in your index will also be the values of the
>> tuples.
>> You really don't need C because you already have it, but then you'd need
>> to remember to add it to the pair (A, B) that you are storing.
>> I'd say waste the space and store (A,B,C) but that's just me.
>>
>>
>> Is that what you want to do?
>>
>> -Mike
>>
>> On Jun 9, 2013, at 12:16 PM, rob mancuso <[EMAIL PROTECTED]> wrote:
>>
>>> Thx Anoop, I believe this is what I'm looking for.
>>>
>>> Regarding my use case, my rowkey is [A,B,C], but I also have a requirement
>>> to access data by [C] only. So I'm looking to use a post-put coprocessor
>>> to maintain one secondary index table where the rowkey starts with [C]. My
>>> cqs are numerics representing time and can be any number between 1 and 3600
>>> (i.e. seconds within an hour). Because I won't know the cq value for each
>>> incoming put (just the cf), I need something to deconstruct the put into a
>>> list of cqs, which I believe you've provided with getFamilyMap.
>>>
>>> Thx again!
>>> On Jun 9, 2013 12:47 AM, "Anoop John" <[EMAIL PROTECTED]> wrote:
>>>
>>>> You want to have an index for every CF+CQ, right? You want to maintain
>>>> different tables for different columns?
>>>>
>>>> Put has a getFamilyMap method that returns a Map of CF to List of KVs. From
>>>> this List of KVs you can get all the CQ names, values, etc.
>>>>
>>>> -Anoop-
>>>>
>>>> On Sat, Jun 8, 2013 at 11:24 PM, rob mancuso <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm looking to write a post-put observer coprocessor to maintain a
>>>>> secondary index. Basically, my current rowkey design is a composite of
>>>>> A,B,C and I want to be able to also access data by C. So all I'm looking
>>>>> to do is invert the rowkey and apply it for all cf:cq values that come in.
>>>>>
>>>>> My problem (I think) is that in all the good examples I've seen, they all
>>>>> deconstruct the Put by calling put.get(<cf>,<cq>), implying they know the
>>>>> qualifier ahead of time. I'm looking to specify the family and generate a
>>>>> put to the secondary index table for all qualifiers, not knowing or
>>>>> caring what the qualifier is.
>>>>>
>>>>> Any pointers would be appreciated,
>>>>> Thx - Rob
>>>>>
>>>>> Is there a way
>>
>>
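
The approach discussed in this thread (walk the put's family map, then write every qualifier to an index table under the inverted rowkey) can be sketched roughly as follows. This is a plain-Java model, not HBase code: `Cell`, `IndexEntry`, `invertKey` and `buildIndexEntries` are hypothetical names, and the `Map<String, List<Cell>>` stands in for what `Put.getFamilyMap` returns. In an actual coprocessor the same loop would presumably run inside the post-put hook and issue puts against the index table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the idea; none of these types are HBase classes.
class IndexSketch {

    // Hypothetical stand-in for one cell of a put: qualifier plus value.
    static final class Cell {
        final String qualifier;
        final String value;
        Cell(String qualifier, String value) {
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    // Hypothetical stand-in for one write to the secondary index table.
    static final class IndexEntry {
        final List<String> rowKey;
        final String family;
        final String qualifier;
        final String value;
        IndexEntry(List<String> rowKey, String family, String qualifier, String value) {
            this.rowKey = rowKey;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }
        @Override
        public String toString() {
            return rowKey + " " + family + ":" + qualifier + " " + value;
        }
    }

    // Reverse the key components: [A, B, C] -> [C, B, A].
    static List<String> invertKey(List<String> rowKey) {
        List<String> inverted = new ArrayList<>(rowKey);
        Collections.reverse(inverted);
        return inverted;
    }

    // Build one index entry per cell in the given family, without needing
    // to know any qualifier ahead of time.
    static List<IndexEntry> buildIndexEntries(
            List<String> rowKey, Map<String, List<Cell>> familyMap, String family) {
        List<String> indexKey = invertKey(rowKey);
        List<IndexEntry> entries = new ArrayList<>();
        List<Cell> cells = familyMap.getOrDefault(family, Collections.<Cell>emptyList());
        for (Cell cell : cells) {
            entries.add(new IndexEntry(indexKey, family, cell.qualifier, cell.value));
        }
        return entries;
    }

    public static void main(String[] args) {
        Map<String, List<Cell>> familyMap = new LinkedHashMap<>();
        familyMap.put("cf1", Arrays.asList(new Cell("17", "val1"), new Cell("3600", "val2")));
        for (IndexEntry e : buildIndexEntries(Arrays.asList("A", "B", "C"), familyMap, "cf1")) {
            System.out.println(e);
        }
    }
}
```

The qualifier and value are copied unchanged and only the rowkey is inverted, which matches the [C, B, A] cf1:cq1 val1 layout described above.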
Later messages in this thread:
Michael Segel 2013-06-14, 14:45
rob mancuso 2013-06-18, 02:36