HBase >> mail # user >> observer coprocessor question regarding puts


rob mancuso 2013-06-08, 17:54
Anoop John 2013-06-09, 04:47
rob mancuso 2013-06-09, 17:16
Michael Segel 2013-06-10, 19:42
rob mancuso 2013-06-14, 02:34
Michel Segel 2013-06-14, 02:51

Re: observer coprocessor question regarding puts
Not to beat a dead horse...

I did want to touch a bit more on the schema design issues and considerations.

If you have a really wide composite key and you're only storing a single cell, you will end up with a very long (tall) table.

Does this make sense?

Would it make more sense to use a smaller key and then store multiple cells, with part of the rowkey as a column qualifier?

Using your example... you have [A,B,C] as your rowkey and then Column1 with a value.

You could make the row key [A, B] with the column qualifier [C] storing the value there.
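
To make the two layouts concrete, here is a minimal sketch in plain Java. It only models where each piece of the key ends up; it does not use the HBase client API, and the class name, method names and the "|" delimiter are assumptions made for illustration.

```java
import java.util.List;

// Plain-Java model of the two layouts discussed above; not the HBase client API.
// The "|" delimiter and all names here are hypothetical, chosen for illustration.
final class CellAddress {
    final String rowKey;
    final String qualifier;
    final String value;

    CellAddress(String rowKey, String qualifier, String value) {
        this.rowKey = rowKey;
        this.qualifier = qualifier;
        this.value = value;
    }

    // Tall layout: every key part goes into the row key, with one fixed column.
    static CellAddress tall(List<String> keyParts, String fixedQualifier, String value) {
        return new CellAddress(String.join("|", keyParts), fixedQualifier, value);
    }

    // Wide layout: the last key part moves out of the row key and becomes the qualifier.
    static CellAddress wide(List<String> keyParts, String value) {
        if (keyParts.size() < 2) {
            throw new IllegalArgumentException("need at least two key parts");
        }
        int last = keyParts.size() - 1;
        String shortKey = String.join("|", keyParts.subList(0, last));
        return new CellAddress(shortKey, keyParts.get(last), value);
    }
}
```

For [A, B, C], tall(...) addresses the cell at row "A|B|C" under the fixed column, while wide(...) addresses it at row "A|B" under qualifier "C", so every C value for the same [A, B] pair lands in one row.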

Does that make sense?

-Mike
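
For reference, the secondary-index idea from earlier in the thread (quoted below) can be sketched as follows. This is a plain-Java model under stated assumptions, not coprocessor code: the nested map stands in for what Put.getFamilyMap exposes (family to cells), and IndexCell, SecondaryIndexSketch, invertKey, indexCellsFor and the "|" delimiter are all hypothetical names. The point it shows is that walking the family map yields every qualifier without knowing it ahead of time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// One cell destined for the secondary index table (hypothetical model type).
final class IndexCell {
    final String rowKey;
    final String family;
    final String qualifier;
    final String value;

    IndexCell(String rowKey, String family, String qualifier, String value) {
        this.rowKey = rowKey;
        this.family = family;
        this.qualifier = qualifier;
        this.value = value;
    }
}

final class SecondaryIndexSketch {
    // [A, B, C] -> "C|B|A"; the "|" delimiter is an assumption.
    static String invertKey(List<String> keyParts) {
        List<String> reversed = new ArrayList<>(keyParts);
        Collections.reverse(reversed);
        return String.join("|", reversed);
    }

    // familyMap models family -> (qualifier -> value). Every cell found in the
    // incoming put is copied under the inverted row key, whatever its qualifier.
    static List<IndexCell> indexCellsFor(List<String> keyParts,
                                         Map<String, Map<String, String>> familyMap) {
        String indexRow = invertKey(keyParts);
        List<IndexCell> out = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> family : familyMap.entrySet()) {
            for (Map.Entry<String, String> cell : family.getValue().entrySet()) {
                out.add(new IndexCell(indexRow, family.getKey(), cell.getKey(), cell.getValue()));
            }
        }
        return out;
    }
}
```

Because the index cells duplicate the original values, the reconciliation concern raised below still applies: the two tables can drift apart if one write succeeds and the other does not.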

On Jun 13, 2013, at 9:51 PM, Michel Segel <[EMAIL PROTECTED]> wrote:

> Ok...
>
> But then you are duplicating the data, so you will have to reconcile the two sets, and there is a possibility that the data sets get out of sync.
>
> I don't know your entire Schema, but if the row key is larger than the value, you may want to think about changing the Schema.
>
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Jun 13, 2013, at 9:34 PM, rob mancuso <[EMAIL PROTECTED]> wrote:
>
>> Thx Mike, for the most part.
>>
>> My key is substantially larger than my value, so I was thinking of leaving
>> the cq->value stuff as is and just inverting the rowkey.
>>
>> So the original table would have
>>
>> [A, B, C] cf1:cq1 val1
>>
>> And the secondary table would have
>>
>> [C, B, A] cf1:cq1 val1
>>
>> On Jun 10, 2013 3:42 PM, "Michael Segel" <[EMAIL PROTECTED]> wrote:
>>
>>>
>>> If I understand you ...
>>>
>>> You have the row key = [A,B,C]
>>> You want to create an inverted mapping of Key [C] => {[A,B,C]}
>>>
>>> That is to say that your inverted index would be all of the rows where the
>>> value of C = x, and x is some value.
>>>
>>> You shouldn't have to worry about column qualifiers, just the values of A, B
>>> and C.
>>>
>>> In this case, the columns in your index will also be the values of the
>>> tuples.
>>> You really don't need C because you already have it, but then you'd need
>>> to remember to add it to the pair (A, B) that you are storing.
>>> I'd say waste the space and store (A,B,C) but that's just me.
>>>
>>>
>>> Is that what you want to do?
>>>
>>> -Mike
>>>
>>> On Jun 9, 2013, at 12:16 PM, rob mancuso <[EMAIL PROTECTED]> wrote:
>>>
>>>> Thx Anoop, I believe this is what I'm looking for.
>>>>
>>>> Regarding my use case, my rowkey is [A,B,C], but I also have a requirement
>>>> to access data by [C] only. So I'm looking to use a post-put coprocessor
>>>> to maintain one secondary index table where the rowkey starts with [C]. My
>>>> cqs are numerics representing time and can be any number between 1 and 3600
>>>> (i.e. seconds within an hour). Because I won't know the cq value for each
>>>> incoming put (just the cf), I need something to deconstruct the put into a
>>>> list of cqs, which I believe you've provided with getFamilyMap.
>>>>
>>>> Thx again!
>>>>
>>>> On Jun 9, 2013 12:47 AM, "Anoop John" <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> You want to have an index for every CF+CQ, right? You want to maintain
>>>>> different tables for different columns?
>>>>>
>>>>> Put has a getFamilyMap method that returns a Map of CF to List of KVs. From
>>>>> this List of KVs you can get all the CQ names, values, etc.
>>>>>
>>>>> -Anoop-
>>>>>
>>>>> On Sat, Jun 8, 2013 at 11:24 PM, rob mancuso <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm looking to write a post-put observer coprocessor to maintain a
>>>>>> secondary index. Basically, my current rowkey design is a composite of
>>>>>> A,B,C and I want to be able to also access data by C. So all I'm looking
>>>>>> to do is invert the rowkey and apply it for all cf:cq values that come in.
>>>>>>
>>>>>> My problem (I think) is that in all the good examples I've seen, they all
>>>>>> deconstruct the Put by calling put.get(<cf>,<cq>)...implying they know the
>>>>>> qualifier ahead of time. I'm looking to specify the family and

rob mancuso 2013-06-18, 02:36