Kristoffer Sjögren 2013-06-27, 17:59
Michael Segel 2013-06-27, 19:21
Kristoffer Sjögren 2013-06-27, 21:41
Michael Segel 2013-06-27, 21:51
Kristoffer Sjögren 2013-06-27, 22:13
Michael Segel 2013-06-27, 22:58
Kristoffer Sjögren 2013-06-27, 23:39
James Taylor 2013-06-28, 01:55
Kristoffer Sjögren 2013-06-28, 09:24
Otis Gospodnetic 2013-06-28, 18:34
Kristoffer Sjögren 2013-06-28, 18:53
Otis Gospodnetic 2013-06-28, 18:58
Asaf Mesika 2013-06-28, 21:30
Michel Segel 2013-06-28, 23:45
Kristoffer Sjögren 2013-06-29, 11:29
Re: Schema design for filters
Why is it that if all you have is a hammer, everything looks like a nail? ;-)
On Jun 27, 2013, at 8:55 PM, James Taylor <[EMAIL PROTECTED]> wrote:

> Hi Kristoffer,
> Have you had a look at Phoenix (https://github.com/forcedotcom/phoenix)? You could model your schema much like an O/R mapper and issue SQL queries through Phoenix for your filtering.
>
> James
> @JamesPlusPlus
> http://phoenix-hbase.blogspot.com
>
> On Jun 27, 2013, at 4:39 PM, "Kristoffer Sjögren" <[EMAIL PROTECTED]> wrote:
>
>> Thanks for your help Mike. Much appreciated.
>>
>> I don't store rows/columns in JSON format. The schema is exactly that of a
>> specific Java class, where the rowkey is a unique object identifier with
>> the class type encoded into it. Columns are the field names of the class
>> and the values are that of the object instance.
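To make the layout just described concrete, here is a minimal sketch that is not taken from the thread. The `Person` class, the `ObjectRowSketch` helper, the `<class name>:<object id>` rowkey layout, and the per-type byte encodings are all assumptions for illustration; the thread does not say how the class type is encoded into the rowkey.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical example class: its field names become column qualifiers.
class Person {
    String name = "Ada";
    int age = 36;
    String nickname; // left null, so no column is produced for it
}

// Hypothetical helper, not an HBase class.
final class ObjectRowSketch {
    private ObjectRowSketch() {}

    // Assumed rowkey layout: "<class name>:<object id>".
    static String rowKey(Class<?> type, String objectId) {
        return type.getName() + ":" + objectId;
    }

    // Maps each non-null instance field to qualifier -> encoded value.
    static Map<String, byte[]> columns(Object instance) {
        Map<String, byte[]> cols = new TreeMap<>();
        for (Field f : instance.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue; // only real instance fields are part of the schema
            }
            try {
                f.setAccessible(true);
                Object value = f.get(instance);
                if (value != null) {
                    cols.put(f.getName(), encode(value));
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + f.getName(), e);
            }
        }
        return cols;
    }

    // Assumed encodings: fixed-width big-endian numbers, UTF-8 strings.
    private static byte[] encode(Object value) {
        if (value instanceof Integer) {
            return ByteBuffer.allocate(Integer.BYTES).putInt((Integer) value).array();
        }
        if (value instanceof Long) {
            return ByteBuffer.allocate(Long.BYTES).putLong((Long) value).array();
        }
        if (value instanceof String) {
            return ((String) value).getBytes(StandardCharsets.UTF_8);
        }
        throw new IllegalArgumentException("Unsupported type: " + value.getClass());
    }
}
```

Because the column set is derived from the class at runtime, a row simply lacks the column for any property that has no value.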
>>
>> Did think about coprocessors but the schema is discovered at runtime and I
>> can't hard-code it.
>>
>> However, I still believe that filters might work. Had a look
>> at SingleColumnValueFilter and this filter is able to target specific
>> column qualifiers with specific WritableByteArrayComparables.
>>
>> But list comparators are still missing... So I guess the only way is to
>> write these comparators?
>>
>> Do you follow my reasoning? Will it work?
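As a rough illustration of what such a list comparator would have to compute (not from the thread), the sketch below checks whether a cell value holding a long array contains a given item. `LongListMatcher` is a hypothetical name, and the 8-bytes-per-item big-endian layout is an assumption based on the description further down the thread. How this logic would be wired into a `WritableByteArrayComparable` subclass is deliberately left out.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not an HBase class: the core check a custom
// "list comparator" for long[] cell values would need.
final class LongListMatcher {
    private LongListMatcher() {}

    // Encodes longs as consecutive 8-byte big-endian items
    // (big-endian is ByteBuffer's default byte order).
    static byte[] encode(long... values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Long.BYTES);
        for (long v : values) {
            buf.putLong(v);
        }
        return buf.array();
    }

    // Returns true if the encoded cell value contains the wanted item.
    static boolean contains(byte[] cellValue, long wanted) {
        if (cellValue == null || cellValue.length % Long.BYTES != 0) {
            return false; // missing column, or not a well-formed long list
        }
        ByteBuffer buf = ByteBuffer.wrap(cellValue);
        while (buf.remaining() >= Long.BYTES) {
            if (buf.getLong() == wanted) {
                return true;
            }
        }
        return false;
    }
}
```

For example, `LongListMatcher.contains(LongListMatcher.encode(1L, 42L, 7L), 42L)` is true, while a value whose length is not a multiple of 8 is treated as a non-match rather than an error.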
>>
>>
>>
>>
>> On Fri, Jun 28, 2013 at 12:58 AM, Michael Segel
>> <[EMAIL PROTECTED]> wrote:
>>
>>> Ok...
>>>
>>> If you want to do type checking and schema enforcement...
>>>
>>> You will need to do this as a coprocessor.
>>>
>>> The quick and dirty way (not recommended) would be to hard-code the
>>> schema into the coprocessor code.
>>>
>>> A better way... at start up, load up ZK to manage the set of known table
>>> schemas, each of which would be a map of column qualifier to data type.
>>> (If JSON then you need to do a separate lookup to get the record's schema.)
>>>
>>> Then a single java class that does the look up and then handles the known
>>> data type comparators.
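A minimal sketch of that lookup-then-compare idea, not taken from the thread: `ColumnType` and `TypedCellComparator` are hypothetical names, the map is filled in by hand instead of being loaded from ZooKeeper, and the fixed-width big-endian encodings are assumptions.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical types for illustration; none of these are HBase classes.
enum ColumnType { INT, LONG, STRING }

final class TypedCellComparator {
    // Column qualifier -> data type. In the design described above this map
    // would come from ZooKeeper at start-up; here it is registered manually.
    private final Map<String, ColumnType> schema = new HashMap<>();

    void register(String qualifier, ColumnType type) {
        schema.put(qualifier, type);
    }

    // Compares two raw cell values of one column using its declared type.
    // Returns a negative, zero, or positive number, like Comparable.compareTo.
    int compare(String qualifier, byte[] left, byte[] right) {
        ColumnType type = schema.get(qualifier);
        if (type == null) {
            throw new IllegalArgumentException("Unknown column: " + qualifier);
        }
        switch (type) {
            case INT:
                return Integer.compare(ByteBuffer.wrap(left).getInt(),
                                       ByteBuffer.wrap(right).getInt());
            case LONG:
                return Long.compare(ByteBuffer.wrap(left).getLong(),
                                    ByteBuffer.wrap(right).getLong());
            case STRING:
                return new String(left, StandardCharsets.UTF_8)
                        .compareTo(new String(right, StandardCharsets.UTF_8));
            default:
                throw new AssertionError("Unhandled type: " + type);
        }
    }
}
```

The type lookup matters because a plain byte-wise comparison orders the two's-complement encoding of -1 after 1, whereas the typed comparison above orders it before.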
>>>
>>> Does this make sense?
>>> (Sorry, kinda was thinking this out as I typed the response. But it should
>>> work.)
>>>
>>> At least it would be a design approach I would take. YMMV
>>>
>>> Having said that, I expect someone to say it's a bad idea and that they
>>> have a better solution.
>>>
>>> HTH
>>>
>>> -Mike
>>>
>>> On Jun 27, 2013, at 5:13 PM, Kristoffer Sjögren <[EMAIL PROTECTED]> wrote:
>>>
>>>> I see your point. Everything is just bytes.
>>>>
>>>> However, the schema is known and every row is formatted according to this
>>>> schema, although some columns may not exist, that is, no value exists for
>>>> this property on this row.
>>>>
>>>> So if I'm able to apply these "typed comparators" to the right cell values
>>>> it may be possible? But I can't find a filter that targets specific
>>>> columns?
>>>>
>>>> Seems like all filters scan every column/qualifier and there is no way of
>>>> knowing which column is currently being evaluated?
>>>>
>>>>
>>>> On Thu, Jun 27, 2013 at 11:51 PM, Michael Segel
>>>> <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> You have to remember that HBase doesn't enforce any sort of typing.
>>>>> That's why this can be difficult.
>>>>>
>>>>> You'd have to write a coprocessor to enforce a schema on a table.
>>>>> Even then YMMV if you're writing JSON structures to a column because
>>>>> while the contents of the structures could be the same, the actual
>>>>> strings could differ.
>>>>>
>>>>> HTH
>>>>>
>>>>> -Mike
>>>>>
>>>>> On Jun 27, 2013, at 4:41 PM, Kristoffer Sjögren <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> I realize standard comparators cannot solve this.
>>>>>>
>>>>>> However I do know the type of each column so writing custom list
>>>>>> comparators for boolean, char, byte, short, int, long, float, double
>>>>>> seems quite straightforward.
>>>>>>
>>>>>> Long arrays, for example, are stored as a byte array with 8 bytes per
>>>>>> item, so a comparator might look like this.
>>>>>>
>>>>>> public class LongsComparator extends WritableByteArrayComparable {