HBase, mail # dev - Re: HBase Developer's Pow-wow.


Re: HBase Developer's Pow-wow.
Andrew Purtell 2012-09-10, 17:58
Hi Jacques,

> Does family level indexing make sense or is the real need for qualifier
> level indexing?

The use cases considered, at least over here at TM, all come down to
range scanning over values (e.g. WHERE INTEGER($value) < 50). So we
need a mapping such that a scan over the index returns either lists of
pointers to row:family:qualifier, or the value itself embedded in the
index, following the natural order of values in the primary table as
given by a comparator. We would want a number of projections like
this. A set of default comparators for interpreting values as
integers, longs, floating point, and complex JSON or Avro records
would be useful.
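
To make the range-scan requirement concrete, here is a minimal sketch of an index whose keys sort in the numeric order of the values, so that "WHERE INTEGER($value) < 50" becomes a scan over a key prefix range. It uses only the JDK (Java 9 or later); the class and method names are hypothetical and are not part of HBase or of any proposal in this thread.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch: an in-memory stand-in for an index table.
class ValueIndexSketch {

    // Encode a signed int so that unsigned byte-wise order matches numeric order.
    // Flipping the sign bit places negative numbers before non-negative ones.
    static byte[] encodeInt(int value) {
        return ByteBuffer.allocate(4).putInt(value ^ Integer.MIN_VALUE).array();
    }

    // Encoded value -> pointers of the form row:family:qualifier,
    // kept in unsigned lexicographic byte order.
    private final TreeMap<byte[], List<String>> index =
            new TreeMap<byte[], List<String>>(
                    (byte[] a, byte[] b) -> Arrays.compareUnsigned(a, b));

    void put(String row, String family, String qualifier, int value) {
        index.computeIfAbsent(encodeInt(value), k -> new ArrayList<>())
             .add(row + ":" + family + ":" + qualifier);
    }

    // Equivalent of "WHERE INTEGER($value) < bound": read the index from its
    // start up to, but not including, the encoded bound.
    List<String> scanLessThan(int bound) {
        List<String> pointers = new ArrayList<>();
        for (List<String> p : index.headMap(encodeInt(bound), false).values()) {
            pointers.addAll(p);
        }
        return pointers;
    }
}
```

The sign-bit flip plays the role of the "comparator" for integers here: the interpretation of the value is baked into the key encoding, so the index can be scanned with plain byte ordering.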

> What are ideas for a client interface and how transparent is index usage?
>  (E.g. if you set a filter on a qualifier... )

It would be nice if the existing client API could handle it somehow.
Get, Put, Increment, Scan: all of these API objects can transmit
arbitrary attributes from the client to the server. It would be low
friction for a user to modify their use of these existing API objects,
rather than use a completely different interface like coprocessor
Endpoint invocations. (Or, if Endpoints are used, a client library
should at least hide that.)
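
As a rough illustration of the attribute idea, the sketch below tags an ordinary operation object with an index hint on the client and reads it back on the server. OperationStandIn is a self-contained stand-in written for this example so that it runs without HBase on the classpath; it mirrors the setAttribute/getAttribute pattern of the client operation objects. The attribute name and the helper methods are assumptions, not anything defined by HBase.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Stand-in for a client operation object (Get, Put, Scan, ...) that carries
// named byte[] attributes. Hypothetical, for illustration only.
class OperationStandIn {
    private final Map<String, byte[]> attributes = new HashMap<>();

    OperationStandIn setAttribute(String name, byte[] value) {
        attributes.put(name, value);
        return this;
    }

    byte[] getAttribute(String name) {
        return attributes.get(name);
    }
}

class IndexHintSketch {
    // Hypothetical attribute name; not defined by HBase.
    static final String USE_INDEX = "example.index.name";

    // Client side: the usual operation object plus one extra attribute.
    static OperationStandIn scanWithIndex(String indexName) {
        return new OperationStandIn()
                .setAttribute(USE_INDEX, indexName.getBytes(StandardCharsets.UTF_8));
    }

    // Server side (for example in a coprocessor hook): return the requested
    // index name, or null when no hint was sent and a normal scan applies.
    static String chooseIndex(OperationStandIn op) {
        byte[] hint = op.getAttribute(USE_INDEX);
        return hint == null ? null : new String(hint, StandardCharsets.UTF_8);
    }
}
```

The point of the pattern is that the calling code keeps its existing shape: only one extra attribute distinguishes an indexed request from a plain one.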

> What were the challenges and issues with the proof of concept TrendMicro
> approach that ultimately made it untenable? (was an eventually consistent
> approach)

This was simply a prototype implementation quality issue; there is
nothing wrong with an eventually consistent approach per se.

> Is it important to colocate/duplicate indexed values and/or additional
> portions of data in secondary indices to minimize disk seeks (almost making
> HBase optionally more columnar in nature)?

I do think we want to offer the Megastore-like option of storing
value data in indexes, and also the option not to. Then we can manage
the tradeoff between minimizing seeks and round trips and increased
storage utilization on a per-index basis, according to the needs of
the use case.
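
The per-index tradeoff can be sketched with a toy model: an index created with value storage answers a scan from the index alone, while a pointer-only index pays one lookup in the primary table per entry but stores less. Everything here (class name, fields, counters) is hypothetical and only meant to show where the costs land.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; names are illustrative, not HBase APIs.
class PerIndexStorageSketch {
    private final Map<String, String> primary = new HashMap<>();  // row key -> value
    // Entries sorted by value; each entry holds {row key, embedded value or null}.
    private final TreeMap<String, String[]> index = new TreeMap<>();
    private final boolean storeValues;  // the per-index choice
    int primaryLookups = 0;             // extra reads paid by a pointer-only index
    int embeddedChars = 0;              // extra storage paid by a value-storing index

    PerIndexStorageSketch(boolean storeValues) {
        this.storeValues = storeValues;
    }

    void put(String row, String value) {
        primary.put(row, value);
        // Append the row key so that equal values from different rows stay distinct.
        index.put(value + "\0" + row, new String[] { row, storeValues ? value : null });
        if (storeValues) {
            embeddedChars += value.length();
        }
    }

    // Return all values in index order.
    List<String> scanInValueOrder() {
        List<String> out = new ArrayList<>();
        for (String[] entry : index.values()) {
            if (storeValues) {
                out.add(entry[1]);               // answered from the index alone
            } else {
                primaryLookups++;                // follow the pointer to the primary table
                out.add(primary.get(entry[0]));
            }
        }
        return out;
    }
}
```

With the same data, both variants return the same values in the same order; they differ only in the two counters, which is the knob one would want to set per index.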

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet
Hein (via Tom White)