HBase >> mail # dev >> Re: HBase Developer's Pow-wow.


Jacques 2012-09-10, 03:03
Re: HBase Developer's Pow-wow.
more food for thought on secondary indexing...

*Additional questions*:

   - How important is indexing column qualifiers themselves (similar to
   Cassandra where people frequently utilize column qualifiers as "values"
   with no actual values stored)?
   - How important is indexing cell timestamps?

*More thoughts/my answers on some of the questions I posed:*

   - From my experience, indexes should be at the region level (i.e.
   row-level sharding as opposed to term-level sharding).  Other sharding
   approaches will likely have scale and consistency problems.
   - In general, there seems to be tension between the two main low-level
   approaches: (1) leverage as much HBase infrastructure as possible (e.g.
   secondary tables) and (2) leverage an efficient indexing library, e.g.
   Lucene.

*Approach Thoughts*
Trying to leverage HBase as much as possible is hard if we want to use the
region-level approach above and have consistent indexing.  However, I think
we can do it if we add support for what I will call a "local shadow family".
These are additional, internally managed families for a table.  However,
they have the special characteristic that they belong to the region despite
their row keys being outside the region's key range.  Otherwise they look
like a typical family.  On splits, they are regenerated (somehow).  If
we take advantage of Lars'
HBASE-5229<https://issues.apache.org/jira/browse/HBASE-5229>,
we then have the opportunity to consistently insert one or more rows into
these local shadow families for the purpose of secondary indexing.  In
these secondary families, the row key could be the indexed value, the
qualifier could identify a specific store file, and the cell value could be
a list of originating keys (maintained using read-append or
HBASE-5993<https://issues.apache.org/jira/browse/HBASE-5993>).
By leveraging the existing family infrastructure, we get things like
optional in-memory indexes and basic scanners for free and don't have to
swallow a big chunk of external indexing code.
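
To make the proposed layout concrete, here is a minimal in-memory sketch in Python. It is only a toy model of the idea, not HBase code: `ShadowFamily` and `put_with_index` are hypothetical names, plain dicts stand in for the primary family and the shadow family, and the two updates in `put_with_index` are assumed to be applied atomically within the region (the role HBASE-5229 would play in the proposal).

```python
from collections import defaultdict


class ShadowFamily:
    """Toy stand-in for one local shadow family of a region (hypothetical)."""

    def __init__(self):
        # indexed value (row key of the shadow family)
        #   -> store-file qualifier
        #     -> list of originating row keys (the cell value)
        self.rows = defaultdict(lambda: defaultdict(list))

    def append(self, indexed_value, store_file, origin_key):
        """Append an originating row key to the cell for (value, store file)."""
        self.rows[indexed_value][store_file].append(origin_key)

    def lookup(self, indexed_value):
        """Return originating keys for a value, merged across store files."""
        cells = self.rows.get(indexed_value, {})
        keys = set()
        for origin_keys in cells.values():
            keys.update(origin_keys)
        return sorted(keys)


def put_with_index(primary, shadow, row_key, value, store_file="memstore"):
    """Write the primary cell and its index entry together.

    Assumption: in the proposal both writes would be one atomic region-local
    operation; here they are simply two in-memory updates.
    """
    primary[row_key] = value
    shadow.append(value, store_file, row_key)


primary = {}
shadow = ShadowFamily()
put_with_index(primary, shadow, "row-17", "blue")
put_with_index(primary, shadow, "row-03", "blue")
put_with_index(primary, shadow, "row-09", "red")
print(shadow.lookup("blue"))  # ['row-03', 'row-17']
```

Keeping one qualifier per store file means the index entries that point into a given store file can be found and dropped as a unit, which is what the per-storefile handling of deletions relies on.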

The simplest approach for integrating these into queries would be,
internally, a ScannerBasedFilter (a filter that is based on a scanner)
and a GroupingScanner (a scanner that does intersection and/or union of
scanners for multi-criteria queries).  Implementation of these scanners
could happen at one of two levels:

   - StoreScanner level: A more efficient approach using the store file
   qualifier approach above (this allows easier maintenance of index
   deletions)
   - RegionScanner level: A simpler implementation with less violation of
   existing encapsulation.  We'd store row keys in qualifiers instead of
   values to ensure ordering that works iteratively with RegionScanner.  The
   weaknesses of this approach are less efficient scanning and figuring out
   how to manage primary value deletes.
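
The two building blocks named above can be sketched over plain sorted iterators. This is a rough Python illustration under stated assumptions, not HBase scanner code: every "scanner" is assumed to yield unique row keys in ascending order, and `intersect_sorted`, `union_sorted` and `scanner_based_filter` are hypothetical helper names standing in for GroupingScanner and ScannerBasedFilter.

```python
import heapq


def intersect_sorted(*scanners):
    """Yield keys present in every scanner (each yields sorted, unique keys)."""
    iters = [iter(s) for s in scanners]
    if not iters:
        return
    try:
        current = [next(it) for it in iters]
        while True:
            high = max(current)
            if all(key == high for key in current):
                yield high
                current = [next(it) for it in iters]
            else:
                # Advance every lagging scanner up to the current maximum.
                for i, it in enumerate(iters):
                    while current[i] < high:
                        current[i] = next(it)
    except StopIteration:
        # One scanner is exhausted, so no further common keys can exist.
        return


def union_sorted(*scanners):
    """Yield every key seen by any scanner, once, in sorted order."""
    last = object()  # sentinel that never equals a real key
    for key in heapq.merge(*scanners):
        if key != last:
            yield key
            last = key


def scanner_based_filter(primary_rows, index_scanner):
    """Keep only (key, value) rows whose key is emitted by the index scanner.

    Both inputs are assumed to be sorted by row key, so this is a merge join.
    """
    index_iter = iter(index_scanner)
    wanted = next(index_iter, None)
    for key, value in primary_rows:
        while wanted is not None and wanted < key:
            wanted = next(index_iter, None)
        if wanted is None:
            return
        if wanted == key:
            yield key, value


color_is_blue = ["r1", "r3", "r5", "r7"]
size_is_large = ["r3", "r4", "r7", "r9"]
rows = [("r1", "a"), ("r3", "b"), ("r4", "c"), ("r7", "d")]

both = intersect_sorted(color_is_blue, size_is_large)
print(list(scanner_based_filter(rows, both)))  # [('r3', 'b'), ('r7', 'd')]
```

The merge-style iteration is why the ordering of originating keys matters at the RegionScanner level: it only works if each index scanner returns keys in the same order as the primary scan.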

In general, the best way to deal with deletes is probably to age them out
per storefile and just filter "near misses" with a secondary filter that
works with ScannerBasedFilter.  The client side would be TBD, but it would
probably offer some kind of criteria filters whose lower-level
ramifications are all handled on the server side.
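
As an illustration of the "near miss" idea, the sketch below re-checks each candidate row key returned by a possibly stale index against the primary data before returning it. It is a toy Python model under assumptions: `filter_near_misses` is a hypothetical name, and the primary data is a plain dict of row key to column map rather than a real region.

```python
def filter_near_misses(candidate_keys, primary, column, expected):
    """Yield only the candidates whose primary row still holds the value.

    The index may still list rows that were deleted or overwritten since the
    index entry was written; those "near misses" are dropped here.
    """
    for key in candidate_keys:
        row = primary.get(key)
        if row is not None and row.get(column) == expected:
            yield key


primary = {
    "r1": {"color": "blue"},
    "r2": {"color": "red"},  # was blue when the index entry was written
    # "r3" has been deleted from the primary family
}
stale_index_hits = ["r1", "r2", "r3"]
print(list(filter_near_misses(stale_index_hits, primary, "color", "blue")))
# ['r1']
```

With a check like this in place, the index only has to be a superset of the true matches, so stale entries can linger until their store file is aged out.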

*Future Optimizations*
In a perfect world, we'd actually use StoreFile block start locations as
the index pointer values in the secondary families.  This would make things
much more compact and efficient, especially if we used a smarter block
codec that took advantage of this structure.  However, this requires quite
a bit more work, since we'd need to actually use the primary keys in the
secondary memstore and then "patch" the values to block locations as we
flushed the primary family that we were indexing (ugh).
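
A rough sketch of what the "patch" step might look like, in Python and under loud assumptions: the flushed store file is represented only by two parallel lists (the first row key of each block and that block's start offset), and `patch_to_block_offsets` is a hypothetical helper, not anything that exists in HBase.

```python
from bisect import bisect_right


def patch_to_block_offsets(index_entries, block_first_keys, block_offsets):
    """Replace primary row keys in index entries with block start offsets.

    index_entries:    indexed value -> list of primary row keys (memstore form)
    block_first_keys: sorted first row key of each block in the flushed file
    block_offsets:    start offset of each block, parallel to block_first_keys
    """
    patched = {}
    for indexed_value, row_keys in index_entries.items():
        offsets = set()
        for row_key in row_keys:
            # Last block whose first key is <= row_key contains the row.
            pos = bisect_right(block_first_keys, row_key) - 1
            if pos < 0:
                raise KeyError(f"{row_key!r} precedes the first block")
            offsets.add(block_offsets[pos])
        patched[indexed_value] = sorted(offsets)
    return patched


memstore_index = {"blue": ["b", "c", "n"], "red": ["z"]}
patched = patch_to_block_offsets(memstore_index, ["a", "m"], [0, 65536])
print(patched)  # {'blue': [0, 65536], 'red': [65536]}
```

In this toy example three row keys for "blue" collapse into two block pointers, which is where the compactness would come from; the cost is that a reader then has to scan the whole block to find the matching rows.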

Assuming that the primary limiter of peak write throughput for HBase is
typically WAL writing, and since indexes hold no "real" data, we could
consider disabling the WAL for local shadow families and simply regenerating
this data upon primary WAL playback.  I haven't spent enough time in that
code to know what kind of consistency pain this would cause (my intuition is
it would be fine as long as we didn't fix
HBASE-3149<https://issues.apache.org/jira/browse/HBASE-3149>).
If consistency isn't a problem, this would be a nice option since it means
that indexing would have minimal impact on peak write throughput.
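
The regeneration idea can be sketched as follows: only primary edits are logged, and the index is rebuilt as a side effect of replaying them. This is a simplified Python model, not the HBase recovery path. The `(op, row_key, value)` edit format and the `replay_wal` name are assumptions, each row holds a single non-None indexed value, and the index is kept exact during replay rather than aged out lazily.

```python
def replay_wal(wal_edits):
    """Rebuild primary data and the secondary index from primary WAL edits.

    wal_edits: iterable of (op, row_key, value) with op in {"put", "delete"};
    value is ignored for deletes.
    """
    primary = {}
    index = {}  # indexed value -> set of originating row keys
    for op, row_key, value in wal_edits:
        if op not in ("put", "delete"):
            raise ValueError(f"unknown WAL op: {op!r}")
        # Remove the index entry for whatever the row held before this edit.
        old = primary.pop(row_key, None)
        if old is not None:
            index[old].discard(row_key)
            if not index[old]:
                del index[old]
        if op == "put":
            primary[row_key] = value
            index.setdefault(value, set()).add(row_key)
    return primary, index


edits = [
    ("put", "r1", "blue"),
    ("put", "r2", "blue"),
    ("put", "r1", "red"),
    ("delete", "r2", None),
]
primary, index = replay_wal(edits)
print(primary)  # {'r1': 'red'}
print(index)    # {'red': {'r1'}}
```

Because the index is a pure function of the primary edits, nothing index-specific needs to be written to the log, which is the source of the throughput saving.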

*I haven't thought at all about...*

   - How/whether this makes sense to be implemented as a coprocessor.
   - Weird timestamp impacts/considerations here.
   - Version handling/impacts.

On Sun, Sep 9, 2012 at 8:03 PM, Jacques <[EMAIL PROTECTED]> wrote:
