HBase >> mail # dev >> what's the roadmap of secondary index of hbase?


RE: what's the roadmap of secondary index of hbase?
I've started a wiki page:  http://wiki.apache.org/hadoop/Hbase/SecondaryIndexing

I gave a basic description of the idea I had and the open questions.

Let's get all our thoughts in there.

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Stack
> Sent: Friday, February 25, 2011 4:07 PM
> To: [EMAIL PROTECTED]
> Subject: Re: what's the roadmap of secondary index of hbase?
>
> The MegaStore paper,
> http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf, in section
> 3.2.2, lists the secondary indexing options MegaStore provides on top of
> BigTable.  For example, MS allows specifying a secondary index on protobuf
> cell content, or duplicating data into the secondary index so you have the
> data to hand to satisfy the first query and go dig in the primary table only
> if the client wants more.  It also talks about how secondary indices can be
> described using their schema, which might be of use.  Might be worth a gander.
>
> St.Ack
>
> On Fri, Feb 25, 2011 at 3:32 PM, Stack <[EMAIL PROTECTED]> wrote:
> > On Fri, Feb 25, 2011 at 1:47 PM, Eugene Koontz <[EMAIL PROTECTED]> wrote:
> >> I'm thinking that we could use a coprocessor that watches the
> >> Write-Ahead Log (using the WAL-edit operations
> >>  https://issues.apache.org/jira/browse/HBASE-3257 "Coprocessors:
> >> Extend server side integration API to include HLog operations"). This
> >> coprocessor would process these edits, perhaps filtering or
> >> transforming them, and enqueue the results in a global queue. A
> >> separate process would be responsible for pulling operations off the
> >> queue and using HBase client operations to do the insert into a
> >> secondary index table appropriate for that operation.
> >>    Perhaps we could use some of the work that the Lily people have
> >> done with HBase indexing (see
> >> http://www.lilyproject.org/lily/about/playground/hbaseindexes.html)
> >> in order to do the edit->hbase operation transformations and the
> >> secondary index table creation.
> >
> > This sounds good as a first approach (including the Lily part).
> >
> > St.Ack
> >
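
To make the queue-based proposal quoted above a bit more concrete, here is a rough
sketch of the data flow: an observer sees each edit, filters it, and enqueues it;
a separate step drains the queue and writes the secondary index table. This is
only an illustration under stated assumptions. It does not use the HBase or
coprocessor APIs at all: plain Python dicts stand in for the primary and index
tables, a deque stands in for the global queue, and every name in it
(wal_observer, drain_queue, INDEXED_COLUMN, the "info:email" column, the
value + NUL + row index key layout) is hypothetical.

```python
from collections import deque

# In-memory stand-ins (assumptions for illustration, not HBase APIs).
primary_table = {}    # primary row key -> {column: value}
index_table = {}      # index row key -> primary row key
edit_queue = deque()  # plays the role of the "global queue"

INDEXED_COLUMN = "info:email"  # hypothetical column to index


def wal_observer(row, column, value):
    """Stand-in for the coprocessor hook: filter the edit, then enqueue it."""
    if column != INDEXED_COLUMN:
        return  # filter out edits that do not touch the indexed column
    edit_queue.append((value, row))


def put(row, column, value):
    """Write to the primary table and let the observer see the edit."""
    primary_table.setdefault(row, {})[column] = value
    wal_observer(row, column, value)


def index_row_key(value, row):
    """Index key = indexed value + separator + primary row key."""
    return f"{value}\x00{row}"


def drain_queue():
    """Stand-in for the separate process that applies queued edits."""
    applied = 0
    while edit_queue:
        value, row = edit_queue.popleft()
        index_table[index_row_key(value, row)] = row
        applied += 1
    return applied


def lookup(value):
    """Prefix scan over the index table for one indexed value."""
    prefix = f"{value}\x00"
    return sorted(r for k, r in index_table.items() if k.startswith(prefix))
```

Calling put() a few times and then lookup() before drain_queue() returns nothing,
which shows the index lagging behind the primary table until the queue has been
drained. The sketch also ignores updates and deletes, so overwriting an indexed
value would leave a stale index entry behind.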