HBase >> mail # dev >> HBase Developer's Pow-wow.


RE: HBase Developer's Pow-wow.
Hi

Yes, we can have a separate index table alongside the main table, and the
master should ensure that the regions of both tables are colocated during
assignment.

The regions of the index table can mirror those of the main table, in the
sense that both should have the same start and end keys.
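
A rough sketch of that colocation idea, assuming both tables are split on
identical key boundaries. This is plain Python for illustration only, not
HBase master code; assign_colocated, the region dictionaries, and the
round-robin placement are all hypothetical choices.

```python
from itertools import cycle


def assign_colocated(main_regions, index_regions, servers):
    """Place each main-table region and the index-table region that has
    the same (start, end) key range on the same server.

    Regions are dicts with "name", "start" and "end" keys (hypothetical
    shape). Servers are handed out round-robin per region pair.
    """
    if not servers:
        raise ValueError("at least one server is required")

    # Look up index regions by their key boundaries.
    index_by_range = {(r["start"], r["end"]): r for r in index_regions}

    plan = {}
    server_cycle = cycle(servers)
    for region in main_regions:
        server = next(server_cycle)
        plan[region["name"]] = server
        twin = index_by_range.get((region["start"], region["end"]))
        if twin is not None:
            # The index region travels with its main-table region.
            plan[twin["name"]] = server
    return plan
```

Pairing by (start, end) only works while the two tables keep identical
boundaries, so a split of a main-table region would also have to split its
index region at the same key.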

Different indices can be grouped within these regions.  

In the case of sparse data, creating the index is definitely going to be
beneficial.
In the case of dense data, the indices may be an overhead in some cases.

In one of the Cassandra wiki pages I also read that they suggest having
at least one EQUALS condition in any query that tries to use indices. This
helps confine the results to a specific set, over which the range
conditions can then be applied.  So maybe at the first level we can see what
gain we get when we use an EQUALS condition, but in any case the framework
can be generic enough to handle both range queries and EQUALS queries.
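
The EQUALS-then-range pattern can be sketched as follows. This is plain
Python over a sorted list standing in for a sorted index, not Cassandra or
HBase API; the INDEX contents, the (city, age, row key) layout, and the
lookup function are made up for illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index entries: (city, age, primary_row_key), kept sorted
# the way a sorted key-value store would keep its index rows.
INDEX = sorted([
    ("berlin", 25, "row3"),
    ("berlin", 31, "row7"),
    ("berlin", 40, "row1"),
    ("paris", 22, "row5"),
    ("paris", 35, "row2"),
])


def lookup(index, city, min_age, max_age):
    """Return primary row keys where city == `city` and
    min_age <= age <= max_age.

    The EQUALS condition on city confines the scan to one contiguous
    slice of the sorted index; the age range is applied inside it.
    """
    # "" sorts before any row key and "\uffff" after the keys used here.
    lo = bisect_left(index, (city, min_age, ""))
    hi = bisect_right(index, (city, max_age, "\uffff"))
    return [row for _, _, row in index[lo:hi]]


print(lookup(INDEX, "berlin", 30, 45))  # ['row7', 'row1']
```

Without the EQUALS prefix, a range on age alone would match entries
scattered across every city, so the scan could not be limited to one slice.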

After the meetup is over, I can go through the discussion topics and
share our experiences as well.

Regards
Ram
> -----Original Message-----
> From: Andrew Purtell [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, September 11, 2012 9:52 AM
> To: [EMAIL PROTECTED]
> Subject: Re: HBase Developer's Pow-wow.
>
> Regarding this:
>
> On Mon, Sep 10, 2012 at 12:13 PM, Matt Corgan <[EMAIL PROTECTED]>
> wrote:
> > 1) Per-region or Per-table
> [...]
> > 1)
> > - Per-region: the index entries are stored on the same machine as the
> > primary rows
> > - Per-table: each index is stored in a separate table, requiring
> > cross-server consistency
>
> LarsH and I were discussing this a bit. This doesn't have to be a
> choice, it could be possible to have both, a separate table for index
> storage, and colocation of the index table regions and primary table
> regions on the same regionserver so cross-region consistency issues
> can be dealt with through low latency in-memory channels. (With
> fallback to cross-server consistency mechanism when placement can't be
> ideal when the cluster is out of steady state due to failure/churn.)
> The master might assign primary and index regions out together as a
> group.
>
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet
> Hein (via Tom White)