Re: Regarding Indexing columns in HBASE
Rams - you might enjoy this blog post from HBase committer Jesse Yates (from last summer):

http://jyates.github.io/2012/07/09/consistent-enough-secondary-indexes.html

Secondary Indexing doesn't exist in HBase core today, but there are various proposals and early implementations of it in flight.

In the meantime, as Mike and others have said, if you don't need the indexes to be immediately consistent in a real-time write scenario, you can simply write the same data into multiple tables in different sort orders. (This is hard in a real-time write scenario because, without cross-table transactions, you'd have to handle all the cases where the record was written but the index wasn't, or vice versa.)
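
To make the trade-off concrete, here is a minimal sketch of the "same data, multiple sort orders" approach. It is not HBase client code: plain Python dicts stand in for tables, and the names (`by_id`, `by_city`, `put_record`, `missing_from_index`) and the `city|id` key layout are assumptions made up for illustration. It also simulates the failure case mentioned above, where one write lands and the other does not.

```python
# Toy stand-in for HBase tables: each "table" is a dict of row key -> columns.
# Real HBase keeps rows sorted by key; here sorted() is applied at scan time.

def scan_prefix(table, prefix):
    """Return (row_key, columns) pairs whose key starts with prefix, in key order."""
    return [(k, table[k]) for k in sorted(table) if k.startswith(prefix)]

def put_record(by_id, by_city, record):
    """Write one record into both tables. The two writes are NOT atomic."""
    by_id[record["id"]] = record
    by_city[record["city"] + "|" + record["id"]] = record

def missing_from_index(by_id, by_city):
    """Find records present in the primary table but absent from the second table."""
    return sorted(
        rid for rid, rec in by_id.items()
        if rec["city"] + "|" + rid not in by_city
    )

by_id, by_city = {}, {}
put_record(by_id, by_city, {"id": "u1", "city": "austin", "name": "Ann"})
put_record(by_id, by_city, {"id": "u2", "city": "boston", "name": "Bob"})
put_record(by_id, by_city, {"id": "u3", "city": "austin", "name": "Cy"})

# Simulate a partial failure: the primary write lands, the second one does not.
by_id["u4"] = {"id": "u4", "city": "boston", "name": "Dee"}

# Reading by city is now a prefix scan on the second table.
austin_ids = [cols["id"] for _, cols in scan_prefix(by_city, "austin|")]

# A periodic repair job could look for rows like this and re-write them.
orphans = missing_from_index(by_id, by_city)
```

In a batch or "consistent enough" setting, a reconciliation pass like `missing_from_index` may be all you need; in a real-time write path you would have to decide what readers see in the window before the repair runs.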

Ian

On Jun 4, 2013, at 12:22 PM, Ramasubramanian Narayanan wrote:

Hi Michel,

If you don't mind, can you please explain in detail ...

Also, can you please let me know whether we have secondary indexes in HBase?

regards,
Rams
On Tue, Jun 4, 2013 at 1:13 PM, Michel Segel <[EMAIL PROTECTED]> wrote:

Quick and dirty...

Create an inverted table for each index...
Then you can take the intersection of the result set(s) to get your list of rows for further filtering.

There is obviously more to this, but it's the core idea...
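
A rough sketch of that core idea, using in-memory Python structures instead of real HBase tables. The helper names (`build_inverted`, `lookup`) and the sample columns are hypothetical; in HBase each inverted "table" would be a separate table keyed by column value, with the matching row keys stored against it.

```python
from collections import defaultdict

def build_inverted(rows, column):
    """Map each value of `column` to the set of row keys that hold it."""
    index = defaultdict(set)
    for row_key, cols in rows.items():
        if column in cols:
            index[cols[column]].add(row_key)
    return index

def lookup(indexes, criteria):
    """Intersect the row-key sets returned by each per-column index."""
    key_sets = [indexes[col].get(val, set()) for col, val in criteria.items()]
    return sorted(set.intersection(*key_sets)) if key_sets else []

# Base table: row key -> columns.
rows = {
    "r1": {"city": "austin", "status": "active"},
    "r2": {"city": "austin", "status": "closed"},
    "r3": {"city": "boston", "status": "active"},
}

# One inverted table per indexed column.
indexes = {col: build_inverted(rows, col) for col in ("city", "status")}

# Row keys matching both criteria; fetch these from the base table
# and apply any further filtering there.
matches = lookup(indexes, {"city": "austin", "status": "active"})
```

The intersection only narrows the candidate row keys; the actual column data still has to be fetched from the base table.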
Sent from a remote device. Please excuse any typos...

Mike Segel

On Jun 4, 2013, at 11:51 AM, Shahab Yunus <[EMAIL PROTECTED]> wrote:

Just a quick thought: why don't you create different tables and duplicate the data, i.e. go for denormalization and data redundancy? Are all the read access patterns that require the 70 columns incorporated into one application/client, or will there be a bunch of different clients/applications?
If that is not the case, then why not take advantage of more storage?

Regards,
Shahab
On Tue, Jun 4, 2013 at 12:43 PM, Ramasubramanian Narayanan <[EMAIL PROTECTED]> wrote:

Hi,

In an HBase table there are 200 columns, and the read patterns of different systems involve 70 columns...
In the above case we cannot put 70 columns in the rowkey, as that would not be a good design...

Can you please suggest how to handle this problem?
Also, can we do indexing in HBase on anything apart from the rowkey? (something called a secondary index)

regards,
Rams