HBase, mail # user - How HBase perform per-column scan?


Re: How HBase perform per-column scan?
Anoop John 2013-03-10, 15:53
As said above, you will need a full table scan on that CF.
As Ted said, consider having a look at your schema design.

-Anoop-
On Sun, Mar 10, 2013 at 8:10 PM, Ted Yu <[EMAIL PROTECTED]> wrote:

> bq. physically column family should be able to perform efficiently (storage layer
>
> When you scan a row, data for different column families would be brought
> into memory (if you don't utilize HBASE-5416)
> Take a look at:
>
> https://issues.apache.org/jira/browse/HBASE-5416?focusedCommentId=13541258&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13541258
>
> which was based on the settings described in:
>
>
> https://issues.apache.org/jira/browse/HBASE-5416?focusedCommentId=13541191&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13541191
>
> This boils down to your schema design. If possible, consider extracting
> column C into its own column family.
>
> Cheers
>
> On Sun, Mar 10, 2013 at 7:14 AM, PG <[EMAIL PROTECTED]> wrote:
>
> > Hi, Ted and Anoop, thanks for your notes.
> > I am talking about column rather than column family, since physically
> > column family should be able to perform efficiently (storage layer, CF's
> > are stored separately). But columns of the same column family may be
> > mixed physically, and that makes filtering on column values hard... So I
> > want to know if there is any mechanism in HBase that addresses this...
> > Regards,
> > Yun
> >
> > On Mar 10, 2013, at 10:01 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> >
> > > Hi, Yun:
> > > Take a look at HBASE-5416 (Improve performance of scans with some kind
> > > of filters) which is in 0.94.5 release.
> > >
> > > In your case, you can use a filter which specifies column C as the
> > > essential family.
> > > Here I interpret column C as column family.
> > >
> > > Cheers
> > >
> > > On Sat, Mar 9, 2013 at 11:11 AM, yun peng <[EMAIL PROTECTED]> wrote:
> > >
> > >> Hi, All,
> > >> I want to find all existing values for a given column in an HBase table.
> > >> Would that result in a full-table scan? For example, given a column C,
> > >> the table has a very large number of rows, of which few (say only 1
> > >> row) have non-empty values for column C. Would HBase still use a full
> > >> table scan to find this row? Or does HBase have any optimization for
> > >> this kind of query?
> > >> Thanks...
> > >> Regards
> > >> Yun
> > >>
> >
>
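
To illustrate the schema-design point made in the thread (a scan for column C has to walk the whole column family that holds it, so moving C into its own family shrinks the work), here is a minimal toy sketch in plain Java. It is not HBase code and does not use the HBase client API: the class ColumnScanSketch, its methods, and the family names "d" and "c" are all hypothetical, and the model ignores HFile blocks, bloom filters, caching, and everything else a real region server does. It only counts how many cells a per-family scan would visit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only: one sorted map per column family, loosely mimicking per-family storage. */
public class ColumnScanSketch {
    // family -> rowKey (sorted) -> qualifier -> value
    private final Map<String, TreeMap<String, Map<String, String>>> families = new HashMap<>();

    /** Number of cells visited by the most recent call to rowsWithColumn. */
    public long cellsRead;

    public void put(String family, String row, String qualifier, String value) {
        families.computeIfAbsent(family, f -> new TreeMap<>())
                .computeIfAbsent(row, r -> new HashMap<>())
                .put(qualifier, value);
    }

    /**
     * Returns the row keys that hold a value for family:qualifier.
     * Every cell stored in that family is visited, because columns of the
     * same family are interleaved; other families are never touched.
     */
    public List<String> rowsWithColumn(String family, String qualifier) {
        cellsRead = 0;
        List<String> hits = new ArrayList<>();
        TreeMap<String, Map<String, String>> store =
                families.getOrDefault(family, new TreeMap<>());
        for (Map.Entry<String, Map<String, String>> row : store.entrySet()) {
            for (String q : row.getValue().keySet()) {
                cellsRead++;
                if (q.equals(qualifier)) {
                    hits.add(row.getKey());
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        ColumnScanSketch shared = new ColumnScanSketch(); // C lives in the wide family "d"
        ColumnScanSketch split = new ColumnScanSketch();  // C lives in its own family "c"
        for (int i = 0; i < 1000; i++) {
            String row = String.format("row%04d", i);
            shared.put("d", row, "a", "x");
            shared.put("d", row, "b", "y");
            split.put("d", row, "a", "x");
            split.put("d", row, "b", "y");
        }
        shared.put("d", "row0500", "C", "rare");
        split.put("c", "row0500", "C", "rare");

        System.out.println(shared.rowsWithColumn("d", "C") + " cells read: " + shared.cellsRead);
        System.out.println(split.rowsWithColumn("c", "C") + " cells read: " + split.cellsRead);
    }
}
```

In this toy setup both layouts find row0500, but the shared-family layout visits 2001 cells while the split layout visits 1. Real numbers depend on the actual storage layout and HBase version; the sketch only shows the direction of the effect.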