

Re: question about merge-join (or AND operator between columns)
I don't think this is possible at the scanner level, even with bloom
filters: each column family is stored in separate files, so the
families are scanned independently.
You can, however, use filters to drop the data you don't need.
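
To make the AND semantics concrete, here is a minimal sketch of the check a filter would have to perform: keep a row only if every requested family is present. It is a plain in-memory model in Java, not HBase code; the class `AndScanSketch` and the methods `sampleTable` and `rowsWithAllFamilies` are hypothetical names made up for illustration, and the sample data just mirrors the 'mattest' rows from the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// In-memory model only: no HBase classes are used here (all names are hypothetical).
class AndScanSketch {

    // Models the 'mattest' rows from the question: row key -> column families present.
    static Map<String, Set<String>> sampleTable() {
        Map<String, Set<String>> table = new LinkedHashMap<>();
        table.put("1", new TreeSet<>(List.of("generic", "photo", "type")));
        table.put("2", new TreeSet<>(List.of("generic", "type", "video")));
        return table;
    }

    // Returns the keys of rows that contain every required family (AND semantics).
    static List<String> rowsWithAllFamilies(Map<String, Set<String>> table,
                                            Set<String> required) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Set<String>> row : table.entrySet()) {
            // A plain scan over ['generic', 'photo'] returns any row with either
            // family; here a row must have all of them to be kept.
            if (row.getValue().containsAll(required)) {
                matches.add(row.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Prints [1]: row 2 has 'generic' but no 'photo', so it is dropped.
        System.out.println(rowsWithAllFamilies(sampleTable(), Set.of("generic", "photo")));
    }
}
```

In a real table the same test could run client-side over the scan results, or server-side in a filter so that non-matching rows are never sent to the client. Which filter class fits best depends on your HBase version, so treat this as the logic to implement rather than a ready-made solution.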

2011/1/8 Jack Levin <[EMAIL PROTECTED]>

> Hello all, I have a scanner question, we have this table:
>
> hbase(main):002:0> scan 'mattest'
> ROW                                          COLUMN+CELL
>  1                                           column=generic:,
> timestamp=1294454057618, value=1
>  1                                           column=photo:,
> timestamp=1294453830339, value=1
>  1                                           column=type:,
> timestamp=1294453812716, value=photo
>  1                                           column=type:photo,
> timestamp=1294453884174, value=photo
>  2                                           column=generic:,
> timestamp=1294454061156, value=1
>  2                                           column=type:,
> timestamp=1294453851757, value=video
>  2                                           column=type:video,
> timestamp=1294453877719, value=video
>  2                                           column=video:,
> timestamp=1294453842722, value=1
>
> We need to run this query:
>
> hbase(main):004:0> scan 'mattest', {COLUMNS => ['generic', 'photo']}
> ROW                                          COLUMN+CELL
>  1                                           column=generic:,
> timestamp=1294454057618, value=1
>  1                                           column=photo:,
> timestamp=1294453830339, value=1
>  2                                           column=generic:,
> timestamp=1294454061156, value=1
>
> Note that ['generic', 'photo'] uses the 'OR' operator, not 'AND'.
> Is it possible to create a scanner that will AND and not OR, in
> which case we would get something like this:
>
> scan 'mattest', {COLUMNS => ['generic' AND 'photo']}
> ROW                                          COLUMN+CELL
>  1                                           column=generic:,
> timestamp=1294454057618, value=1
>  1                                           column=photo:,
> timestamp=1294453830339, value=1
>
> Thanks in advance.
>
> -Jack
>