HBase >> mail # user >> Question about filtering data


Re: Question about filtering data
Glad I was able to help.

bq. The result is a wished

I guess you meant 'The result is as wished'

On Wed, Feb 20, 2013 at 10:37 AM, Paul van Hoven <[EMAIL PROTECTED]> wrote:

> Thank you for your answer. I applied the following filter object:
>
> Scan scan = new Scan( startRow, endRow );
> Filter f = new SingleColumnValueFilter( Bytes.toBytes("CF"),
>         Bytes.toBytes("creativeId"), CompareOp.EQUAL, Bytes.toBytes("100") );
> scan.setFilter(f);
>
> The result is a wished:
> 215.132.144.196 , 100
> 111.209.213.26 , 100
> 56.90.211.104 , 100
> 232.141.206.11 , 100
> 110.73.138.136 , 100
>
>
> 2013/2/20 Ted Yu <[EMAIL PROTECTED]>:
> > Take a look at SingleColumnValueExcludeFilter:
> >
> >  * A {@link Filter} that checks a single column value, but does not emit the
> >  * tested column. This will enable a performance boost over
> >  * {@link SingleColumnValueFilter}, if the tested column value is not actually
> >  * needed as input (besides for the filtering itself).
> >
> > On Wed, Feb 20, 2013 at 10:06 AM, Paul van Hoven <[EMAIL PROTECTED]> wrote:
> >
> >> Suppose I had the following data in a table:
> >>
> >>
> >> hbase(main):007:0> scan 'ToyDataTable'
> >> ROW                                            COLUMN+CELL
> >>  \x01\x07\x0C\xF8C\xF2\xCAE\xE3\xD4\xEC|\x02\x column=CF:creativeId,
> >> timestamp=1361383021175, value=100
> >>  C5%Q\x04~\xF9\x1C#\xA4\xCEUG\xA8\x84:\xAD\xFB
> >>  n\xBDr\x81D\xEAX\x17\xBF\x0B\xF2k^\xA4\xF7\xC
> >>  9\xAE\x9F
> >>  \x01\x07\x0C\xF8C\xF2\xCAE\xE3\xD4\xEC|\x02\x column=CF:ip,
> >> timestamp=1361383021175, value=182.18.51.44
> >>  C5%Q\x04~\xF9\x1C#\xA4\xCEUG\xA8\x84:\xAD\xFB
> >>  n\xBDr\x81D\xEAX\x17\xBF\x0B\xF2k^\xA4\xF7\xC
> >>  9\xAE\x9F
> >>  \x01\x07\x0C\xF8C\xF2\xCAE\xE3\xD4\xEC|\x02\x column=CF:creativeId,
> >> timestamp=1361383021176, value=200
> >>  C5%Q\x04~\xF9\x1C#\xA4\xCEUG\xA8\x84:\xAD\xFB
> >>  n\xBD\xA8t\xA1\xA3\xBAk\xADc\xC2m\xCC&s21~
> >>  \x01\x07\x0C\xF8C\xF2\xCAE\xE3\xD4\xEC|\x02\x column=CF:ip,
> >> timestamp=1361383021176, value=62.57.51.42
> >>  C5%Q\x04~\xF9\x1C#\xA4\xCEUG\xA8\x84:\xAD\xFB
> >>  n\xBD\xA8t\xA1\xA3\xBAk\xADc\xC2m\xCC&s21~
> >>
> >> So the table looks something like this
> >> RowKey md5(timestamp) + md5(ipaddress)
> >> Column family name "CF"
> >> column qualifier names "ip" and "creativeId"
> >>
> >> So one row would be made out of
> >> ip = "192.193.32.1"
> >> creativeId = "100"
> >>
> >> Now I'd like to retrieve all the cell values for a given scan. In SQL
> >> I would do something like
> >>
> >> select * from ToyDataTable where creativeId = "100";
> >>
> >> In HBase I thought it would be possible to apply a ValueFilter object like
> >> this:
> >> Scan scan = new Scan( startRow, endRow );
> >> Filter f = new ValueFilter( CompareOp.EQUAL,
> >>         new BinaryPrefixComparator( Bytes.toBytes("100") ) );
> >> scan.setFilter(f);
> >>
> >> ResultScanner rs = toyDataTable.getScanner( scan );
> >> for( Result r : rs ) {
> >>         String ip = Bytes.toString( r.getValue( Bytes.toBytes("CF"),
> >>                 Bytes.toBytes("ip") ) );
> >>         String creativeId = Bytes.toString( r.getValue( Bytes.toBytes("CF"),
> >>                 Bytes.toBytes("creativeId") ) );
> >>         System.out.println( ip + " , " + creativeId );
> >> }
> >>
> >> But the actual result for this query looks like this:
> >>
> >> null , 100
> >> null , 100
> >> null , 100
> >> null , 100
> >> null , 100
> >> and so on
> >>
> >> I think I understand why the ip address is null in this case, since it
> >> is filtered out by the filter object. But I actually would like to
> >> retrieve the whole data of the row depending on the value of just one
> >> cell in my case.
> >>
> >> Is this possible?
> >>
>
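
The difference discussed in this thread (a filter that tests each cell on its own versus a filter where one column decides the fate of the whole row) can be sketched in plain Java. This is only a model under stated assumptions, not HBase code: the class FilterSemanticsSketch, its methods, and the map-based row representation are hypothetical and exist purely for illustration. The filterIfMissing flag loosely mirrors the setFilterIfMissing option of SingleColumnValueFilter, which controls whether rows lacking the tested column are still returned.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the two filtering behaviours; this is NOT the HBase API.
class FilterSemanticsSketch {

    // Each row is modelled as a map from column qualifier to cell value.
    static Map<String, String> row(String ip, String creativeId) {
        Map<String, String> r = new LinkedHashMap<>();
        r.put("ip", ip);
        r.put("creativeId", creativeId);
        return r;
    }

    // Two rows taken from the values shown in the shell output of the thread.
    static List<Map<String, String>> sampleRows() {
        List<Map<String, String>> rows = new ArrayList<>();
        rows.add(row("182.18.51.44", "100"));
        rows.add(row("62.57.51.42", "200"));
        return rows;
    }

    // Cell-level filtering (ValueFilter-like): every cell is tested on its own,
    // so a non-matching cell is dropped even when a sibling cell matches.
    static List<Map<String, String>> cellLevel(List<Map<String, String>> rows, String prefix) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> r : rows) {
            Map<String, String> kept = new LinkedHashMap<>();
            for (Map.Entry<String, String> cell : r.entrySet()) {
                if (cell.getValue().startsWith(prefix)) {
                    kept.put(cell.getKey(), cell.getValue());
                }
            }
            if (!kept.isEmpty()) {
                out.add(kept);
            }
        }
        return out;
    }

    // Row-level filtering (SingleColumnValueFilter-like): the value of one
    // column decides whether the complete row is kept or dropped.
    static List<Map<String, String>> rowLevel(List<Map<String, String>> rows,
                                              String column, String value,
                                              boolean filterIfMissing) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> r : rows) {
            String v = r.get(column);
            boolean keep = (v == null) ? !filterIfMissing : v.equals(value);
            if (keep) {
                out.add(new LinkedHashMap<>(r));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = sampleRows();
        // The ip cell does not start with "100", so it is missing from the result.
        for (Map<String, String> r : cellLevel(rows, "100")) {
            System.out.println(r.get("ip") + " , " + r.get("creativeId"));
        }
        // The whole row survives because its creativeId cell equals "100".
        for (Map<String, String> r : rowLevel(rows, "creativeId", "100", true)) {
            System.out.println(r.get("ip") + " , " + r.get("creativeId"));
        }
    }
}
```

In this model the cell-level variant reproduces the "null , 100" symptom from the original question, while the row-level variant returns both the ip and the creativeId of the matching row.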