HBase, mail # user - Question about filtering data


Question about filtering data
Paul van Hoven 2013-02-20, 18:06
Suppose I had the following data in a table:
hbase(main):007:0> scan 'ToyDataTable'
ROW                                            COLUMN+CELL
 \x01\x07\x0C\xF8C\xF2\xCAE\xE3\xD4\xEC|\x02\x column=CF:creativeId, timestamp=1361383021175, value=100
 C5%Q\x04~\xF9\x1C#\xA4\xCEUG\xA8\x84:\xAD\xFB
 n\xBDr\x81D\xEAX\x17\xBF\x0B\xF2k^\xA4\xF7\xC
 9\xAE\x9F
 \x01\x07\x0C\xF8C\xF2\xCAE\xE3\xD4\xEC|\x02\x column=CF:ip, timestamp=1361383021175, value=182.18.51.44
 C5%Q\x04~\xF9\x1C#\xA4\xCEUG\xA8\x84:\xAD\xFB
 n\xBDr\x81D\xEAX\x17\xBF\x0B\xF2k^\xA4\xF7\xC
 9\xAE\x9F
 \x01\x07\x0C\xF8C\xF2\xCAE\xE3\xD4\xEC|\x02\x column=CF:creativeId, timestamp=1361383021176, value=200
 C5%Q\x04~\xF9\x1C#\xA4\xCEUG\xA8\x84:\xAD\xFB
 n\xBD\xA8t\xA1\xA3\xBAk\xADc\xC2m\xCC&s21~
 \x01\x07\x0C\xF8C\xF2\xCAE\xE3\xD4\xEC|\x02\x column=CF:ip, timestamp=1361383021176, value=62.57.51.42
 C5%Q\x04~\xF9\x1C#\xA4\xCEUG\xA8\x84:\xAD\xFB
 n\xBD\xA8t\xA1\xA3\xBAk\xADc\xC2m\xCC&s21~

So the table looks something like this:
Row key: md5(timestamp) + md5(ipaddress)
Column family name: "CF"
Column qualifier names: "ip" and "creativeId"

So one row would be made out of
ip = "192.193.32.1"
creativeId = "100"
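For illustration, a row key of the shape described above could be built roughly as follows. This is only a sketch under assumptions: the class and method names (RowKeySketch, md5, rowKey) are hypothetical, and it assumes the timestamp and the IP address are hashed as UTF-8 strings and that the two 16-byte MD5 digests are simply concatenated. The actual encoding used for ToyDataTable may differ.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of HBase: builds md5(timestamp) + md5(ipaddress).
final class RowKeySketch {

    // MD5 digest (16 bytes) of the UTF-8 bytes of a string.
    static byte[] md5(String s) {
        try {
            return MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Assumption: the timestamp is hashed in its decimal string form.
    static byte[] rowKey(long timestampMillis, String ip) {
        byte[] ts = md5(Long.toString(timestampMillis));
        byte[] addr = md5(ip);
        byte[] key = new byte[ts.length + addr.length];
        System.arraycopy(ts, 0, key, 0, ts.length);
        System.arraycopy(addr, 0, key, ts.length, addr.length);
        return key;
    }

    public static void main(String[] args) {
        // Example inputs only.
        byte[] key = rowKey(1361383021175L, "192.193.32.1");
        System.out.println(key.length); // two 16-byte digests
    }
}
```

With a key like this, rows written with the same timestamp share the first 16 bytes of the key and differ only in the IP part.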

Now I'd like to retrieve all the cell values for a given scan. In SQL
I would do something like

select * from ToyDataTable where creativeId = "100";

In HBase I thought it would be possible to apply a ValueFilter object like this:
Scan scan = new Scan( startRow, endRow );
Filter f = new ValueFilter( CompareOp.EQUAL, new BinaryPrefixComparator( Bytes.toBytes("100") ) );
scan.setFilter(f);

ResultScanner rs = toyDataTable.getScanner( scan );
for( Result r : rs ) {
    String ip = Bytes.toString( r.getValue( Bytes.toBytes("CF"), Bytes.toBytes("ip") ) );
    String creativeId = Bytes.toString( r.getValue( Bytes.toBytes("CF"), Bytes.toBytes("creativeId") ) );
    System.out.println( ip + " , " + creativeId );
}

But the actual result for this query looks like this:

null , 100
null , 100
null , 100
null , 100
null , 100
and so on

I think I understand why the ip address is null in this case: the
filter is applied to each cell, so the ip cell is filtered out because
its value does not match. But what I actually would like is to
retrieve all the data of a row depending on the value of just one
cell.

Is this possible?
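
To make the difference concrete, here is a small in-memory sketch of the two behaviours. It is a toy model in plain Java, not the HBase API: the class FilterSemanticsSketch and its methods are hypothetical, and a row is modelled as a map from column qualifier to value. filterCells mimics a filter that tests every cell on its own (as the ValueFilter above appears to do), while filterRows lets one column decide whether the whole row is returned. HBase ships a SingleColumnValueFilter class intended for the second kind of test; the sketch does not call it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: a row is a map from column qualifier to cell value.
final class FilterSemanticsSketch {

    // Cell-level filtering: each cell is tested on its own, non-matching
    // cells are dropped, and rows left without any cell disappear.
    static List<Map<String, String>> filterCells(List<Map<String, String>> rows, String prefix) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> row : rows) {
            Map<String, String> kept = new LinkedHashMap<>();
            for (Map.Entry<String, String> cell : row.entrySet()) {
                if (cell.getValue().startsWith(prefix)) {
                    kept.put(cell.getKey(), cell.getValue());
                }
            }
            if (!kept.isEmpty()) {
                out.add(kept);
            }
        }
        return out;
    }

    // Row-level filtering: one column decides, and a matching row is
    // returned with all of its cells.
    static List<Map<String, String>> filterRows(List<Map<String, String>> rows,
                                                String qualifier, String value) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> row : rows) {
            if (value.equals(row.get(qualifier))) {
                out.add(row);
            }
        }
        return out;
    }

    // The two rows visible in the scan output of the question.
    static List<Map<String, String>> sampleRows() {
        List<Map<String, String>> rows = new ArrayList<>();
        Map<String, String> first = new LinkedHashMap<>();
        first.put("creativeId", "100");
        first.put("ip", "182.18.51.44");
        rows.add(first);
        Map<String, String> second = new LinkedHashMap<>();
        second.put("creativeId", "200");
        second.put("ip", "62.57.51.42");
        rows.add(second);
        return rows;
    }

    public static void main(String[] args) {
        for (Map<String, String> r : filterCells(sampleRows(), "100")) {
            System.out.println(r.get("ip") + " , " + r.get("creativeId"));
        }
        for (Map<String, String> r : filterRows(sampleRows(), "creativeId", "100")) {
            System.out.println(r.get("ip") + " , " + r.get("creativeId"));
        }
    }
}
```

In this model the cell-level variant returns the matching creativeId cell without its ip, which matches the "null , 100" lines above, while the row-level variant returns the row with both cells. Note also that a prefix test applied to every cell would keep an ip value that happens to start with "100".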
+
Ted Yu 2013-02-20, 18:11
+
Paul van Hoven 2013-02-20, 18:37
+
Ted Yu 2013-02-20, 18:42