HBase, mail # user - Hadoop-HBase table hierarchical column scan


Re: Hadoop-HBase table hierarchical column scan
Kiru Pakkirisamy 2013-08-10, 05:12
Lars,
We could have anywhere between 1,000 and 40,000 columns in there.
I have set up the Bloom filter on this column family as 'rowcol'. There are no writes at all in this app.
Maybe the Bloom filter is giving too many false positives (my columns are strings like T_123, T_34567).
Or the Bloom filter lookup on 50-60 prefixes is costlier than building a map of all the columns and looking them up against another map.
(Maybe this is what is happening, and I should not be using this filtering.)
For now, I am going to make tables with composite keys.
I will profile and debug this later.
 
Regards,
- kiru
Kiru Pakkirisamy | webcloudtech.wordpress.com
________________________________
 From: lars hofhansl <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Friday, August 9, 2013 9:38 PM
Subject: Re: Hadoop-HBase table hierarchical column scan
 

Whether the skip-scanning that the filter does is beneficial depends on how many other columns you have.
It should not worsen performance, though. If it does, we should do some profiling and find out why.

-- Lars

________________________________
From: Kiru Pakkirisamy <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; lars hofhansl <[EMAIL PROTECTED]>; Kiru Pakkirisamy <[EMAIL PROTECTED]>
Sent: Friday, August 9, 2013 8:40 PM
Subject: Re: Hadoop-HBase table hierarchical column scan
I can confirm that, even after trying 0.94.10, MultipleColumnPrefixFilter only worsens performance.
 
Regards,
- kiru
Kiru Pakkirisamy | webcloudtech.wordpress.com
________________________________
From: Kiru Pakkirisamy <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; lars hofhansl <[EMAIL PROTECTED]>
Sent: Friday, August 9, 2013 1:02 PM
Subject: Re: Hadoop-HBase table hierarchical column scan
The prefix filters did not work for me. Actually, performance went down. But I am going to try again with the fix for HBASE-6870 (suggested by Ted) deployed to our performance cluster.
 
Regards,
- kiru
Kiru Pakkirisamy | webcloudtech.wordpress.com
________________________________
From: lars hofhansl <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Friday, August 9, 2013 12:55 PM
Subject: Re: Hadoop-HBase table hierarchical column scan
Take a look at ColumnRangeFilter, (probably better in your case) ColumnPrefixFilter, or MultipleColumnPrefixFilter.
The latter two in particular let you filter efficiently on column prefixes.

Note that if you typically scan a subset of the columns, placing these prefixes into the row key will be more efficient, as the scanner can then avoid a full scan.

-- Lars
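
The row-key suggestion above rests on sorted storage: when keys are kept in sorted order, everything sharing a prefix forms one contiguous range, so a reader can seek to that range rather than test every entry. A minimal sketch of the idea follows, with a plain java.util.TreeMap standing in for the sorted store. This is not HBase code; the class PrefixSeekSketch, its methods, and the S_ and U_ sample qualifiers are hypothetical (the T_ qualifiers follow the examples mentioned in this thread).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for sorted keys (row keys or column qualifiers); not HBase code.
class PrefixSeekSketch {

    // Smallest string that sorts after every string starting with `prefix`.
    // Assumes the prefix is non-empty and its last char is not '\uffff'.
    static String upperBound(String prefix) {
        int last = prefix.length() - 1;
        return prefix.substring(0, last) + (char) (prefix.charAt(last) + 1);
    }

    // Collect the values whose key starts with `prefix` by jumping to the
    // contiguous range [prefix, upperBound(prefix)) instead of testing every key.
    static List<String> valuesWithPrefix(NavigableMap<String, String> sorted, String prefix) {
        return new ArrayList<>(sorted.subMap(prefix, true, upperBound(prefix), false).values());
    }

    public static void main(String[] args) {
        NavigableMap<String, String> sorted = new TreeMap<>();
        sorted.put("S_1", "a");
        sorted.put("T_123", "b");
        sorted.put("T_34567", "c");
        sorted.put("U_9", "d");
        System.out.println(valuesWithPrefix(sorted, "T_")); // prints [b, c]
    }
}
```

How much a seek like this saves depends on how many non-matching keys lie between the ranges of interest, which is consistent with the later remark in this thread that the benefit depends on how many other columns there are.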
________________________________
From: Narlin M <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Friday, August 9, 2013 12:44 PM
Subject: Hadoop-HBase table hierarchical column scan
I am fairly new to the hadoop-hbase environment having started working on
it very recently, so I hope I am wording the question correctly.

I am trying to read data from a hadoop-hbase table which has only one
column family named 'DFLT'. This family contains hierarchical column
qualifiers "/source:int64/name:string". I want to read the name column for
a particular source value, say 10. How can I achieve this using the Scan
class?

I tried setting up the scan object as follows:

...

byte[] family = Bytes.toBytes("DFLT");
byte[] qualifier = Bytes.toBytes("source:name");

Scan scan = new Scan();
scan.addColumn(family, qualifier);

FilterList list = new FilterList(FilterList.Operator.MUST_PASS_ALL);

SingleColumnValueFilter filter = new SingleColumnValueFilter(family,
    Bytes.toBytes("source"), CompareFilter.CompareOp.EQUAL, Bytes.toBytes(10));

list.addFilter(filter);

scan.setFilter(list);

...
But I do not get any data back with this setup. I am guessing that I am not
setting up the hierarchical qualifiers correctly. Any and all pointers will
be appreciated.

Thanks, Narlin M.
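
One thing that may be worth checking in the snippet above, assuming the source values really are stored as 8-byte longs (the int64 in the qualifier suggests this, but the thread does not confirm it): Bytes.toBytes(10) takes an int and produces 4 bytes, so an EQUAL comparison against an 8-byte stored value could not match. The sketch below uses java.nio.ByteBuffer as a stand-in for the encoding only; the class and method names are hypothetical, and no HBase classes are involved.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Stand-in for the value encoding only; no HBase classes are used here.
class SourceValueEncodingSketch {

    // 4-byte big-endian encoding of an int.
    static byte[] encodeInt(int v) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(v).array();
    }

    // 8-byte big-endian encoding of a long.
    static byte[] encodeLong(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    public static void main(String[] args) {
        byte[] asInt = encodeInt(10);
        byte[] asLong = encodeLong(10L);
        // The same number yields byte arrays of different lengths, so a
        // byte-for-byte equality check between them fails.
        System.out.println(asInt.length + " bytes vs " + asLong.length + " bytes"); // prints 4 bytes vs 8 bytes
        System.out.println("equal: " + Arrays.equals(asInt, asLong));               // prints equal: false
    }
}
```

If that assumption holds, passing a long literal, as in Bytes.toBytes(10L), would produce the 8-byte form to compare against.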