HBase >> mail # user >> Scan startRow/stopRow vs. filter


Re: Scan startRow/stopRow vs. filter
According to http://hbase.apache.org/book.html#client.filter.row, in
general it is preferable to use start/stopRow rather than RowFilter.

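I believe with a RowFilter you would be doing a full table scan: the
filter is tested against every row the scan reads, whereas start/stopRow
limits which rows are read in the first place.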
--Suraj
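
To make the difference concrete, here is a minimal sketch that models a table as a sorted in-memory map. It is only an analogy, not HBase code: the class RangeScanSketch, its methods, and the sample row keys are all hypothetical, and String ordering stands in for HBase's byte-wise row key ordering (the two agree for ASCII keys). It compares how many rows each approach has to read to return the same result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration only: a TreeMap stands in for a sorted HBase table.
class RangeScanSketch {

    /** Matching row keys plus the number of rows that had to be read. */
    static final class ScanResult {
        final List<String> rows;
        final int rowsExamined;

        ScanResult(List<String> rows, int rowsExamined) {
            this.rows = rows;
            this.rowsExamined = rowsExamined;
        }
    }

    /** Builds a small sample table with keys row-001 .. row-010. */
    static TreeMap<String, String> sampleTable() {
        TreeMap<String, String> table = new TreeMap<>();
        for (int i = 1; i <= 10; i++) {
            table.put(String.format("row-%03d", i), "value-" + i);
        }
        return table;
    }

    /** Analogue of a RowFilter with GREATER_OR_EQUAL: read every row, test each key. */
    static ScanResult scanWithFilter(TreeMap<String, String> table, String gte) {
        List<String> rows = new ArrayList<>();
        int examined = 0;
        for (String key : table.keySet()) {
            examined++;
            if (key.compareTo(gte) >= 0) {
                rows.add(key);
            }
        }
        return new ScanResult(rows, examined);
    }

    /** Analogue of a start row: seek to the first key >= startRow and read from there. */
    static ScanResult scanWithStartRow(TreeMap<String, String> table, String startRow) {
        List<String> rows = new ArrayList<>();
        int examined = 0;
        for (String key : table.tailMap(startRow, true).keySet()) {
            examined++;
            rows.add(key);
        }
        return new ScanResult(rows, examined);
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = sampleTable();
        ScanResult filtered = scanWithFilter(table, "row-008");
        ScanResult ranged = scanWithStartRow(table, "row-008");
        System.out.println(filtered.rows.equals(ranged.rows)); // prints "true"
        System.out.println(filtered.rowsExamined + " vs " + ranged.rowsExamined); // prints "10 vs 3"
    }
}
```

In this toy model both approaches return the same rows, but the filter version reads all ten rows while the start-row version reads only the three in range. On a real table the gap grows with the number of rows that sort before the start key.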

On Thu, Mar 15, 2012 at 11:48 AM, Andy Lindeman <[EMAIL PROTECTED]> wrote:
> Hi all--
>
> I was reading the source code for Pig HBaseStorage loadfunc/storefunc recently.
>
> It accepts arguments such as -gte and -lt for scanning ranges of rows;
> however, it implements them by adding a RowFilter. Something that
> basically boils down to ...
>
>    scan = new Scan();
>    gte_ = Bytes.toBytesBinary(Utils.slashisize(configuredOptions_.getOptionValue("gte")));
>    scan.setFilter(new RowFilter(CompareOp.GREATER_OR_EQUAL, new
> BinaryComparator(gte_)));
>
> How does this compare (in terms of equivalence and performance) to
> setting startRow on Scan .. such as ..
>
>    scan = new Scan();
>    scan.setStartRow(Bytes.toBytesBinary(Utils.slashisize(configuredOptions_.getOptionValue("gte"))));
>
> Thanks.
>
> --
> Andy Lindeman
> http://www.andylindeman.com/