HBase >> mail # user >> Scan startRow/stopRow vs. filter


Re: Scan startRow/stopRow vs. filter
According to http://hbase.apache.org/book.html#client.filter.row, in
general it is preferable to use start/stopRow rather than RowFilter.

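I believe that with only a RowFilter you would be doing a full table scan:
the scan still starts at the first row, and the filter is evaluated against
every row. With startRow/stopRow, the scan seeks directly to the range.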
--Suraj
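
To illustrate the difference, here is a minimal sketch in plain Java that does not use the HBase client at all. It assumes a `TreeMap` as a stand-in for a table sorted by row key; `ScanRangeSketch`, `scanWithFilter`, `scanFromStartRow`, `sampleTable`, and the `row-NN` keys are hypothetical names made up for this example. Both approaches return the same rows, but the filter-style scan examines every row, while the start-row-style scan only touches rows from the start key onward.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class ScanRangeSketch {

    /** Outcome of a simulated scan: matching row keys plus the number of rows examined. */
    static final class ScanResult {
        final List<String> rows = new ArrayList<>();
        int examined = 0;
    }

    /** Filter-style scan: walks the whole table and keeps keys >= startRow. */
    static ScanResult scanWithFilter(TreeMap<String, String> table, String startRow) {
        ScanResult result = new ScanResult();
        for (String key : table.keySet()) {
            result.examined++;                      // every row is read and tested
            if (key.compareTo(startRow) >= 0) {
                result.rows.add(key);
            }
        }
        return result;
    }

    /** Start-row-style scan: seeks to the first key >= startRow and reads from there. */
    static ScanResult scanFromStartRow(TreeMap<String, String> table, String startRow) {
        ScanResult result = new ScanResult();
        for (String key : table.tailMap(startRow, true).keySet()) {
            result.examined++;                      // only rows inside the range are read
            result.rows.add(key);
        }
        return result;
    }

    /** Builds a toy table with keys row-00, row-01, ... (ASCII keys, so String order matches byte order). */
    static TreeMap<String, String> sampleTable(int rowCount) {
        TreeMap<String, String> table = new TreeMap<>();
        for (int i = 0; i < rowCount; i++) {
            table.put(String.format("row-%02d", i), "value-" + i);
        }
        return table;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = sampleTable(10);
        ScanResult filtered = scanWithFilter(table, "row-07");
        ScanResult ranged = scanFromStartRow(table, "row-07");
        System.out.println("filter:   rows=" + filtered.rows + " examined=" + filtered.examined);
        System.out.println("startRow: rows=" + ranged.rows + " examined=" + ranged.examined);
    }
}
```

With 10 rows and a start key of `row-07`, both methods return the same 3 rows; the filter version examines all 10 rows and the start-row version examines only 3. This is only an analogy for the behaviour described above, not a measurement of HBase itself.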

On Thu, Mar 15, 2012 at 11:48 AM, Andy Lindeman <[EMAIL PROTECTED]> wrote:
> Hi all--
>
> I was reading the source code for Pig HBaseStorage loadfunc/storefunc recently.
>
> It accepts arguments such as -gte and -lt for scanning ranges of rows;
> however, it implements them by adding a RowFilter. Something that
> basically boils down to ...
>
>    scan = new Scan();
>    gte_ = Bytes.toBytesBinary(Utils.slashisize(configuredOptions_.getOptionValue("gte")));
>    scan.setFilter(new RowFilter(CompareOp.GREATER_OR_EQUAL, new BinaryComparator(gte_)));
>
> How does this compare (in terms of equivalence and performance) to
> setting startRow on the Scan, such as:
>
>    scan = new Scan();
>    scan.setStartRow(Bytes.toBytesBinary(Utils.slashisize(configuredOptions_.getOptionValue("gte"))));
>
> Thanks.
>
> --
> Andy Lindeman
> http://www.andylindeman.com/