Re: improve performance of a MapReduce job with HBase input
Hi,

you can make use of the 'setCaching' method of your Scan object.

E.g.:
Scan objScan = new Scan();
objScan.setCaching(100); // set this to an integer that suits your use case

http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setCaching(int)
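
In a MapReduce job, the usual place to set this is on the Scan you hand to
TableMapReduceUtil, which serializes it into the job configuration so that
TableInputFormat picks it up, caching value included. Below is a minimal
sketch; the table name "my_table", the MyMapper class, and the caching value
of 500 are placeholders for your own job (larger caching means fewer RPC
round trips but more memory held per scanner, so tune it to your row size).

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class CachingJobSetup {

  // Placeholder mapper; substitute your own map logic.
  static class MyMapper extends TableMapper<ImmutableBytesWritable, Result> {
    @Override
    protected void map(ImmutableBytesWritable key, Result value, Context context)
        throws IOException, InterruptedException {
      context.write(key, value);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "hbase-input-job");
    job.setJarByClass(CachingJobSetup.class);

    Scan scan = new Scan();
    scan.setCaching(500);        // rows fetched per RPC to the region server
    scan.setCacheBlocks(false);  // usually recommended for full-table MR scans

    // Serializes the Scan (including the caching value) into the job
    // configuration, where TableInputFormat reads it back.
    TableMapReduceUtil.initTableMapperJob(
        "my_table",                    // input table (placeholder)
        scan,
        MyMapper.class,
        ImmutableBytesWritable.class,  // mapper output key
        Result.class,                  // mapper output value
        job);

    job.setNumReduceTasks(0);          // map-only for this sketch
    job.setOutputFormatClass(NullOutputFormat.class);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}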

thanks,
Alok

On Fri, May 25, 2012 at 11:33 PM, Ey-Chih chow <[EMAIL PROTECTED]> wrote:

> Hi,
>
> We have a MapReduce job whose input data comes from HBase.  We would like
> to improve the performance of the job.  According to the HBase book, we can
> do that by setting the scan caching to a number higher than the default.
> We use TableInputFormat to read data for the job.  I looked at the
> implementation of the class.  The class does not set caching when the scan
> object is created.  Does anybody know how to externally set caching for the
> scan created in TableInputFormat?  Thanks.
>
> Ey-Chih Chow
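
On the "externally" part of the question: TableInputFormat also builds its
Scan from plain configuration properties, so the caching can be supplied
through the job Configuration without changing any code that constructs the
Scan. This is a sketch under the assumption that your HBase version exposes
TableInputFormat.SCAN_CACHEDROWS (the "hbase.mapreduce.scan.cachedrows" key);
"my_table" is again a placeholder.

// Same imports and job skeleton as the sketch above; only the configuration differs.
Configuration conf = HBaseConfiguration.create();
conf.set(TableInputFormat.INPUT_TABLE, "my_table");  // input table (placeholder)
conf.set(TableInputFormat.SCAN_CACHEDROWS, "500");   // caching for the Scan TableInputFormat builds
// or raise the client-wide default instead of the per-job value:
// conf.setInt("hbase.client.scanner.caching", 500);

Job job = new Job(conf, "hbase-input-job");
job.setInputFormatClass(TableInputFormat.class);
// ...set the mapper, key/value classes, and output format as usual...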