Re: improve performance of a MapReduce job with HBase input
Hi,

You can make use of the 'setCaching' method of your Scan object.

E.g.:
Scan objScan = new Scan();
objScan.setCaching(100); // set it to some integer, as per your use case

http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setCaching(int)
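
If it helps, here is a minimal sketch of one way to carry that caching
setting into the MapReduce job: build the Scan up front and hand it to
TableMapReduceUtil.initTableMapperJob, which serializes the Scan into the
job configuration that TableInputFormat reads. The table name "mytable",
the mapper class MyMapper, and the value 500 are placeholders for your own
setup.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

Configuration conf = HBaseConfiguration.create();
Job job = new Job(conf, "scan-mytable");

Scan scan = new Scan();
scan.setCaching(500);          // rows fetched per scanner RPC; tune for your row size
scan.setCacheBlocks(false);    // usually advisable for full-table MapReduce scans

TableMapReduceUtil.initTableMapperJob(
    "mytable",                     // input table (placeholder name)
    scan,                          // the Scan carrying the caching setting
    MyMapper.class,                // your TableMapper subclass (placeholder)
    ImmutableBytesWritable.class,  // map output key
    Result.class,                  // map output value
    job);

With this approach the Scan you configure, caching included, is the one the
mappers actually use.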

thanks,
Alok

On Fri, May 25, 2012 at 11:33 PM, Ey-Chih chow <[EMAIL PROTECTED]> wrote:

> Hi,
>
> We have a MapReduce job whose input data comes from HBase.  We would like
> to improve the performance of the job.  According to the HBase book, we can
> do that by setting scan caching to a number higher than the default.  We use
> TableInputFormat to read data into the job.  I looked at the implementation
> of the class.  The class does not set caching when its Scan object is
> created.  Does anybody know how to externally set caching for the scan
> created in TableInputFormat?  Thanks.
>
> Ey-Chih Chow