HBase user mailing list: speeding up rowcount


Re: speeding up rowcount
Since the MapReduce job runs as a separate batch process, try it with a high Scan cache (scanner caching) value.

http://hbase.apache.org/book.html#perf.hbase.client.caching

Himanshu
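
For illustration, a minimal sketch of a counting job with scanner caching raised on the Scan. The table name "mytable", the caching value of 1000, and the CachedRowCount/CountMapper class names are placeholders to adjust for your setup; the bundled RowCounter should instead pick up its caching default from hbase.client.scanner.caching in the job configuration.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

    public class CachedRowCount {

      // Counts rows by bumping a job counter; emits no map output.
      static class CountMapper extends TableMapper<NullWritable, NullWritable> {
        @Override
        protected void map(ImmutableBytesWritable key, Result value, Context context) {
          context.getCounter("rowcount", "ROWS").increment(1);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "cached-rowcount");
        job.setJarByClass(CachedRowCount.class);

        Scan scan = new Scan();
        scan.setCaching(1000);       // fetch this many rows per RPC instead of the low default
        scan.setCacheBlocks(false);  // a full scan should not churn the block cache

        TableMapReduceUtil.initTableMapperJob("mytable", scan, CountMapper.class,
            NullWritable.class, NullWritable.class, job);
        job.setOutputFormatClass(NullOutputFormat.class);  // no output files, only counters
        job.setNumReduceTasks(0);

        // The total shows up in the job's "rowcount:ROWS" counter.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }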

On Sun, Oct 9, 2011 at 9:09 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> I guess your hbase.hregion.max.filesize is quite high.
> If possible, lower its value so that you have smaller regions (see the hbase-site.xml sketch after the quoted thread).
>
> On Sun, Oct 9, 2011 at 7:50 AM, Rita <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> I have been doing a rowcount via MapReduce and it's taking about 4-5 hours
>> to count 500 million rows in a table. I was wondering if there are any
>> MapReduce tunings I can do so it will go much faster.
>>
>> I have a 10 node cluster, each node with 8 CPUs and 64GB of memory. Any
>> tuning advice would be much appreciated.
>>
>>
>> --
>> --- Get your facts first, then you can distort them as you please.--
>>
>
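To act on the region-size suggestion above: hbase.hregion.max.filesize is set in hbase-site.xml, and TableInputFormat creates one map task per region, so smaller regions give the counting job more map tasks to spread across the cluster. A sketch, with a purely illustrative 512MB value:

    <property>
      <name>hbase.hregion.max.filesize</name>
      <!-- maximum store file size per region, in bytes;
           536870912 = 512MB, shown only as an illustration -->
      <value>536870912</value>
    </property>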