-
How to speed up Scan -- please help
Jian Lu 2010-09-13, 16:54
Hi All,
I am pulling my hair out here because a Scan takes 40 sec to scan 750,000 records. I am running HBase 0.20.4 in standalone mode on Linux with 16 GB RAM and a 64-bit CPU, operating system, and JVM. My table is very simple, with one family and five columns; each column contains a very small String (employee name, address, etc.).
Below is my configuration. Could you all please tell me what I missed?
config.set("hbase.client.pause", "10000");
config.set("hbase.client.retries.number", "100");
config.set("hbase.client.scanner.caching", "10000");

table.setScannerCaching(10000);

scan.setCaching(10000);
scan.setCacheBlocks(false);

Thanks a lot!
Jack.
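For a rough sense of scale, the figures in this post can be turned into a throughput and round-trip estimate. The sketch below uses only the numbers stated above (750,000 rows, 40 sec, caching of 10,000) and assumes the scanner fetches about `caching` rows per round trip; the class and method names are made up for illustration and are not part of HBase.

```java
public class ScanEstimate {
    /** Approximate number of scanner round trips needed to fetch all rows (ceiling division). */
    static long roundTrips(long rows, long caching) {
        return (rows + caching - 1) / caching;
    }

    /** Average number of rows returned per second. */
    static double rowsPerSecond(long rows, double seconds) {
        return rows / seconds;
    }

    public static void main(String[] args) {
        long rows = 750_000;    // from the post
        double seconds = 40.0;  // from the post
        long caching = 10_000;  // from the post

        System.out.println("round trips: " + roundTrips(rows, caching));
        System.out.println("rows/sec: " + rowsPerSecond(rows, seconds));
    }
}
```

With only about 75 round trips for the whole scan, the per-call overhead is probably not where most of the 40 sec goes, so raising the caching value further is unlikely to help much.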
-
Re: How to speed up Scan -- please help
Andrey Stepachev 2010-09-13, 18:00
What does "vmstat 1" show? Is gzip compression used on the family?
-
RE: How to speed up Scan -- please help
Jian Lu 2010-09-13, 18:07
How can I tell if I'm using a gzip family?
Below is the output from vmstat 1:
procs ------------memory------------ --swap- --io--- --system-- --------cpu---------
 r  b  swpd    free    buff    cache  si  so  bi  bo    in   cs  us  sy   id  wa  st
 0  0     0  489068  153664  2789984   0   0   0   1     0    0   0   0  100   0   0
 0  0     0  489068  153664  2789984   0   0   0   0  1049  155   0   0  100   0   0
 0  0     0  489068  153664  2789984   0   0   0   0  1019  152   0   0  100   0   0
 0  0     0  489068  153664  2789984   0   0   0   0  1047  182   0   0  100   0   0
 0  0     0  489068  153664  2789984   0   0   0  96  1035  176   0   0   90  10   0

Jack.
-
Re: How to speed up Scan -- please help
Andrey Stepachev 2010-09-13, 19:06
vmstat should be taken while you are scanning. It can show you what your host is doing and where your bottleneck is: in CPU or in disk I/O.
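As a minimal sketch of how to read such a sample, the Java snippet below classifies one vmstat data line by comparing CPU time (us + sy) against I/O wait (wa). The class name, the method, and the 50% threshold are assumptions chosen for illustration, and the column positions assume the default layout shown earlier in this thread.

```java
public class VmstatBottleneck {
    // Column positions in a default vmstat data line:
    // r b swpd free buff cache si so bi bo in cs us sy id wa st
    private static final int US = 12, SY = 13, WA = 15;

    /** Classifies one vmstat data line; busyThreshold is an illustrative cutoff in percent. */
    static String classify(String line, int busyThreshold) {
        String[] f = line.trim().split("\\s+");
        if (f.length < 16) {
            throw new IllegalArgumentException("expected at least 16 columns: " + line);
        }
        int cpu = Integer.parseInt(f[US]) + Integer.parseInt(f[SY]);
        int ioWait = Integer.parseInt(f[WA]);
        if (ioWait >= busyThreshold && ioWait >= cpu) return "disk I/O";
        if (cpu >= busyThreshold) return "CPU";
        return "idle";
    }

    public static void main(String[] args) {
        // Last sample posted in this thread: 90% idle, 10% I/O wait.
        String posted = "0 0 0 489068 153664 2789984 0 0 0 96 1035 176 0 0 90 10 0";
        System.out.println(classify(posted, 50));
    }
}
```

A sample that classifies as idle, like the one posted here, suggests it was not captured during the scan, so it says little about where the scan spends its time.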
Compression status can be found in the hbase shell by issuing describe 'tablename' and looking at the COMPRESSION parameter, or you can look at the description of the table in the HBase web interface.