|
Jonathan Bishop
2012-10-12, 00:08
Pankaj Misra
2012-10-12, 03:23
Jonathan Bishop
2012-10-12, 06:34
Pankaj Misra
2012-10-12, 06:47
Suraj Varma
2012-10-12, 07:34
Jonathan Bishop
2012-10-12, 19:07
Kevin O'dell
2012-10-12, 13:44
Jonathan Bishop
2012-10-12, 19:15
Bryan Beaudreault
2012-10-12, 19:46
Suraj Varma
2012-10-13, 02:30
Suraj Varma
2012-10-13, 02:49
Jonathan Bishop
2012-10-13, 15:55
Jonathan Bishop
2012-10-13, 15:58
Matt Corgan
2012-10-14, 05:37
Jonathan Bishop
2012-10-14, 15:48
lars hofhansl
2012-10-15, 01:03
Michel Segel
2012-10-15, 11:41
Matt Corgan
2012-10-15, 17:23
Matt Corgan
2012-10-15, 00:48
Jonathan Bishop
2012-10-15, 02:42
|
-
more regionservers does not improve performance Jonathan Bishop 2012-10-12, 00:08
Hi,

I am running an MR job with 40 simultaneous mappers, each of which does puts to HBase. I have ganged up the puts into groups of 1000 (this seems to help quite a bit), made sure that the table is pre-split into 100 regions, and randomized the row keys using MD5 hashing.

My cluster size is 10, and I am allowing 4 mappers per tasktracker.

In my MR job I know that the mappers are able to generate puts much faster than HBase can handle them. In other words, if I let the mappers run without doing HBase puts, then everything scales as you would expect with the number of mappers created. It is the HBase puts which seem to be the bottleneck.

What is strange is that I do not get much run time improvement by increasing the number of regionservers beyond about 4. Indeed, it seems that the system runs slower with 8 regionservers than with 4.

I have added the following in hbase-env.sh hoping this would help... (from the book HBase in Action)

export HBASE_OPTS="-Xmx8g"
export HBASE_REGIONSERVER_OPTS="-Xmx8g -Xms8g -Xmn128m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70"

# Uncomment below to enable java garbage collection logging in the .out file.
export HBASE_OPTS="${HBASE_OPTS} -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:${HBASE_HOME}/logs/gc-hbase.log"

Monitoring HBase through the web UI, I see that there are pauses for flushing, which seems to run pretty quickly, and for compacting, which seems to take somewhat longer.

Any advice for making this run faster would be greatly appreciated. Currently I am looking into installing Ganglia to better monitor my cluster, but have yet to get that running.

I suspect an I/O issue, as the regionservers do not seem terribly loaded.

Thanks,
Jon
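One way to sanity-check the "MD5 keys over 100 pre-split regions" setup described above is to hash a batch of keys offline and count how many land in each region. The following is a minimal sketch using only the JDK, not Jon's actual job: the class name, the "row-N" key format, and the rule that split points divide the first two digest bytes evenly are all assumptions made for illustration, since the thread does not say how the table was split.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5RegionSpread {
    // Hypothetical mapping: treat the first two bytes of the MD5 digest as an
    // unsigned 16-bit prefix and divide that prefix space evenly into `regions`
    // buckets. Real split points depend on how the table was created.
    static int bucketFor(String rawKey, int regions) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(rawKey.getBytes(StandardCharsets.UTF_8));
            int prefix = ((digest[0] & 0xFF) << 8) | (digest[1] & 0xFF); // 0..65535
            return prefix * regions / 65536;
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    // Counts how many of n sequential keys ("row-0", "row-1", ...) fall into each bucket.
    static int[] histogram(int n, int regions) {
        int[] counts = new int[regions];
        for (int i = 0; i < n; i++) {
            counts[bucketFor("row-" + i, regions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = histogram(100_000, 100);
        int min = Integer.MAX_VALUE;
        int max = 0;
        for (int c : counts) {
            min = Math.min(min, c);
            max = Math.max(max, c);
        }
        // A small gap between min and max suggests the keys themselves are not the hotspot.
        System.out.println("min=" + min + " max=" + max);
    }
}
```

If min and max come out close to each other, key distribution is probably not what limits throughput, which is consistent with the even load Jon reports in the web UI later in the thread.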
-
RE: more regionservers does not improve performance Pankaj Misra 2012-10-12, 03:23
Hi Jonathan,

It seems to me that, while the work is split across all 40 mappers, the keys are not randomized enough to leverage multiple regions and the pre-split strategy. This may be happening because all 40 mappers are writing to a single region for some time, making it a hot region, until the keys fall into another region, which then becomes the hot region. Hence you may be seeing a high impact of compaction cycles reducing your throughput.

Are the keys incremental? Are the keys randomized enough across the splits?

Ideally, when all 40 mappers are running, you should see all the regions being filled in parallel for maximum throughput. Hope it helps.

Thanks and Regards
Pankaj Misra
-
RE: more regionservers does not improve performance Jonathan Bishop 2012-10-12, 06:34
Pankaj,

Thanks for the reply.

Actually, I am using MD5 hashing to evenly spread the keys among the splits, so I don’t believe there is any hotspot. In fact, when I monitor the web UI for HBase I see a very even load on all the regionservers.

Jon
-
RE: more regionservers does not improve performance Pankaj Misra 2012-10-12, 06:47
OK, looks like I missed that part in your original mail. Did you try some of the compaction tweaks and configurations explained at the following link for your data?
http://hbase.apache.org/book/regions.arch.html#compaction

Also, how much data are you putting into the regions, and how big is one region at the end of data ingestion?

Thanks and Regards
Pankaj Misra
-
Re: more regionservers does not improve performance Suraj Varma 2012-10-12, 07:34
What have you configured for hbase.hstore.blockingStoreFiles and hbase.hregion.memstore.block.multiplier? Both of these block updates when the limit is hit. Try increasing these to, say, 20 and 4 from the defaults of 7 and 2 and see if it helps.

If this still doesn't help, see if you can set up Ganglia to get better insight into what is bottlenecking.
--Suraj
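To make the effect of hbase.hregion.memstore.block.multiplier concrete: a region stops accepting updates once its memstore reaches the flush size times the multiplier. The sketch below only does that arithmetic. The 128 MB flush size is an assumption (the thread never states the value of hbase.hregion.memstore.flush.size on Jon's cluster), and the class and method names are illustrative.

```java
final class MemstoreBlockingThreshold {
    // Per-region memstore size at which updates are blocked:
    // flush size multiplied by hbase.hregion.memstore.block.multiplier.
    static long blockingBytes(long flushSizeBytes, int blockMultiplier) {
        return flushSizeBytes * blockMultiplier;
    }

    public static void main(String[] args) {
        // ASSUMPTION: 128 MB flush size; check your own hbase-site.xml.
        long flushSize = 128L * 1024 * 1024;
        for (int multiplier : new int[] {2, 4}) {
            long mb = blockingBytes(flushSize, multiplier) / (1024 * 1024);
            System.out.println("multiplier=" + multiplier + " -> blocks at " + mb + " MB");
        }
    }
}
```

Under that assumption, raising the multiplier from 2 to 4 moves the blocking point from 256 MB to 512 MB per region, which gives flushes more room to catch up before writers stall.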
-
Re: more regionservers does not improve performance Jonathan Bishop 2012-10-12, 19:07
Suraj,

Thanks for the quick reply.

I tried various values of hbase.hstore.blockingStoreFiles and hbase.hregion.memstore.block.multiplier, but this did not have any effect I could discern. I am still getting about 5K rows per second regardless of the number of regionservers I am running.

After asking around I discovered we do have Ganglia installed on our grid. Taking a look at some of the machines running my regionservers, I do see a spike in I/O when I run my MR job. I am not sure this is the bottleneck for me though, as the spikes are at about 20-30 MB/sec.

Jon
-
Re: more regionservers does not improve performance Kevin O'dell 2012-10-12, 13:44
Jonathan,

Let's take a deeper look here.

What is your memstore set at for the table/CF in question? Let's compare that value with the flush size you are seeing for your regions. If they are really small flushes, are they all to the same region? If so, that is going to be a schema issue. If they are full flushes, you can raise your memstore size, assuming you have the heap to cover it. If they are smaller flushes but to different regions, you are most likely suffering from global limit pressure and flushing too soon.

Are you flushing prematurely due to HLogs rolling? Look for too many hlogs and look at the flushes. It may benefit you to raise that value.

Are you blocking? As Suraj was saying, you may be blocking in 90-second blocks. Check the RS logs for those messages as well, and then follow Suraj's advice.

This is where I would start to optimize your write path. I hope the above helps.

Kevin O'Dell
Customer Operations Engineer, Cloudera
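The "global limit pressure" check can be estimated on paper before digging through logs. The sketch below divides the heap fraction reserved for all memstores on one regionserver by the number of regions that server hosts. The 8 GB heap and the 100 regions come from the original post; the 0.4 fraction (the usual default for hbase.regionserver.global.memstore.upperLimit in releases of that era) and the even spread of regions and writes are assumptions, and the class name is illustrative.

```java
final class GlobalMemstorePressure {
    // Bytes of heap that all memstores on one regionserver may use together.
    static long globalLimitBytes(long heapBytes, double upperLimitFraction) {
        return (long) (heapBytes * upperLimitFraction);
    }

    // Regions hosted per server if regions are spread evenly (rounded up).
    static int regionsPerServer(int totalRegions, int servers) {
        return (totalRegions + servers - 1) / servers;
    }

    // Average memstore size a region can reach before the global limit forces
    // flushes, assuming every hosted region is written at the same rate.
    static long perRegionShareBytes(long heapBytes, double fraction, int totalRegions, int servers) {
        return globalLimitBytes(heapBytes, fraction) / regionsPerServer(totalRegions, servers);
    }

    public static void main(String[] args) {
        long heap = 8L * 1024 * 1024 * 1024; // -Xmx8g from the original post
        double fraction = 0.4;               // ASSUMPTION: default upper limit
        for (int servers : new int[] {4, 8, 10}) {
            long shareMb = perRegionShareBytes(heap, fraction, 100, servers) / (1024 * 1024);
            System.out.println(servers + " regionservers -> about " + shareMb + " MB per region");
        }
    }
}
```

If the per-region share is at or below the configured flush size, flushes will tend to be triggered by the global limit rather than by individual regions filling up, which is the "smaller flushes but to different regions" pattern described above.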
-
Re: more regionservers does not improve performance Jonathan Bishop 2012-10-12, 19:15
Kevin,

Sorry, I am fairly new to HBase. Can you be specific about what settings I can change, and also where they are specified?

Pretty sure I am not hotspotting, and increasing memstore does not seem to have any effect.

I do not see any messages in my regionserver logs concerning blocking.

I suspect that I am hitting some limit in our grid, but would like to know where that limit is being imposed.

Jon
-
Re: more regionservers does not improve performanceBryan Beaudreault 2012-10-12, 19:46
I recommend turning on debug logging on your region servers. You may need to turn certain packages back down to INFO, because a few of them are spammy, but overall it helps.

You should see messages such as "12/10/09 14:22:57 INFO regionserver.HRegion: Blocking updates for 'IPC Server handler 41 on 60020' on region XXX: memstore size 256.0m is >= than blocking 256.0m size". As you can see, this one is logged at INFO anyway, so you should already be able to see it if it is happening.

You can try increasing the number of IPC handlers and the memstore flush threshold. Also, maybe you are bottlenecked by the WAL. Try put.setWriteToWAL(false), just to see whether it increases performance. If it does, and you want to be a bit safer with regard to the WAL, you can try turning on deferred log flush on your table. Beyond that, I don't really know how to increase WAL performance, if this does seem to have an effect.
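For reference, the 256.0m in that log line is consistent with the blocking threshold being the memstore flush size (hbase.hregion.memstore.flush.size) multiplied by hbase.hregion.memstore.block.multiplier. The sketch below only does that arithmetic in Python. The 128 MB flush size is an assumption inferred from the log line together with the default multiplier of 2 mentioned earlier in the thread, so check your own hbase-site.xml before relying on it.

```python
MB = 1024 * 1024


def blocking_threshold_bytes(flush_size_bytes, block_multiplier):
    """Memstore size at which a region starts blocking updates."""
    return flush_size_bytes * block_multiplier


# Assumed values: a 128 MB flush size with a multiplier of 2 reproduces the
# 256.0m figure in the log line; 4 is the multiplier suggested in the thread.
default_block = blocking_threshold_bytes(128 * MB, 2)
raised_block = blocking_threshold_bytes(128 * MB, 4)

print(default_block // MB)  # 256
print(raised_block // MB)   # 512
```

Under the same assumption, raising the multiplier to 4 moves the blocking point to 512 MB per region, which only helps if the regionserver heap can actually hold that much across all actively written regions.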
-
Re: more regionservers does not improve performance
Suraj Varma 2012-10-13, 02:30
Hi Jonathan:
What specific metric on ganglia did you notice for "IO is spiking"? Is it your disk IO? Is your disk swapping? Do you see cpu iowait spikes?

I see you have given 8g to the RegionServer ... how much RAM is available in total on that node? What heap are the individual mappers & DN set to run with (i.e. check whether you are overallocated on heap when the _mappers_ run ... causing disk swapping ... leading to IO)?

There can be multiple causes ... so you may need to look at ganglia stats and narrow the bottleneck down as described in http://hbase.apache.org/book/casestudies.perftroub.html

Here's a good reference for all the memstore-related tweaks you can try (and also to understand what each configuration means): http://blog.sematext.com/2012/07/16/hbase-memstore-what-you-should-know/

Also, provide more details on your schema (CFs, row size), Put sizes, etc. to see if that triggers an idea from the list.
--S
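The overallocation check described above comes down to simple addition. In this sketch, only the 8g regionserver heap and the 4 mapper slots per node come from the thread; the per-mapper heap, the daemon heaps, and the node RAM sizes are hypothetical placeholders. Real JVMs also use more memory than their -Xmx setting, so treat the committed figure as a lower bound.

```python
def committed_heap_gb(regionserver_gb, mapper_slots, mapper_heap_gb, other_daemons_gb):
    """Sum of the maximum heaps that can be live on one node at once."""
    return regionserver_gb + mapper_slots * mapper_heap_gb + other_daemons_gb


def headroom_gb(node_ram_gb, committed_gb):
    """Positive: RAM left for the OS and page cache. Negative: risk of swapping."""
    return node_ram_gb - committed_gb


# From the thread: 8g regionserver heap, 4 mapper slots per tasktracker.
# Hypothetical: 1 GB per mapper, 2 GB for DataNode + TaskTracker combined.
committed = committed_heap_gb(regionserver_gb=8, mapper_slots=4,
                              mapper_heap_gb=1, other_daemons_gb=2)

print(committed)                   # 14
print(headroom_gb(16, committed))  # 2  (hypothetical 16 GB node)
print(headroom_gb(12, committed))  # -2 (hypothetical 12 GB node)
```

A negative result means the configured heaps alone exceed physical memory, which is the situation where swapping, and with it disk IO, becomes likely once all mapper slots are busy.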
-
Re: more regionservers does not improve performance
Suraj Varma 2012-10-13, 02:49
I'm intrigued by this statement in your first mail:

> What is strange is that I do not get much run time improvement by
> increasing the number regionservers beyond about 4. Indeed, it seems that
> the system runs slower with 8 regionservers than with 4.

So ... are you saying that if you shut down four of your region servers and task trackers right now ... you are able to generate more throughput (requests/sec)? Merely adding more region servers slows things down? Or did you change other things (like more region splits, etc.) between these two states?

Also - your 40 mappers ... are they using TableMapReduceUtil based splits ... or custom splits? Are the mappers going across the network to region servers on other nodes? Or are they all local calls?

Just trying to understand your cluster setup a bit more ...
--Suraj
-
Re: more regionservers does not improve performance
Jonathan Bishop 2012-10-13, 15:55
Suraj,
Yes, things seem to slow down, but this is a noisy cluster with other jobs running, so it is hard to tell. Certainly, I did see speedups when going from 1 to 2 and from 2 to 4 regionservers. There was just no change, or even a slowdown, when going from 4 to 8 regionservers.

What I am most concerned about is that I do not see scalability. I expected 8 regionservers to run twice as fast as 4.

While I did not use TableMapReduceUtil to generate splits, I did generate my own splits and definitely see an even load on the regionservers.

Don't the mappers always go across the network to the regionservers, since the input splits have no relation to the table splits? It would be great if that were not the case, as each split could go to its local regionserver.

Jon
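On the locality question, here is a rough way to see why most puts cross the network. With MD5-hashed row keys and evenly balanced regions, a mapper that shares a node with one of N regionservers can expect at most about 1/N of its puts to land locally. This is a back-of-the-envelope sketch under those assumptions, not a measurement of this cluster.

```python
def remote_put_fraction(num_regionservers):
    """Expected share of puts that leave the mapper's node, assuming keys are
    spread uniformly over regions and regions evenly over regionservers."""
    return 1.0 - 1.0 / num_regionservers


for n in (1, 2, 4, 8):
    print(n, remote_put_fraction(n))
# 1 0.0
# 2 0.5
# 4 0.75
# 8 0.875
```

Under these assumptions, going from 4 to 8 regionservers raises the remote share from 75% to 87.5%, and each batch of puts fans out to more servers. That would be in line with a network-bound job not speeding up, although the thread does not confirm that the network is the limit.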
-
Re: more regionservers does not improve performance
Jonathan Bishop 2012-10-13, 15:58
Suraj,
I bumped my regionservers all the way up to 32g from 8g. They are running on 64g and 128g machines on our cluster. Unfortunately, the machines are all under varying (usually high) load from other users.

In ganglia I do not see any swapping, but that has been known to happen from time to time.

Thanks for your help - I'll take a look at your links.

Jon
-
Re: more regionservers does not improve performance
Matt Corgan 2012-10-14, 05:37
Did you try setting put.setWriteToWAL(false) as Bryan suggested? This may not be what you want in the end, but seeing what happens may help debug.

Matt
-
Re: more regionservers does not improve performance
Jonathan Bishop 2012-10-14, 15:48
Matt,
Yes, I did. What I observed is that the map job proceeds about 3-4x faster for a while. But then I observed long pauses partway through the job, and overall run time was reduced only modestly, say from 50 minutes to 40 minutes.

Just to summarize the issue: my mapper jobs seem to scale nicely. This is expected, as my dfs block size is small enough to create over 500 tasks and I have a max of 40 mappers running. But when I include puts to hbase in my job, I see a 4-6x slowdown which does not respond to an increasing number of regionservers.

My current best guess is that there is a network bottleneck in getting the puts produced by the mappers to the appropriate regionservers, as I assume that once the puts are received by the regionservers they can all operate in parallel without slowing each other down.

Again, I am on a grid which is used by many others, and the machines in my cluster are not dedicated to my job. I am mainly looking at scalability trends when running with various numbers of regionservers.

Jon
-
Re: more regionservers does not improve performance
lars hofhansl 2012-10-15, 01:03
Sorry for jumping in late here.
What's your compaction queue size over time?

It might be that your IO system just cannot keep up with the load. HBase will buffer data in the memstore, but eventually this data has to make it to disk, and then you end up with a lot of storefiles that need to be compacted. If the compactions cannot keep up, flushes are blocked, and HBase will block client writes (what else can it do?). A large compaction queue size is a good indicator of that.

BTW, HBASE-6974 will add a separate metric for these blocked writes, so that they can be tracked more directly.

-- Lars

________________________________
From: Jonathan Bishop <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Sunday, October 14, 2012 8:48 AM
Subject: Re: more regionservers does not improve performance

Matt,

Yes, I did. What I observed is that the map job proceeds about 3-4x faster for a while. But then I observed long pauses partway through the job, and overall run time was reduced only modestly, say from 50 minutes to 40 minutes.

Just to summarize the issue, my mapper jobs seem to scale nicely. This is expected as my dfs block size is small enough to create over 500 tasks, and I have a max of 40 mappers running.

But when I include puts to hbase in my job, then I see a 4-6x slowdown which does not respond to an increasing number of regionservers.

My current best guess is that there is a network bottleneck in getting the puts produced by the mappers to the appropriate regionservers, as I assume that once the puts are received by the regionservers that they can all operate in parallel without slowing each other down.

Again, I am on grid which is used by many others, and the machines in my cluster are not dedicated to my job. I am mainly looking at scalability trends when running with various numbers of regionservers.

Jon

On Sat, Oct 13, 2012 at 10:37 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
> Did you try setting put.setWriteToWAL(false) as Bryan suggested? This may
> not be what you want in the end, but seeing what happens may help debug.
>
> Matt
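The back-pressure chain Lars describes (memstore fills, flushes create storefiles, compactions fall behind, client writes block) can be illustrated with a toy model. This is only a sketch under made-up assumptions, not HBase's actual scheduler: the function name `simulate_backpressure`, the tick-based timing, and all the rates and thresholds below are hypothetical values chosen for illustration.

```python
def simulate_backpressure(ticks, ingest_mb, flush_mb, compact_interval, blocking_files):
    """Toy model of one region: returns (blocked_ticks, storefile_history).

    ingest_mb        -- MB the client tries to write per tick (assumed constant)
    flush_mb         -- memstore size that triggers a flush to one storefile
    compact_interval -- compaction retires one storefile every N ticks (assumed)
    blocking_files   -- storefile count at which writes are blocked (assumed)
    """
    memstore_mb = 0
    storefiles = 0
    blocked_ticks = 0
    history = []
    for tick in range(1, ticks + 1):
        # Compaction makes progress at a fixed rate set by the IO system.
        if tick % compact_interval == 0 and storefiles > 0:
            storefiles -= 1
        if storefiles >= blocking_files:
            # Too many storefiles: the write for this tick is blocked.
            blocked_ticks += 1
        else:
            memstore_mb += ingest_mb
            if memstore_mb >= flush_mb:
                # Flush: the memstore contents become a new storefile.
                memstore_mb -= flush_mb
                storefiles += 1
        history.append(storefiles)
    return blocked_ticks, history


# Fast IO: compaction keeps up, the storefile backlog stays small.
fast_blocked, fast_hist = simulate_backpressure(200, 64, 256, 2, 7)
# Slow IO: compaction lags, the backlog grows until writes are blocked.
slow_blocked, slow_hist = simulate_backpressure(200, 64, 256, 10, 7)

print("fast IO: blocked ticks =", fast_blocked, "max backlog =", max(fast_hist))
print("slow IO: blocked ticks =", slow_blocked, "max backlog =", max(slow_hist))
```

In this model the only thing that differs between the two runs is how quickly compaction drains storefiles, which is why watching the compaction queue over time is a reasonable first check.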
-
Re: more regionservers does not improve performance
Michel Segel 2012-10-15, 11:41

Here's a key statement:

> Again, I am on grid which is used by many others, and the machines in my
> cluster are not dedicated to my job. I am mainly looking at scalability
> trends when running with various numbers of regionservers.

What do you notice when you monitor the cluster in ganglia?

Sent from a remote device. Please excuse any typos...

Mike Segel
-
Re: more regionservers does not improve performance
Matt Corgan 2012-10-15, 17:23

You can change table settings in the shell. To start the shell:

    $HBASE_HOME/bin/hbase shell

To see some examples for the "alter" command, just type it without any arguments. Here's an example for this case:

    describe 'MyTable'
    disable 'MyTable'
    alter 'MyTable', METHOD => 'table_att', MEMSTORE_FLUSHSIZE => '268435456'
    describe 'MyTable'
    enable 'MyTable'

On Mon, Oct 15, 2012 at 4:41 AM, Michel Segel <[EMAIL PROTECTED]> wrote:
> Here's a key statement:
> Again, I am on grid which is used by many others, and the machines in my
> cluster are not dedicated to my job. I am mainly looking at scalability
> trends when running with various numbers of regionservers.
>
> What do you notice when you monitor the cluster in ganglia?
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
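The MEMSTORE_FLUSHSIZE value in the alter command is a byte count: 268435456 is 256 x 1024 x 1024, which matches the "at least 256M" suggested earlier in the thread. As a small sketch for working out other sizes, the helper names `flush_size_bytes` and `alter_command` below are hypothetical and only build the shell line as text; they do not talk to HBase.

```python
def flush_size_bytes(megabytes):
    """Convert a size in MB (1 MB = 1024 * 1024 bytes) to a byte count."""
    return megabytes * 1024 * 1024


def alter_command(table, megabytes):
    """Build the shell 'alter' line from the message above for a given size."""
    return ("alter '%s', METHOD => 'table_att', MEMSTORE_FLUSHSIZE => '%d'"
            % (table, flush_size_bytes(megabytes)))


# Reproduces the line from the message for a 256 MB flush size.
print(alter_command("MyTable", 256))
```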
-
Re: more regionservers does not improve performance
Matt Corgan 2012-10-15, 00:48

It could be network bound, especially if you have decently sized values (~500B+). HBase can be rough on the network because each value travels from client to regionserver, and then makes 2 additional network hops in the WAL, and then an additional 2 hops in the memstore flush, plus ongoing compactions. After disabling the WAL, enabling GZIP compression on the table can cut down on the flush/compaction impact if your data is compressible.

How long are your row keys and values, and how many cells do you have per row? Longer keys would point towards internal limitations in hbase (locking, cpu usage, etc), while longer values indicate network and disk limitations.

Another consideration is that your workload may be too even and is not given enough time to find steady state. If you have 12-25 regions per server and your workload is perfectly randomized, then all regions will hit the memstore flush size simultaneously, which triggers 12-25 memstore flushes at the same time. The memstore flusher may be single threaded (I forget), so you are suddenly hitting the blocking storefile limit, which could explain the pauses you are seeing. You could try reducing the number of regions to ~4/server. And make sure your memstore flush size is at least 256M.

Matt

On Sun, Oct 14, 2012 at 8:48 AM, Jonathan Bishop <[EMAIL PROTECTED]> wrote:
> Matt,
>
> Yes, I did. What I observed is that the map job proceeds about 3-4x faster
> for a while. But then I observed long pauses partway through the job, and
> overall run time was reduced only modestly, say from 50 minutes to 40
> minutes.
>
> Just to summarize the issue, my mapper jobs seem to scale nicely. This is
> expected as my dfs block size is small enough to create over 500 tasks, and
> I have a max of 40 mappers running.
>
> But when I include puts to hbase in my job, then I see a 4-6x slowdown
> which does not respond to an increasing number of regionservers.
>
> My current best guess is that there is a network bottleneck in getting the
> puts produced by the mappers to the appropriate regionservers, as I assume
> that once the puts are received by the regionservers that they can all
> operate in parallel without slowing each other down.
>
> Again, I am on grid which is used by many others, and the machines in my
> cluster are not dedicated to my job. I am mainly looking at scalability
> trends when running with various numbers of regionservers.
>
> Jon
>
> On Sat, Oct 13, 2012 at 10:37 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
>
> > Did you try setting put.setWriteToWAL(false) as Bryan suggested? This may
> > not be what you want in the end, but seeing what happens may help debug.
> >
> > Matt
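Matt's hop counts from this message (one hop from client to regionserver, 2 more for the WAL, 2 more for the memstore flush) can be turned into a rough back-of-the-envelope estimate. This is a sketch only: the function name `network_bytes_per_put` is hypothetical, compaction traffic is ignored, and the 0.5 compression ratio in the example is an assumed figure, not a measurement.

```python
def network_bytes_per_put(value_bytes, wal_enabled=True,
                          flush_compression_ratio=1.0,
                          wal_hops=2, flush_hops=2):
    """Estimate bytes sent over the network for one value.

    Uses the hop counts given in the message: client -> regionserver (1),
    plus wal_hops for the WAL, plus flush_hops for the memstore flush.
    flush_compression_ratio scales only the flush traffic
    (1.0 = uncompressed). Ongoing compactions are not modeled.
    """
    hops = 1.0                      # client -> regionserver
    if wal_enabled:
        hops += wal_hops            # WAL replication
    hops += flush_hops * flush_compression_ratio
    return value_bytes * hops


value = 500  # the ~500B value size mentioned in the message
print("WAL on, no compression: ", network_bytes_per_put(value))
print("WAL off, no compression:", network_bytes_per_put(value, wal_enabled=False))
print("WAL off, assumed 2:1 compression:",
      network_bytes_per_put(value, wal_enabled=False, flush_compression_ratio=0.5))
```

Under these assumptions, disabling the WAL and compressing flushed data each remove a share of the per-value traffic, but the client-to-regionserver hop always remains.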
-
Re: more regionservers does not improve performance
Jonathan Bishop 2012-10-15, 02:42

Thanks Matt,

I have 10 regions per regionserver (100 splits over 10 regionservers), and yes, they all seem to almost stop at the same time. I'll try splitting the table into fewer regions as you suggest.

Where do I set the memstore flush size? Sorry, I'm pretty new to this.

Jon

On Sun, Oct 14, 2012 at 5:48 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
> It could be network bound, especially if you have decently sized values
> (~500B+). HBase can be rough on the network because each value travels
> from client to regionserver, and then makes 2 additional network hops in
> the WAL, and then an additional 2 hops in the memstore flush, plus ongoing
> compactions. After disabling the WAL, enabling GZIP compression on the
> table can cut down on the flush/compaction impact if your data is
> compressible.
>
> How long are your row keys and values, and how many cells do you have per
> row? Longer keys would point towards internal limitations in hbase
> (locking, cpu usage, etc), while longer values indicate network and disk
> limitations.
>
> Another consideration is that your workload may be too even and is not
> given enough time to find steady state. If you have 12-25 regions per
> server and your workload is perfectly randomized, then all regions will hit
> the memstore flush size simultaneously which triggers 12-25 memstore
> flushes at the same time. The memstore flusher may be single threaded (i
> forget), so you are suddenly hitting the blocking storefile limit which
> could explain the pauses you are seeing. You could try reducing the number
> of regions to ~4/server. And make sure your memstore flush size is at
> least 256M.
>
> Matt
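The "all regions stop at the same time" effect can be reproduced with a small simulation. This is a sketch under simplifying assumptions, not a model of HBase internals: every put is the same size, keys are spread uniformly at random over the regions of one server (as MD5-hashed row keys would roughly do), and the name `first_flush_times` and the threshold of 1000 puts per flush are made up for illustration. The 10 regions per server comes from Jon's message.

```python
import random


def first_flush_times(regions, puts_per_flush, seed=0):
    """For each region, return how many puts the server had received in
    total when that region's memstore first reached the flush threshold."""
    rng = random.Random(seed)       # fixed seed so runs are repeatable
    counts = [0] * regions
    first_flush = [None] * regions
    total_puts = 0
    remaining = regions
    while remaining:
        total_puts += 1
        r = rng.randrange(regions)  # uniformly random key -> random region
        counts[r] += 1
        if first_flush[r] is None and counts[r] >= puts_per_flush:
            first_flush[r] = total_puts
            remaining -= 1
    return first_flush


times = first_flush_times(regions=10, puts_per_flush=1000)
spread = (max(times) - min(times)) / max(times)
print("first region flushes at put #%d, last at put #%d" % (min(times), max(times)))
print("all first flushes fall within %.1f%% of the fill period" % (100 * spread))
```

Because a uniform workload fills every memstore at nearly the same rate, the first flushes cluster in a narrow window instead of being spread across the fill period, which is consistent with the simultaneous pauses described above.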
|