-
region size/count per regionserver
Sujee Maniyam 2011-11-01, 17:54
Hi all,

My HBase cluster is 10 nodes; each node has 12 cores, 48 GB RAM, 24 TB disk, and 10 Gigabit Ethernet.
My region size is 1 GB.

Any guidelines on how many regions a RS can handle comfortably?
I vaguely remember reading somewhere to have no more than 1000 regions / server; that comes to 1 TB / server, which seems pretty low for the current hardware config.

Any rules of thumb? Experiences?

Thanks,
Sujee

http://sujee.net
-
Re: region size/count per regionserver
Jean-Daniel Cryans 2011-11-01, 18:22
These days I think the recommendation is more like 20 regions per region server, with the region size set accordingly. The major caveat is that when you start compacting the bigger store files you can take a massive IO hit, so most of the time major compactions are tuned to run only once a week, or are disabled and run manually during low traffic. The FB peeps did a lot of experimenting and that's what they are currently running; they also contributed back a few optimizations for compactions in 0.92.

In our case we have a pretty old setup and had way too many regions, so we ran a few online merges to bring this down to about 80 regions/RS, and it's working pretty well.

J-D
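As a rough illustration of the advice above (few, large regions and infrequent major compactions), here is a minimal Python sketch that computes the raw values for two HBase settings: `hbase.hregion.max.filesize` (bytes) and `hbase.hregion.majorcompaction` (milliseconds between major compactions, where 0 turns periodic major compaction off). The 20 GiB region size and the 7-day interval are example inputs only, and the helper name `region_settings` is made up for this sketch.

```python
# Sketch: derive HBase property values for large regions and rare major compactions.
GIB = 1024 ** 3
MS_PER_DAY = 24 * 60 * 60 * 1000


def region_settings(region_size_gib, major_compaction_days):
    """Return property values; 0 days means periodic major compaction is off."""
    return {
        # Maximum store file size before a region is split, in bytes.
        "hbase.hregion.max.filesize": region_size_gib * GIB,
        # Interval between automatic major compactions, in milliseconds.
        "hbase.hregion.majorcompaction": major_compaction_days * MS_PER_DAY,
    }


# Example inputs (assumptions): 20 GiB regions, weekly major compaction.
settings = region_settings(region_size_gib=20, major_compaction_days=7)
for name, value in settings.items():
    print(f"{name} = {value}")
```

Passing `major_compaction_days=0` gives the "disabled, run manually" variant mentioned above.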
-
Re: region size/count per regionserver
Sujee Maniyam 2011-11-01, 21:34
> optimizations for compactions in 0.92. In our case we have a pretty
> old setup and had way too many regions so we ran a few online merges
> to bring this down to like 80 regions/RS and it's working pretty well.

J-D,
what is the region size you use?
And is it 80 regions / table / region server, or 80 regions / all tables / region server?
-
Re: region size/count per regionserver
Jean-Daniel Cryans 2011-11-01, 21:40
On Tue, Nov 1, 2011 at 2:34 PM, Sujee Maniyam <[EMAIL PROTECTED]> wrote:
> what is the region size you use?

20GB, less in some cases for small tables.

> and is it 80 regions / table / region-server? or 80 regions / all tables
> / regionserver?

80 regions total / RS

J-D
-
Re: region size/count per regionserver
Sujee Maniyam 2011-11-01, 21:46
> > what is the region size you use?
>
> 20GB, less in some cases for small tables.

20GB, compressed? If so, is it LZO or Snappy?
-
Re: region size/count per regionserver
Jean-Daniel Cryans 2011-11-01, 22:02
On Tue, Nov 1, 2011 at 2:46 PM, Sujee Maniyam <[EMAIL PROTECTED]> wrote:
> 20GB, compressed ? If so is it LZO or Snappy?

The region size is expressed in terms of size on disk; in our case it's LZOed.

J-D
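To put the numbers from this exchange side by side, the small Python sketch below multiplies regions per server by on-disk region size. It assumes every region is at its full configured size and ignores HDFS replication; the function name is hypothetical, and 1 TB is taken as 1000 GB.

```python
def on_disk_per_server_tb(regions_per_server, region_size_gb):
    """On-disk data held by one region server, in TB (1 TB = 1000 GB here)."""
    return regions_per_server * region_size_gb / 1000


# Figures quoted in the thread; full regions are an assumption of this sketch.
original_plan = on_disk_per_server_tb(1000, 1)  # 1000 regions x 1 GB
merged_setup = on_disk_per_server_tb(80, 20)    # 80 regions x 20 GB (LZO, on disk)
print(original_plan, merged_setup)
```

Under those assumptions the 80-region setup holds more data per server than the 1000-region plan, with far fewer regions for the master to track.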
-
Re: region size/count per regionserver
Nicolas Spiegelberg 2011-11-01, 22:57
Simple answer
-------------
20 regions/server & <2000 regions/cluster is a good rule of thumb if you can't profile your workload yet. You really want to balance two things:

1) You need to limit the regions/cluster so the master can have a reasonable startup time & can handle all the region state transitions via ZK. Most bigger companies are running 2,000 in production and achieve reasonable startup times (< 2 minutes for region assignment on cold start). If you want to test the scalability of that algorithm beyond what other companies need, admin beware.
2) The more regions/server you have, the faster recovery can happen after RS death, because you can currently parallelize recovery at region granularity. Too many regions/server and #1 starts to be a problem.

Complicated answer
------------------
More information is needed to optimize this formula. Additional considerations:

1) Are you IO-bound or CPU-bound?
2) What is your grid topology like?
3) What is your network hardware like?
4) How many disks (not just size)?
5) What is the data locality between RegionServer & DataNode?

In the Facebook case, we have 5 racks with 20 nodes each. Servers in the rack are connected by 1G Eth to a switch with a 10G uplink. We are network bound. Our saturation point is most commonly on the top-of-rack switch. With 20 regions/server, we can roughly parallelize our distributed log splitting within a single rack on RS death (although 2 regions do split off-rack). This minimizes top-of-rack traffic and optimizes our recovery time. Even if you are CPU-bound, log splitting (hence recovery time) is an IO-bound operation. A lot of our work on region assignment is about maximizing data locality, even on RS death, so we avoid top-of-rack saturation.
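The rule of thumb in the message above can be written as a quick sanity check. This is only a sketch in Python: the function name and warning text are invented here, and treating 2,000 regions/cluster as an inclusive ceiling is an assumption (the message says "<2000" but also mentions 2,000 running in production).

```python
# Rule-of-thumb limits quoted in the message above.
MAX_REGIONS_PER_SERVER = 20
# Assumption of this sketch: 2,000 is treated as an inclusive ceiling.
MAX_REGIONS_PER_CLUSTER = 2000


def check_layout(servers, regions_per_server):
    """Return (total_regions, warnings) for a proposed cluster layout."""
    total_regions = servers * regions_per_server
    warnings = []
    if regions_per_server > MAX_REGIONS_PER_SERVER:
        warnings.append(
            f"{regions_per_server} regions/server exceeds {MAX_REGIONS_PER_SERVER}"
        )
    if total_regions > MAX_REGIONS_PER_CLUSTER:
        warnings.append(
            f"{total_regions} regions/cluster exceeds {MAX_REGIONS_PER_CLUSTER}"
        )
    return total_regions, warnings


# Facebook layout from the message: 5 racks x 20 nodes, 20 regions each.
facebook = check_layout(servers=5 * 20, regions_per_server=20)
# Original question: 10 nodes, up to 1000 regions per server.
original = check_layout(servers=10, regions_per_server=1000)
print(facebook)
print(original)
```

The check says nothing about the "complicated answer" factors (IO vs. CPU bound, topology, locality); those still need profiling on the actual workload.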
-
Re: region size/count per regionserver
lars hofhansl 2011-11-03, 02:10
Do we know what would need to change in HBase in order to be able to manage more regions per regionserver?
With 20 regions per server, one would need 300G regions just to utilize 6T of drive space.

To utilize a regionserver/datanode with 24T of drive space, the region size would be an insane 1T.

-- Lars
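The arithmetic behind those figures, as a small Python sketch. It assumes, as the message does, that all of the drive space is filled by the server's own regions (no allowance for HDFS replication or free-space headroom); the function name is hypothetical and 1 TB is taken as 1000 GB.

```python
def region_size_needed_gb(disk_tb, regions_per_server=20):
    """Region size (GB) needed to fill disk_tb with a fixed number of regions."""
    return disk_tb * 1000 / regions_per_server


size_for_6tb = region_size_needed_gb(6)    # 300 GB per region
size_for_24tb = region_size_needed_gb(24)  # 1200 GB, i.e. roughly 1 TB per region
print(size_for_6tb, size_for_24tb)
```

If the drives also hold HDFS replicas of other servers' data, the unique data per server is lower and the required region size shrinks in proportion.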
-
Re: region size/count per regionserver
Nicolas Spiegelberg 2011-11-03, 02:55
Region scalability is definitely an investigation item that has not been covered yet. We solved the problem with horizontal sharding into multiple clusters instead of tackling that subject in the timeframe we had. I'm guessing the 2-level ROOT/META was a response to that problem. On the actual region count / data size, that all depends on how high you want to scale your StoreFile size. 10GB StoreFiles are currently normal / reasonable.
-
Re: region size/count per regionserver
Doug Meil 2011-11-03, 13:25
Nicolas, when you say 10GB StoreFiles are normal and reasonable, which HBase codeline are you referring to, and which HFile format (i.e., v1 vs. v2)? Are you referring to .89, .90 or .92? I think it would be useful to qualify that.

Thanks!
-
Re: region size/count per regionserver
Sujee Maniyam 2011-11-03, 18:32
On Tue, Nov 1, 2011 at 3:57 PM, Nicolas Spiegelberg <[EMAIL PROTECTED]> wrote:
> In the Facebook case, we have 5 racks with 20 nodes each. Servers in the
> rack are connected by 1G Eth to a switch with a 10G uplink. We are

Nicolas, thanks for sharing.
What is the region size for this FB cluster (20 regions / server --> 2000 regions / cluster)? Compressed?
-
Re: region size/count per regionserver
Michel Segel 2011-11-04, 11:37
The funny thing about tuning... What works for one situation may not work well for others.
Using the old recommendation of never exceeding 1000 R per RS, keeping it low at around 100-200, monitoring tables, and changing the region size on a table-by-table basis, we are doing OK.
(Of course there are other nasty bugs that kill us... but that's a different thread...)

The point is that you need to decide what makes sense for you and what trade-offs you can live with...

Just my two cents...

Sent from a remote device. Please excuse any typos...

Mike Segel
-
Re: region size/count per regionserver
Mikael Sitruk 2011-11-04, 12:32
I think a little more is needed than just stating 1000 per RS, or 100-200 and monitor tables.
The reason is simple: it is very difficult for IT to work with such a statement, especially if we need to sell a product based on HBase technology.
For a specific deployment it is important to figure out:
* How many RS are needed
* How much memory to assign to each RS
* How many regions per RS
* How many connections an HBase cluster can handle

I'm not an IT guy, but I know that such questions will have to be answered.

Mikael.S