Mohit Anchlia 2012-10-06, 00:47
Do most people start out with default values and then tune HBase? Or are there some important configuration parameters that should always be changed on the client and the server?
Michael Segel 2012-10-06, 00:54
Depends. What sort of system are you tuning?
Sorry, but we have to start somewhere and if we don't know what you have in terms of hardware, we don't have a good starting point.
Kevin O'dell 2012-10-06, 01:05
Mohit,
Michael is right: most parameters go one way or the other depending on what you are trying to accomplish.
Memstore - raise for write-heavy workloads
Blockcache - raise for read-heavy workloads
HBase block size - higher for sequential workloads, lower for random access
Client caching - lower for really wide rows/large cells, higher for tall tables/small cells
etc.
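
As a rough illustration of the client caching guideline above, here is a minimal Java sketch that picks a rows-per-RPC value from an estimated row size. It uses only the standard library. The helper name `suggestCaching`, the 2 MiB budget, the 100-byte cell size, and the cap of 1000 rows are all assumptions made for the example, not HBase defaults or recommendations. The resulting number is the kind of value one would pass to `Scan.setCaching(int)` or set as `hbase.client.scanner.caching`.

```java
// Sketch only: a hypothetical heuristic, not part of the HBase API.
final class ScannerCachingSketch {

    /**
     * Suggests how many rows to fetch per scanner RPC so that one batch
     * stays within a byte budget. Wide rows yield a low value, narrow
     * rows a high value (capped at maxRows).
     */
    static int suggestCaching(long avgRowBytes, long budgetBytes, int maxRows) {
        if (avgRowBytes <= 0 || budgetBytes <= 0 || maxRows < 1) {
            throw new IllegalArgumentException("sizes and cap must be positive");
        }
        long rows = budgetBytes / avgRowBytes;          // rows that fit in the budget
        return (int) Math.max(1, Math.min(maxRows, rows));
    }

    public static void main(String[] args) {
        long budget = 2L * 1024 * 1024;                 // assumed 2 MiB per RPC
        long cellBytes = 100;                           // assumed average cell size

        long wideRow = 1000 * cellBytes;                // ~1000 columns per row
        long narrowRow = 5 * cellBytes;                 // a tall table with few columns

        System.out.println("wide rows:   caching = " + suggestCaching(wideRow, budget, 1000));
        System.out.println("narrow rows: caching = " + suggestCaching(narrowRow, budget, 1000));
    }
}
```

With these assumed sizes the wide-row case comes out far lower than the narrow-row case, which matches the direction of the guideline. The real numbers depend on actual cell sizes and on client and RegionServer heap, so treat the output as a starting point to measure against.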
--
Kevin O'Dell
Customer Operations Engineer, Cloudera
Mohit Anchlia 2012-10-06, 02:16
I have time-series data, and each row has up to 1000 columns. I started with the defaults and have not tuned any parameters on the client or server. Each read fetches all the columns in a row, but which row gets requested is completely random.
Amandeep Khurana 2012-10-06, 04:21
Mohit,
Getting the maximum performance out of HBase isn't just about tuning the cluster. There are several other factors to take into account. The two most important are:
1. Schema design, which is the most important factor
2. How you are using the APIs
Starting with the default configs is okay. Are you getting performance or stability issues? If yes, start by knocking those out.
-Amandeep
PS: I have covered several tuning concepts in HBase in Action, and there is plenty of information available in the online HBase manual and in Lars' book as well. Refer to those if you want to understand the more general concepts at play.