# Without meta on master, we double assign and lose data.
That is currently a fact that I have seen over and over on multiple loaded
clusters. Trading some abstract cleanliness of deployment against losing
data is a no-brainer for me. Master assignment, region split, and region
merge are all risky, and all places where HBase can lose data. Meta being
hosted on the master makes that communication easier and less flaky. Run
ITBLL in a loop that creates a new table every time: without meta on master,
everything will fail pretty reliably in ~2 days. With meta on master things
pass MUCH more reliably.
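For what it's worth, the loop I'm describing can be sketched in shell along
these lines. This is a sketch, not a tested harness: the `Loop` argument
order and the table-name property are assumptions to check against your
HBase version's `hbase-it` usage string.

```shell
#!/usr/bin/env bash
# Sketch: run ITBLL repeatedly, with a fresh table each iteration.
# DRY_RUN=1 (the default here) only prints the commands; unset it to
# actually run against a cluster.
DRY_RUN=${DRY_RUN:-1}
runs=0
for i in 1 2 3; do
  table="ITBLL_run_${i}"    # new table every iteration
  # Loop args (<iterations> <mappers> <nodes/mapper> <output dir> <reducers>)
  # and the table-name property are assumptions; check your version.
  cmd="hbase org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList \
-DIntegrationTestBigLinkedList.table=${table} \
Loop 1 4 1000000 /tmp/itbll_${i} 2"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"
  else
    $cmd
  fi
  runs=$((runs + 1))
done
```

On a real cluster you would leave this running for days and watch for
verification failures, which is where the ~2 day number above comes from.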
# Master hosting the system tables puts them as close as possible to the
machine that will be mutating the data.
Data locality is something that we all work for: short-circuit local reads,
caching blocks in the JVM, etc. Bringing data closer to the interested party
has a long history of making things faster and better. Master is in charge
of just about all mutations of all system tables. It's in charge of
changing meta, changing ACLs, creating new namespaces, etc. So put the
memstore as close as possible to the system that's going to mutate meta.
# If you want to make meta faster then moving it to other regionservers
makes things worse.
Meta can get pretty hot. Putting it alongside other regions that clients
will be trying to access makes everything worse: it means meta is competing
with user requests. Either meta gets served and user requests don't, which
drives clients to pile even more requests onto meta; or requests to user
regions get served and clients waiting on meta get starved.
At FB we've seen read throughput to meta double or more by moving it to
master. Writes to meta are also much faster since there's no RPC hop, no
queueing, no fighting with reads. So far it has been the single biggest
thing to make meta faster.
On Thu, Apr 7, 2016 at 10:11 PM, Stack <[EMAIL PROTECTED]> wrote: