> Jonathan Hsieh, WAL per region (WALpr) would give you the locality (and
> hence HDFS short circuit) of reads if you were to couple it with the
> favored nodes. The cost is of course more WAL files... In the current
> situation (no WALpr) it would create quite some traffic cross machine, no?
I think we all agree that a WAL per region isn't efficient in today's
spinning-hard-drive world, where we are limited to a relatively low budget
of seeks (though it may be better in the future with SSDs).
With this in mind, I'm actually making the case that we would group all
the regions from RS-A onto the same set of preferred region servers. This
way we only need one or two other RSs tailing RS-A's log.
So for example, if regions X, Y, and Z were on RS-A and its hlog, the
shadow region memstores for X, Y, and Z would be assigned to the same one
or two other RSs. Ideally these would be where the HLog file's replicas
have locality (helped by favored nodes/block affinity). Doing this, we
hold the number of readers on each active hlog to a constant and do not
add any new cross-machine traffic (though tailing currently has costs on
the NN).
One inefficiency we have is that if there is a single log per RS, we end
up reading log entries for tables that may not have the shadow feature
enabled. However, with HBase multi-wals coming, one strategy is to shard
the wals to a number on the order of the number of disks on a machine
(12-24 these days). I think a wal per namespace (which could also be used
to get a wal per table) would make sense. Sharding the hlog this way would
reduce the amount of irrelevant log entries read in a log-tailing scheme.
It would have the added benefit of reducing log splitting work, which
reduces MTTR, and would allow for recovery priorities if the primaries and
shadows also go down. (This is a generalization of the idea of separating
META out into its own log.)
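As a rough illustration of that sharding (hypothetical code, not the
actual multi-wal implementation), routing a namespace to one of N wal
groups could be as simple as a hash, with N chosen around the disk count:

  // Hypothetical sketch: route each namespace (or table) to one of N wal
  // groups, where N is on the order of the machine's disk count (12-24).
  public class WalGroupSketch {
    private final int numWalGroups;

    public WalGroupSketch(int numWalGroups) {
      this.numWalGroups = numWalGroups;
    }

    // A shadow tailer for a given table then only reads the single wal
    // group its namespace hashes to, skipping unrelated tables' entries.
    public int walGroupFor(String namespace) {
      return Math.floorMod(namespace.hashCode(), numWalGroups);
    }
  }

With per-namespace groups, splitting a failed RS's logs can also be
ordered per group (e.g. the group carrying META first), which is where
the recovery-priority point above comes from.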
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [EMAIL PROTECTED]