Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Accumulo >> mail # dev >> loggers


On Mon, Jan 30, 2012 at 10:53 AM, Eric Newton <[EMAIL PROTECTED]> wrote:
> I will definitely look at using HBase's WAL.  What did you guys do to
> distribute the log-sort/split?

In 0.90 the log sorting was done serially by the master, which as you
can imagine was slow after a full outage.
In 0.92 we have distributed log splitting:
https://issues.apache.org/jira/browse/hbase-1364

I don't think there is a single design doc for it anywhere, but
basically we use ZK as a distributed work queue. The master shoves
items in there, and the region servers try to claim them. Each item
corresponds to a log segment (the HBase WAL rolls periodically). After
the segments are split, they're moved into the appropriate region
directories, so when the region is next opened, they are replayed.

-Todd

>
> On Mon, Jan 30, 2012 at 1:37 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
>
>> On Mon, Jan 30, 2012 at 10:05 AM, Aaron Cordova
>> <[EMAIL PROTECTED]> wrote:
>> >> The big problem is in the fact that writing replicas in HDFS is done in
>> a pipeline, rather than in parallel. There is a ticket to change this
>> (HDFS-1783), but no movement on it since last summer.
>> >
>> > ugh - why would they change this? Pipelining maximizes bandwidth usage.
>> It'd be cool if the log stream could be configured to return after written
>> to one, two, or more nodes though.
>> >
>>
>> The JIRA proposes to allow "star replication" instead of "pipeline
>> replication" on a per-stream basis. Pipelining trades off latency for
>> bandwidth -- multiple RTTs instead of 1 RTT.
>>
>> A few other notes relevant to the discussion above (sorry for losing
>> the quote history):
>>
>> Regarding HDFS's being designed for large sequential writes rather
>> than small records, that was originally true, but now its actually
>> fairly efficient. We have optimizations like HDFS-895 specifically for
>> the WAL use case which approximate things like group commit, and when
>> you combine that with group commit at the tablet-server level you can
>> get very good throughput along with durability guarantees. I haven't
>> benchmarked vs Accumulo's Loggers ever, but I'd be surprised if the
>> difference were substantial - we tend to be network bound on the WAL
>> unless the edits are really quite tiny.
>>
>> We're also looking at making our WAL implementation pluggable: see
>> HBASE-4529. Maybe a similar approach could be taken in Accumulo such
>> that HBase could use Accumulo loggers, or Accumulo could use HBase's
>> existing WAL class?
>>
>> -Todd
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>

--
Todd Lipcon
Software Engineer, Cloudera