Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
HBase >> mail # user >> Replication not suited for intensive write applications?


Copy link to this message
-
Re: Replication not suited for intensive write applications?
What is the ageOfLastShippedOp as reported on your Master region servers
(should be available through the /jmx) - it tells the delay your edits are
experiencing before being shipped. If this number is < 1000 (in
milliseconds), I would say replication is doing a very good job. This is
the most important metric worth tracking and I would be interested in how
it looks since we are also looking into using replication for write heavy
workloads...

The network on your 2nd cluster could be lower because replication ships
edits in batches - so the batching could be amortizing the amount of data
sent over the wire. Also, when you are measuring traffic - are you
measuring the traffic on the NIC - which will also include traffic due to
HDFS replication ?
On Thu, Jun 20, 2013 at 3:46 AM, Asaf Mesika <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I've been conducting lots of benchmarks to test the maximum throughput of
> replication in HBase.
>
> I've come to the conclusion that HBase replication is not suited for write
> intensive application. I hope that people here can show me where I'm wrong.
>
> *My setup*
> *Cluster (*Master and slave are alike)
> 1 Master, NameNode
> 3 RS, Data Node
>
> All computers are the same: 8 Cores x 3.4 GHz, 8 GB Ram, 1 Gigabit ethernet
> card
>
> I insert data into HBase from a java process (client) reading files from
> disk, running on the machine running the HBase Master in the master
> cluster.
>
> *Benchmark Results*
> When the client writes with 10 Threads, then the master cluster writes at
> 17 MB/sec, while the replicated cluster writes at 12 Mb/sec. The data size
> I wrote is 15 GB, all Puts, to two different tables.
> Both clusters when tested independently without replication, achieved write
> throughput of 17-19 MB/sec, so evidently the replication process is the
> bottleneck.
>
> I also tested connectivity between the two clusters using "netcat" and
> achieved 111 MB/sec.
> I've checked the usage of the network cards both on the client, master
> cluster region server and slave region servers. No computer when over
> 30mb/sec in Receive or Transmit.
> The way I checked was rather crud but works: I've run "netstat -ie" before
> HBase in the master cluster starts writing and after it finishes. The same
> was done on the replicated cluster (when the replication started and
> finished). I can tell the amount of bytes Received and Transmitted and I
> know that duration each cluster worked, thus I can calculate the
> throughput.
>
>  *The bottleneck in my opinion*
> Since we've excluded network capacity, and each cluster works at faster
> rate independently, all is left is the replication process.
> My client writes to the master cluster with 10 Threads, and manages to
> write at 17-18 MB/sec.
> Each region server has only 1 thread responsible for transmitting the data
> written to the WAL to the slave cluster. Thus in my setup I effectively
> have 3 threads writing to the slave cluster.  Thus this is the bottleneck,
> since this process can not be parallelized, since it must transmit the WAL
> in a certain order.
>
> *Conclusion*
> When writes intensively to HBase with more than 3 threads (in my setup),
> you can't use replication.
>
> *Master throughput without replication*
> On a different note, I have one thing I couldn't understand at all.
> When turned off replication, and wrote with my client with 3 threads I got
> throughput of 11.3 MB/sec. When I wrote with 10 Threads (any more than that
> doesn't help) I achieved maximum throughput of 19 MB/sec.
> The network cards showed 30MB/sec Receive and 20MB/sec Transmit on each RS,
> thus the network capacity was not the bottleneck.
> On the HBase master machine which ran the client, the network card again
> showed Receive throughput of 0.5MB/sec and Transmit throughput of 18.28
> MB/sec. Hence it's the client machine network card creating the bottleneck.
>
> The only explanation I have is the synchronized writes to the WAL. Those 10
> threads have to get in line, and one by one, write their batch of Puts to