Re: Replication not suited for intensive write applications?
On Thu, Jun 20, 2013 at 7:12 PM, Varun Sharma <[EMAIL PROTECTED]> wrote:

> What is the ageOfLastShippedOp as reported on your master cluster region
> servers (it should be available through the /jmx endpoint)? It tells you the
> delay your edits are experiencing before being shipped. If this number is
> < 1000 (in milliseconds), I would say replication is doing a very good job.
> This is the most important metric worth tracking, and I would be interested
> in how it looks since we are also looking into using replication for
> write-heavy workloads...
>
ageOfLastShippedOp showed 10 min on 15 GB of inserted data. When I ran the
test with 50 GB, it showed 30 min. This was also easy to spot in Graphite:
I could see when the writeRequests count started increasing on the slave RS
and when it stopped, and thus measure the duration of the replication.

Although it is the single most important metric, I had to fire up JConsole
on the 3 master cluster RS, because when using hadoop-metrics.properties and
configuring a context for Graphite (or even a file), I discovered that if
there is/was a recovered queue of another RS, its ageOfLastShippedOp is
reported forever instead of the active queue's (since there isn't an
ageOfLastShippedOp metric per queue).
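
As a workaround, the /jmx servlet of each region server can also be polled
directly instead of attaching JConsole. The following is only a minimal
sketch, not taken from this setup: it assumes the default RS info port 60030,
uses placeholder host names, and simply regexes out every numeric
ageOfLastShippedOp value in the JSON that the servlet returns.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Polls the /jmx servlet of each region server and prints every
// ageOfLastShippedOp value found in the JSON response.
public class AgeOfLastShippedOpPoller {

    // Placeholder host names - replace with the master cluster RS hosts.
    private static final String[] REGION_SERVERS = {"rs1", "rs2", "rs3"};
    private static final int INFO_PORT = 60030; // assumed default RS info port

    private static final Pattern AGE_PATTERN =
            Pattern.compile("\"ageOfLastShippedOp\"\\s*:\\s*(\\d+)");

    public static void main(String[] args) throws Exception {
        for (String host : REGION_SERVERS) {
            URL url = new URL("http://" + host + ":" + INFO_PORT + "/jmx");
            StringBuilder body = new StringBuilder();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream()));
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
            in.close();
            Matcher m = AGE_PATTERN.matcher(body);
            while (m.find()) {
                System.out.println(host + " ageOfLastShippedOp=" + m.group(1) + " ms");
            }
        }
    }
}

If a recovered queue is present you will see more than one value per host,
which is exactly the ambiguity described above.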
> The network usage on your 2nd cluster could be lower because replication
> ships edits in batches - so the batching could be amortizing the amount of
> data sent over the wire. Also, when you are measuring traffic, are you
> measuring the traffic on the NIC, which will also include traffic due to
> HDFS replication?
>
My NIC/ethernet measurement is quite simple. I ran "netstat -ie", which gives
a total counter of bytes, both Receive and Transmit, for my interface (eth0).
Running it before and after gives the total number of bytes transferred. I
also know the duration of the replication work by watching the
writeRequestsCount metric settle on the slave RS, so I can calculate the
throughput: 15 GB / 14 min.
Regarding your question - yes, it has to include all traffic on the card,
which probably includes HDFS replication. There's not much I can do about
that, though.
We should note that network capacity is not the issue: I measured 30 MB/sec
Receive and 20 MB/sec Transmit, which is far from the measured maximum
bandwidth of 111 MB/sec (measured by running nc - netcat).
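
For completeness, the same before/after counter measurement can be scripted.
Below is a minimal sketch that reads the byte counters from /proc/net/dev;
the interface name (eth0) and the Linux counter layout are assumptions, so
adjust for your environment. As a sanity check on the numbers above, 15 GB
over 14 minutes works out to roughly 18 MB/sec, still well below the
111 MB/sec the link can do.

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Samples the RX/TX byte counters of eth0 twice and prints the average
// throughput over the window - the same idea as running "netstat -ie"
// before and after the replication run.
public class NicThroughput {

    private static final String IFACE = "eth0"; // assumed interface name

    // Returns {rxBytes, txBytes} for the interface from /proc/net/dev.
    private static long[] readCounters() throws IOException {
        List<String> lines =
                Files.readAllLines(Paths.get("/proc/net/dev"), StandardCharsets.UTF_8);
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.startsWith(IFACE + ":")) {
                String[] f = trimmed.substring(IFACE.length() + 1).trim().split("\\s+");
                return new long[] {Long.parseLong(f[0]), Long.parseLong(f[8])};
            }
        }
        throw new IOException("interface not found: " + IFACE);
    }

    public static void main(String[] args) throws Exception {
        long seconds = args.length > 0 ? Long.parseLong(args[0]) : 60;
        long[] before = readCounters();
        Thread.sleep(seconds * 1000);
        long[] after = readCounters();
        double rx = (after[0] - before[0]) / (1024.0 * 1024.0 * seconds);
        double tx = (after[1] - before[1]) / (1024.0 * 1024.0 * seconds);
        System.out.printf("Receive: %.1f MB/sec, Transmit: %.1f MB/sec%n", rx, tx);
    }
}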
>
> On Thu, Jun 20, 2013 at 3:46 AM, Asaf Mesika <[EMAIL PROTECTED]>
> wrote:
>
> > Hi,
> >
> > I've been conducting lots of benchmarks to test the maximum throughput of
> > replication in HBase.
> >
> > I've come to the conclusion that HBase replication is not suited for
> > write-intensive applications. I hope that people here can show me where
> > I'm wrong.
> >
> > *My setup*
> > *Cluster* (Master and slave are alike)
> > 1 Master, NameNode
> > 3 RS, DataNode
> >
> > All computers are the same: 8 cores x 3.4 GHz, 8 GB RAM, 1 Gigabit
> > ethernet card
> >
> > I insert data into HBase from a Java process (client) reading files from
> > disk, running on the same machine as the HBase Master in the master
> > cluster.
> >
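
For reference, a writer of that shape could look like the minimal sketch
below, using the 0.94-era client API (HTable/Put). This is not the actual
benchmark code: the table name, column family, row-key scheme and the 1 KB
dummy payload are placeholders, and each thread opens its own HTable since
HTable is not thread-safe.

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// Minimal multi-threaded writer: each thread gets its own HTable and
// pushes Puts as fast as it can.
public class PutLoadClient {

    private static final int THREADS = 10;
    private static final int ROWS_PER_THREAD = 100000;        // placeholder row count
    private static final byte[] FAMILY = Bytes.toBytes("d");  // placeholder family
    private static final byte[] QUALIFIER = Bytes.toBytes("q");
    private static final byte[] VALUE = new byte[1024];       // 1 KB dummy payload

    public static void main(String[] args) throws Exception {
        final Configuration conf = HBaseConfiguration.create();
        List<Thread> threads = new ArrayList<Thread>();
        for (int t = 0; t < THREADS; t++) {
            final int threadId = t;
            Thread worker = new Thread(new Runnable() {
                public void run() {
                    try {
                        HTable table = new HTable(conf, "benchmark_table");
                        table.setAutoFlush(false);  // buffer puts client-side
                        for (int i = 0; i < ROWS_PER_THREAD; i++) {
                            Put put = new Put(Bytes.toBytes(threadId + "-" + i));
                            put.add(FAMILY, QUALIFIER, VALUE);
                            table.put(put);
                        }
                        table.flushCommits();
                        table.close();
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            });
            threads.add(worker);
            worker.start();
        }
        for (Thread worker : threads) {
            worker.join();
        }
    }
}

Disabling autoFlush buffers puts client-side, which is a common way to push
write throughput up in this kind of benchmark.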
> > *Benchmark Results*
> > When the client writes with 10 threads, the master cluster writes at
> > 17 MB/sec, while the replicated cluster writes at 12 MB/sec. The data
> > size I wrote is 15 GB, all Puts, to two different tables.
> > Both clusters, when tested independently without replication, achieved a
> > write throughput of 17-19 MB/sec, so evidently the replication process is
> > the bottleneck.
> >
> > I also tested connectivity between the two clusters using "netcat" and
> > achieved 111 MB/sec.
> > I've checked the usage of the network cards on the client, the master
> > cluster region servers and the slave region servers. No computer went over
> > 30 MB/sec in Receive or Transmit.
> > The way I checked was rather crude but works: I've run "netstat -ie" before
> > HBase in the master cluster starts writing and after it finishes. The same
> > was done on the replicated cluster (when the replication started and