RE: 0.92 and Read/writes not scaling

We've been doing some similar performance testing on our 50-node
cluster running 0.92.1 and CDH3U3.

We were looking mostly at reads, but observed similar behavior. HBase
wasn't particularly busy, but we couldn't make it go faster.

After some debugging, we found that many (sometimes most) of our
responses from HBase took 20 or 40 ms to return. It was kind of
interesting to watch: we'd ask for the same row over and over, and it
would return in either 0 ms, 20 ms, or 40 ms.
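
A rough sketch of how this kind of pattern can be spotted: time the same
request many times and bucket the latencies to the nearest 20 ms. This is
plain Python, not our actual test harness; `fetch` is a hypothetical
stand-in for whatever call reads the row, and both helper names are made
up for illustration.

```python
import time
from collections import Counter


def time_calls_ms(fetch, n):
    """Call fetch() n times and return each call's wall-clock latency in ms.

    `fetch` is a hypothetical zero-argument callable (e.g. a wrapped Get).
    """
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        fetch()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def bucket_latencies(samples_ms, width_ms=20):
    """Round each latency to the nearest multiple of width_ms and count them."""
    counts = Counter(int(round(s / width_ms)) * width_ms for s in samples_ms)
    return dict(sorted(counts.items()))


# Latencies clustering at 0, 20 and 40 ms show up as three distinct buckets.
print(bucket_latencies([0.4, 19.8, 21.0, 40.2, 0.1]))  # {0: 2, 20: 2, 40: 1}
```

If nearly all samples land in a few evenly spaced buckets rather than
spreading out, the delay is more likely a fixed timer somewhere in the
path than real work on the server.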

Looking around, we found some related JIRA issues.

We added settings to our config to disable Nagle's algorithm.

For us, setting these two got rid of all of the 20 and 40 ms response
times and cut the average response time we measured from HBase by
more than half. Plus, we can push HBase a lot harder.
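
For anyone unfamiliar with what disabling Nagle means at the socket
level, here is a minimal Python sketch. It is not HBase code and does not
show the HBase config properties; the helper names are made up for
illustration. As far as I understand, settings like these end up setting
the standard TCP_NODELAY option on the connection, which is all this
demonstrates.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled (TCP_NODELAY set)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # With TCP_NODELAY set, small writes are sent immediately instead of
    # being held back while earlier data is still unacknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock):
    """Return True if TCP_NODELAY is set on the given socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    s = make_nodelay_socket()
    print("Nagle disabled:", nagle_disabled(s))
    s.close()
```

The option has to be set on each connection, which would explain why
there is one setting for the client side and one for the server side.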



-----Original Message-----
From: Juhani Connolly [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 28, 2012 4:27 AM
Subject: Re: 0.92 and Read/writes not scaling

I think there is a lot of stuff in this thread and the situation has
changed a bit, so I'd like to summarize the current situation and verify
a few things.
Our current environment:
- CDH 4b1: hdfs 0.23 and hbase 0.92
- Separate master and namenode, 64 GB and 24 cores each, colocated with
zookeepers (third zookeeper on a separate, unshared server)
- 11 datanode/regionservers, 24 cores, 64 GB, 4 x 1.5 TB disks (should
become a bottleneck but isn't yet)
- Table is split into approx. 300 regions and is balanced at 25-35
regions/server, using snappy compression. Unless otherwise mentioned,
delayed flushing is disabled

The current problem:
- Flushed writes seem slow compared to our previous setup (which was the
same but using hdfs 0.20.2).
  - Hardware usage is poor, with no visible hardware bottlenecks (this was
also the case with our old setup)

- Tested with YCSB, PerformanceEvaluation, an application-specific
throughput test, and a generic testing solution (attaching a simplified
version that includes the core issues and works standalone)
- On our hdfs 0.20.2 setup, we were getting throughput of 40,000
writes/sec (128-256 bytes each), or higher if we delayed log flushes,
used batch puts, or similar.
- On our new setup, we are getting about 15,000 wps. If we use the
non-flushing setup (-t writeunflushed in the attached test), however, we
can easily push 10 times that
- That hardware is not the bottleneck is evidenced by ganglia, top,
iostat -d, iperf and a number of others.
- We tested append speed with DFSIOTest using 256 byte entries and 10
files, giving us a throughput of 64 MB/s (about 250,000 entries per
second in theory, then), so WAL writes really should be able to keep up
with a lot of throughput?
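
The arithmetic behind that "250,000 entries per second" estimate, as a
small Python sketch. It assumes the 64 MB figure is per second and that
each entry is exactly 256 bytes with no per-entry overhead, which is
optimistic; the function name is made up for illustration.

```python
def entries_per_second(throughput_bytes_per_s, entry_bytes):
    """Theoretical entries/sec if raw append throughput were the only limit."""
    return throughput_bytes_per_s / entry_bytes


# 64 MB/s read as decimal megabytes vs. binary mebibytes.
decimal_estimate = entries_per_second(64 * 1000 * 1000, 256)
binary_estimate = entries_per_second(64 * 1024 * 1024, 256)

print(decimal_estimate)  # 250000.0
print(binary_estimate)   # 262144.0

# Headroom over the roughly 15,000 writes/sec measured on the new setup.
print(round(decimal_estimate / 15000, 1))  # 16.7
```

Either way the raw append rate is more than an order of magnitude above
what we see through HBase.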

One doubt:
- While we are fairly confident this is not the case, the only thing I
could think of is that autoFlush was off for our tests with 0.20.2. We
used the same test program on both versions, and it is only today that I
explicitly set it to off (so until now it has been running on the
default). We never set the write buffer size.

What I'd like to know:
- What kind of throughput are people getting on data that is fully
AutoFlushed (so every entry is sent to the WAL as table.put() is called)?
Are our figures (a bit over 1000 per node) normal? Or should we be
expecting the figures (4-5000 per sec per node) that we were getting on
hdfs 0.20.2?
- Do people normally see their hardware get anywhere near maxing out on
heavy write load?
- Is there something wrong with the way we are testing?
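
For reference, the per-node figures follow from dividing the cluster-wide
numbers above by the 11 regionservers. A quick Python sketch of that
division (the function name is made up for illustration, and it assumes
the load is spread evenly across servers):

```python
REGION_SERVERS = 11  # datanode/regionservers in the environment described above


def per_node(cluster_writes_per_s, nodes=REGION_SERVERS):
    """Average writes/sec per regionserver, assuming an even spread."""
    return cluster_writes_per_s / nodes


print(round(per_node(15000)))  # 1364, new setup: "a bit over 1000 per node"
print(round(per_node(40000)))  # 3636, old setup at its 40,000 writes/sec baseline
```

The old setup only reaches the 4-5000 per node range at cluster-wide
rates of 44,000 to 55,000 writes/sec, i.e. the "or higher" cases.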

On 03/27/2012 12:18 PM, Juhani Connolly wrote: