HBase, mail # user - What could cause HBase's writes slower than reads?


Re: What could cause HBase's writes slower than reads?
yun peng 2012-11-04, 15:08
Yes Mohit, that's the cause! HBase was reading values not present in the data
table, which leads to very fast read performance. When I changed it to read
valid values in the data table, it was back to normal (reads slower than writes).
Thank you a lot for the discussion.
Yun
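
A read benchmark that mostly requests missing rows can look much faster than it should, as described above. One way to catch this early is to time hits and misses separately. The following is only a minimal sketch in Java under stated assumptions: `HitMissTimer`, `Stats`, and `measure` are hypothetical names, and a `HashMap` stands in for the table so the example runs on its own. It is not the benchmark used in this thread. With the real HBase client, the lookup would wrap a `Get` and count a `Result` whose `isEmpty()` returns true as a miss.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class HitMissTimer {
    /** Counts and total elapsed nanoseconds, kept separately for hits and misses. */
    public static final class Stats {
        public long hits, misses, hitNanos, missNanos;

        /** Fraction of lookups that found a value; 0.0 when nothing was measured. */
        public double hitRatio() {
            long total = hits + misses;
            return total == 0 ? 0.0 : (double) hits / total;
        }
    }

    /**
     * Times each lookup and files it under "hit" or "miss".
     * The lookup function is a stand-in for a real read call (assumption):
     * it must return null when the key is absent.
     */
    public static <K, V> Stats measure(Iterable<K> keys, Function<K, V> lookup) {
        Stats s = new Stats();
        for (K key : keys) {
            long start = System.nanoTime();
            V value = lookup.apply(key);
            long elapsed = System.nanoTime() - start;
            if (value == null) {
                s.misses++;
                s.missNanos += elapsed;
            } else {
                s.hits++;
                s.hitNanos += elapsed;
            }
        }
        return s;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a table; keys and values are made up.
        Map<String, String> table = new HashMap<>();
        table.put("row1", "a");
        table.put("row2", "b");

        Stats s = measure(Arrays.asList("row1", "row2", "nope"), table::get);
        System.out.println("hits=" + s.hits + " misses=" + s.misses
                + " hitRatio=" + s.hitRatio());
    }
}
```

If the hit ratio of a read-only run is close to zero, the measured read latency says little about reads of existing data.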
On Sat, Nov 3, 2012 at 11:11 AM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:

> Some of the things to look at are disk I/O and CPU on the server. Look
> at CPU and Java thread dumps on the client. I use Ganglia to look at server
> stats and it is often very helpful.
>
> In my opinion, the best thing would be to add some code around the various
> HBase calls to see where time is being spent on the client side and then go
> from there.
>
> Do your reads read the same data set as your writes?
>
> On Sat, Nov 3, 2012 at 7:09 AM, yun peng <[EMAIL PROTECTED]> wrote:
>
> > Hi, the throughput for the write-only workload is 450 ops/sec and for the
> > read-only workload 900 ops/sec. I am using the same machine (1-core CPU,
> > 2G mem) for the client to drive the workload into hbase/hdfs... one thread
> > is used on the client side. For this workload, it looks like the client
> > should not be the bottleneck... Btw, is there any way to verify this?
> > Thanks,
> > Yun
> >
> > On Sat, Nov 3, 2012 at 1:04 AM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> >
> > > What load do you see on the system? I am wondering if the bottleneck is
> > > on the client side.
> > >
> > > On Fri, Nov 2, 2012 at 9:07 PM, yun peng <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi, All,
> > > > In my HBase cluster, I observed that Put() executes slower than Get().
> > > > Since HBase is optimized for writes, I wonder what may affect Put
> > > > performance in the distributed setting below.
> > > >
> > > > Hbase setup:
> > > > My HBase cluster has three nodes: one hosts ZooKeeper and the HMaster,
> > > > and two are slaves. The HBase cluster is attached to HDFS, which resides
> > > > on a separate cluster. The machines are fairly commodity or lower end,
> > > > with 2G memory and a 1-core CPU.
> > > >
> > > > Observed results:
> > > > I tested Put and Get latency on this setup and found that Put runs
> > > > slower than Get (which is a bit surprising to me). In case anyone is
> > > > interested, in my results Put() takes around 3000us and Get() only
> > > > 1000us (so I think Get does not touch disk).
> > > >
> > > > What could possibly slow down Put() and speed up Get() in HBase?
> > > > Does this possibly have to do with the distributed setting, e.g. a Put
> > > > needs to update multiple (duplicated) copies while a Get reads only one?
> > > > I am quite new to HBase internals and not familiar with the HBase
> > > > Put/Get code path. Has anyone here had similar experiences?
> > > >
> > > > Thanks,
> > > > Yun
> > > >
> > >
> >
>
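
The suggestion in the thread to "add some code around various HBase calls" to see where client-side time goes could be sketched roughly as follows. This is a minimal Java illustration, not code from the thread: `CallTimer`, `time`, and `microsPerOp` are hypothetical names, and the lambda in `main` is a placeholder where a real client call (for example a Put or a Get) would go. The `microsPerOp` helper assumes a single synchronous client thread, as in the setup described above, so that mean latency is simply the inverse of throughput; under that assumption 450 ops/sec is about 2.2 ms per operation and 900 ops/sec about 1.1 ms.

```java
import java.util.concurrent.Callable;

public class CallTimer {
    private long count;
    private long totalNanos;

    /** Runs the call, records its wall-clock duration, and returns its result. */
    public <T> T time(Callable<T> call) throws Exception {
        long start = System.nanoTime();
        try {
            return call.call();
        } finally {
            // Recorded even if the call throws, so failures are still counted.
            totalNanos += System.nanoTime() - start;
            count++;
        }
    }

    public long count() {
        return count;
    }

    /** Mean duration per recorded call in microseconds; 0.0 if nothing was recorded. */
    public double meanMicros() {
        return count == 0 ? 0.0 : totalNanos / 1000.0 / count;
    }

    /**
     * Converts throughput to mean latency per operation.
     * Assumes one client thread issuing calls one at a time.
     */
    public static double microsPerOp(double opsPerSecond) {
        return 1_000_000.0 / opsPerSecond;
    }

    public static void main(String[] args) throws Exception {
        CallTimer putTimer = new CallTimer();
        for (int i = 0; i < 1000; i++) {
            final int n = i;
            // Placeholder work; a real client call would go here (assumption).
            putTimer.time(() -> Integer.toString(n));
        }
        System.out.printf("calls=%d mean=%.2f us%n", putTimer.count(), putTimer.meanMicros());
        System.out.printf("450 ops/sec -> %.0f us/op%n", microsPerOp(450));
        System.out.printf("900 ops/sec -> %.0f us/op%n", microsPerOp(900));
    }
}
```

Keeping one timer per call type (one for writes, one for reads) makes it easy to compare the client-observed mean latency against the throughput figures reported by the benchmark.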