HBase >> mail # dev >> HBase 0.94.15: writes stalls periodically even under moderate steady load (AWS EC2)


Re: HBase 0.94.15: writes stalls periodically even under moderate steady load (AWS EC2)
It would help if you could provide some thread dumps taken while the stalls are happening.

Thanks,
Liang
________________________________________
From: Vladimir Rodionov [[EMAIL PROTECTED]]
Sent: January 16, 2014 13:49
To: [EMAIL PROTECTED]; lars hofhansl
Subject: Re: HBase 0.94.15: writes stalls periodically even under moderate steady load (AWS EC2)

It's not IO, CPU or network - it's HBase. The stalls repeat periodically. Is
there any particular message in the log file I should look for?
On Wed, Jan 15, 2014 at 9:17 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> So where's the bottleneck? You say it's not IO, nor is it CPU, I presume.
> Network? Are the writers blocked because there are too many storefiles?
> (in which case you maxed out your storage IO)
> Are you hotspotting a region server?
>
> From the stacktrace it looks like ycsb is doing single puts, each
> incurring an RPC. You're testing AWS' network :)
>
>
> I write 10-20k (small) rows per second in bulk on a single box for testing
> all the time.
> With 3-way replication a 5-node cluster is pretty puny. Each box will get
> 60% of each write on average, just to state the obvious.
>
> As I said, if it's slow, I'd love to see where the bottleneck is, so that
> we can fix it, if it is something we can fix in HBase.
>
> -- Lars
>
>
>
> ________________________________
>  From: Vladimir Rodionov <[EMAIL PROTECTED]>
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Sent: Wednesday, January 15, 2014 5:32 PM
> Subject: Re: HBase 0.94.15: writes stalls periodically even under moderate
> steady load (AWS EC2)
>
>
> Yes, I am using ephemeral (local) storage. I found that iostat is most of
> the time idle on 3K load with periodic bursts up to 10% iowait. 3-4K is
> probably the maximum this skinny cluster can sustain w/o additional
> configuration tweaking. I will try more powerful instances, of course, but
> the beauty of m1.xlarge is its $0.05 per hour price on the spot market. A
> 5-node cluster (+1) is ~ $7 per day. Good for experiments, but definitely not for real
> testing.
>
> -Vladimir Rodionov
>
>
>
> On Wed, Jan 15, 2014 at 3:27 PM, Andrew Purtell <[EMAIL PROTECTED]>
> wrote:
>
> > Also I assume your HDFS is provisioned on locally attached disk, aka
> > instance store, and not EBS?
> >
> >
> > On Wed, Jan 15, 2014 at 3:26 PM, Andrew Purtell <[EMAIL PROTECTED]>
> > wrote:
> >
> > > m1.xlarge is a poorly provisioned instance type, with low PPS at the
> > > network layer. Can you try a type advertised to have "high" I/O
> > > performance?
> > >
> > >
> > > On Wed, Jan 15, 2014 at 12:33 PM, Vladimir Rodionov <
> > > [EMAIL PROTECTED]> wrote:
> > >
> > >> This is something which needs to be definitely solved/fixed/resolved
> > >>
> > >> I am running YCSB benchmark on aws ec2 on a small HBase cluster
> > >>
> > >> 5 (m1.xlarge) as RS
> > > >> 1 (m1.xlarge) hbase-master, zookeeper
> > >>
> > >> Whirr 0.8.2 (with many hacks) is used to provision HBase.
> > >>
> > >> I am running 1 ycsb client (100% insert ops) throttled at 5K ops:
> > >>
> > >> ./bin/ycsb load hbase -P workloads/load20m -p columnfamily=family -s
> > >> -threads 10 -target 5000
> > >>
> > >> OUTPUT:
> > >>
> > >> 1120 sec: 5602339 operations; 4999.7 current ops/sec; [INSERT
> > >> AverageLatency(us)=225.53]
> > >>  1130 sec: 5652117 operations; 4969.35 current ops/sec; [INSERT
> > >> AverageLatency(us)=203.31]
> > >>  1140 sec: 5665210 operations; 1309.04 current ops/sec; [INSERT
> > >> AverageLatency(us)=17.13]
> > >>  1150 sec: 5665210 operations; 0 current ops/sec;
> > >>  1160 sec: 5665210 operations; 0 current ops/sec;
> > >>  1170 sec: 5665210 operations; 0 current ops/sec;
> > >>  1180 sec: 5665210 operations; 0 current ops/sec;
> > >>  1190 sec: 5665210 operations; 0 current ops/sec;
> > >> 2014-01-15 15:19:34,139 Thread-2 WARN
> > >>  [HConnectionManager$HConnectionImplementation] Failed all from
> > >>
> >
> region=usertable,user6039,1389811852201.40518862106856d23b883e5d543d0b89.,
> > >> hostname=ip-10-45-174-120.ec2.internal, port=60020
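
To line thread dumps or region server logs up with the stalls, it may help to pull the stall windows out of the YCSB status output first. The following is a minimal sketch in Python, not part of YCSB or HBase. It assumes status lines shaped like the ones quoted above ("N sec: M operations; X current ops/sec"). The names `parse_status` and `find_stalls` and the `min_reports` threshold are made up for illustration.

```python
import re

# Matches the status part of a YCSB progress line, e.g.
# "1140 sec: 5665210 operations; 1309.04 current ops/sec; [INSERT ..."
# re.search is used so that mail quote prefixes ("> > > >>") are tolerated.
STATUS_RE = re.compile(r"(\d+) sec: (\d+) operations; ([\d.]+) current ops/sec")


def parse_status(lines):
    """Return a list of (elapsed_sec, total_ops, ops_per_sec) tuples."""
    samples = []
    for line in lines:
        m = STATUS_RE.search(line)
        if m:
            samples.append((int(m.group(1)), int(m.group(2)), float(m.group(3))))
    return samples


def find_stalls(samples, min_reports=2):
    """Return (first_sec, last_sec) for each run of zero-throughput reports.

    min_reports is a hypothetical threshold: runs shorter than this many
    consecutive zero reports are ignored.
    """
    stalls = []
    run = []
    for sec, _total_ops, rate in samples:
        if rate == 0.0:
            run.append(sec)
            continue
        if len(run) >= min_reports:
            stalls.append((run[0], run[-1]))
        run = []
    # Close a stall that is still open at the end of the output.
    if len(run) >= min_reports:
        stalls.append((run[0], run[-1]))
    return stalls


# Status lines copied from the YCSB output quoted in this thread.
SAMPLE = """\
1120 sec: 5602339 operations; 4999.7 current ops/sec; [INSERT
1130 sec: 5652117 operations; 4969.35 current ops/sec; [INSERT
1140 sec: 5665210 operations; 1309.04 current ops/sec; [INSERT
1150 sec: 5665210 operations; 0 current ops/sec;
1160 sec: 5665210 operations; 0 current ops/sec;
1170 sec: 5665210 operations; 0 current ops/sec;
1180 sec: 5665210 operations; 0 current ops/sec;
1190 sec: 5665210 operations; 0 current ops/sec;
"""

if __name__ == "__main__":
    for start, end in find_stalls(parse_status(SAMPLE.splitlines())):
        print(f"stall from {start}s to {end}s of the run")
```

The elapsed seconds are relative to the start of the YCSB run, so they need to be added to the run's start time before comparing them with timestamps in the region server logs or in thread dumps (for example ones taken with `jstack`).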