HBase >> mail # dev >> HBase 0.94.15: writes stall periodically even under moderate steady load (AWS EC2)


Re: HBase 0.94.15: writes stall periodically even under moderate steady load (AWS EC2)
I need to apologize and clarify this statement…

First, running benchmarks on AWS is OK if you’re attempting to get a rough idea of how HBase will perform on a certain class of machines and you’re comparing m1.large to m1.xlarge or m3.xlarge, so that you can get a rough sense of scale for sizing.

However, in this thread, you’re talking about trying to figure out why a certain mechanism isn’t working.

You’re trying to track down why writes stall while working in a virtualized environment where you have no control over the machines, the network, or the storage.

Also, when you run the OS on a virtual machine, there are going to be ‘anomalies’ that you can’t explain, because the guest OS can only report what it sees, not what may be happening underneath at the hypervisor level.

So you may see a problem, but will never be able to find the cause.
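
One counter the guest does expose is CPU steal time, which is about the only visibility you get into hypervisor contention. A minimal sketch of how to check it, assuming a standard Linux guest with the sysstat package installed (nothing here is specific to this cluster):

    # Steal time ("st" / %steal) is CPU time the hypervisor gave to other tenants;
    # a sustained non-zero value points at a noisy neighbor the guest can't otherwise see.
    vmstat 1 10               # the last column, "st", is steal time
    iostat -c 1 10            # %steal appears in the avg-cpu summary
    top -b -n 1 | head -5     # the Cpu(s) line also reports "st"
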
On Jan 17, 2014, at 5:55 AM, Michael Segel <[EMAIL PROTECTED]> wrote:

> Guys,
>
> Trying to benchmark on AWS is a waste of time. You end up chasing ghosts.
> If you want to benchmark, you need to isolate your systems to reduce extraneous factors.
>
> You need real hardware and a real network in a controlled environment.
>
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
>> On Jan 16, 2014, at 12:34 PM, "Bryan Beaudreault" <[EMAIL PROTECTED]> wrote:
>>
>> This might be better on the user list? Anyway..
>>
>> How many IPC handlers are you giving it?  m1.xlarge has very little CPU.  Not
>> only does it have only 4 cores (more cores allow more concurrent threads with
>> less context switching), but those cores are severely underpowered.  I
>> would recommend at least c1.xlarge, which is only a bit more expensive.  If
>> you happen to be doing heavy GC, with 1-2 compactions running, and with
>> many writes incoming, you quickly use up quite a bit of CPU.  What are
>> the load and CPU usage on 10.38.106.234:50010?
>>
>> Did you see anything about blocking updates in the hbase logs?  How much
>> memstore are you giving?
>>
>>
>>> On Thu, Jan 16, 2014 at 1:17 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>>>
>>> On Wed, Jan 15, 2014 at 5:32 PM,
>>> Vladimir Rodionov <[EMAIL PROTECTED]> wrote:
>>>
>>>> Yes, I am using ephemeral (local) storage. I found that iostat is idle most
>>>> of the time at 3K load, with periodic bursts up to 10% iowait.
>>>
>>> Ok, sounds like the problem is higher up the stack.
>>>
>>> I see in later emails on this thread a log snippet that shows an issue with
>>> the WAL writer pipeline: one of the datanodes is slow, sick, or partially
>>> unreachable. If you have uneven point-to-point ping times among your
>>> cluster instances, or periodic packet loss, it might still be AWS's fault;
>>> otherwise I wonder why the DFSClient says a datanode is sick.
>>>
>>> --
>>> Best regards,
>>>
>>>  - Andy
>>>
>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>> (via Tom White)
>>>
>
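
For reference, on Bryan's questions about IPC handler count and memstore sizing: these are the hbase-site.xml knobs involved. A minimal sketch for 0.94 with illustrative values only; they are placeholders, not recommendations for this cluster:

    <!-- Goes inside <configuration> in hbase-site.xml on each RegionServer. -->
    <!-- RPC/IPC handler threads per RegionServer; the 0.94 default is 10. -->
    <property>
      <name>hbase.regionserver.handler.count</name>
      <value>30</value>
    </property>
    <!-- Fraction of the heap all memstores may occupy before writes block (default 0.4). -->
    <property>
      <name>hbase.regionserver.global.memstore.upperLimit</name>
      <value>0.4</value>
    </property>
    <!-- Per-region memstore flush threshold, in bytes (default 128 MB). -->
    <property>
      <name>hbase.hregion.memstore.flush.size</name>
      <value>134217728</value>
    </property>
    <!-- Updates block once a region's memstore reaches multiplier * flush.size (default 2). -->
    <property>
      <name>hbase.hregion.memstore.block.multiplier</name>
      <value>2</value>
    </property>

The "Blocking updates" messages Bryan asks about are typically logged when a region hits the block.multiplier threshold or the server hits the global upperLimit, so correlating those log lines with the stall timestamps is a good first check.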
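
And on Andy's point about uneven point-to-point ping times: a quick way to check is a round of pings from each RegionServer to its peers and DataNodes. A minimal sketch, where the host list is a placeholder for the actual cluster instances:

    # Placeholder host list -- substitute the private IPs/hostnames of the
    # DataNodes and RegionServers in the cluster.
    for h in dn1 dn2 dn3; do
      echo "== $h =="
      ping -c 20 -q "$h"    # look for packet loss and a large rtt mdev (jitter)
    done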