Baugher,Bryan 2012-12-28, 17:40
Ted Yu 2012-12-28, 18:14
Baugher,Bryan 2012-12-28, 18:41
I was talking about the server which was anonymized:
On Fri, Dec 28, 2012 at 10:41 AM, Baugher,Bryan <[EMAIL PROTECTED]> wrote:
> On 12/28/12 12:14 PM, "Ted Yu" <[EMAIL PROTECTED]> wrote:
> >Looks like there was a socket timeout:
> >java.net.SocketTimeoutException: 60000 millis timeout while waiting for
> >channel to be ready for read. ch :
> >java.nio.channels.SocketChannel[connected local=/***:39752
> >Have you collected / checked the GC log on the server referenced above?
> I am not sure exactly which server you are referring to. For the
> application server we don't currently collect gc logs. For hbase we do but
> the gc logs were truncated recently and won't help.
> >BTW Have you considered deploying 0.92.2 in your cluster ?
> Not really. We have stuck with Cloudera's distribution for a couple of years
> now and I don't really see us going down that track.
> >Thanks, glad to see Cerner using HBase.
> >On Fri, Dec 28, 2012 at 9:40 AM, Baugher,Bryan
> ><[EMAIL PROTECTED]> wrote:
> >> Hi everyone,
> >> For the past month or so we have noticed that some of our applications
> >> become frozen about once a day and need to be restarted in order to bring
> >> them back. We eventually figured out that it was caused by/happening during
> >> major compactions.
> >> We have automated major compactions disabled and are running them manually
> >> on each table sequentially each day starting at 4am. We are running on
> >> CDH4.1.1 (Hbase Version : 0.92.1-cdh4.1.1). Interestingly enough this is
> >> only happening in our dev environment with each region server serving
> >> regions.
> >> The HBase logs show that the compactions are occurring, and we see
> >> this warning repeatedly while the compactions are occurring:
> >> WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call
> >> getHTableDescriptors(), rpc version=1, client version=29,
> >> methodsFingerPrint=400804878 from ***: output error
> >> Looking at our application logs we often see this error or a
> >> I took a thread dump of our application while it was locked and saw that
> >> nearly all of the threads in the application were blocked by a single
> >> thread that was waiting on HBaseClient$Call.
> >>  - http://pastebin.com/P4skndEg
> >>  - http://pastebin.com/YLZn3SRK
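For anyone diagnosing a similar hang: the pattern described in the original message (nearly all threads blocked behind a single thread) can also be checked from inside the JVM, not only from a jstack-style thread dump. Below is a minimal sketch that uses only the JDK's java.lang.management API. It is not taken from the application or from HBase; the class name BlockedThreadReport, the demo() helper, and the thread names "holder" and "waiter-N" are hypothetical and exist only for illustration. It groups every BLOCKED thread under the name of the thread that owns the monitor it is waiting for.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CountDownLatch;

public class BlockedThreadReport {

    /** Maps lock-owner thread name -> names of threads BLOCKED on a monitor it holds. */
    public static Map<String, List<String>> blockedByOwner() {
        Map<String, List<String>> byOwner = new TreeMap<>();
        // false, false: skip locked-monitor and synchronizer details; the lock
        // owner of a blocked thread is reported regardless of these flags.
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : infos) {
            if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
                continue;
            }
            String owner = info.getLockOwnerName();
            byOwner.computeIfAbsent(owner == null ? "<unknown>" : owner, k -> new ArrayList<>())
                   .add(info.getThreadName());
        }
        return byOwner;
    }

    /** Hypothetical reproduction: one thread holds a lock, two others block on it. */
    public static Map<String, List<String>> demo() throws InterruptedException {
        final Object lock = new Object();
        final CountDownLatch lockHeld = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            synchronized (lock) {
                lockHeld.countDown();
                try {
                    Thread.sleep(60_000); // stands in for a long wait on a remote call
                } catch (InterruptedException e) {
                    // interrupted by demo() once the report has been taken
                }
            }
        }, "holder");
        holder.setDaemon(true);
        holder.start();
        lockHeld.await(); // make sure the holder owns the lock first

        List<Thread> waiters = new ArrayList<>();
        for (int i = 0; i < 2; i++) {
            Thread t = new Thread(() -> { synchronized (lock) { } }, "waiter-" + i);
            t.setDaemon(true);
            t.start();
            waiters.add(t);
        }
        // Wait until both waiters are actually blocked on the monitor.
        for (Thread t : waiters) {
            while (t.getState() != Thread.State.BLOCKED) {
                Thread.sleep(10);
            }
        }

        Map<String, List<String>> report = blockedByOwner();
        holder.interrupt(); // release the lock so the waiters can finish
        return report;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```

In a situation like the one described in this thread, one would expect a single owner (here, presumably the thread waiting on HBaseClient$Call) to show up with most of the other application threads listed under it. Calling blockedByOwner() periodically from a monitoring thread is one way to catch that state without waiting for someone to take a manual dump.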
Baugher,Bryan 2012-12-28, 19:39