|
|
-
Re: Coprocessor execution - HBaseServer: (responseTooSlow) java.nio.channels.ClosedChannelException
Stack 2011-09-23, 16:29
Is it doing a long GC? Is the aggregation taking longer than the client 60 second timeout (You are setting lease below but I don't think that will have an effect since it server-side)?
St.Ack On Fri, Sep 23, 2011 at 2:52 AM, Mayuresh <[EMAIL PROTECTED]> wrote: > Hi, > > I am using the AggregationClient to perform aggregate calculation on a > 10 M row hbase table which is spread across around 27 regions. > Currently all the regions are on a single region server. > > Here is briefly what the code is doing: > Leases leases = new Leases(40 * 60 * 1000, 5 * 1000); > leases.start(); > long startTime = System.currentTimeMillis(); > System.out.println("Start time: " + startTime); > Configuration conf = new Configuration(true); > AggregationClient aClient = new AggregationClient(conf); > > Scan scan = new Scan(); > scan.addColumn(TEST_FAMILY, TEST_QUALIFIER); > final ColumnInterpreter<Long, Long> ci = new LongColumnInterpreter(); > > double avg = aClient.avg(TEST_TABLE, ci, > scan); > leases.close(); > long endTime = System.currentTimeMillis(); > System.out.println("End time: " + endTime); > System.out.println("Avg is: " + avg); > System.out.println("Time taken is: " + (endTime - startTime)); > > However it is failing after some time with the following log messages > in the region server log: > 2011-09-23 14:21:25,649 DEBUG > org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for > region Test,serv3-1315575650141,1316682342728.b3fe66994b0f2fb0643c88f6897474a1. > 2011-09-23 14:21:25,650 DEBUG > org.apache.hadoop.hbase.io.hfile.HFileReaderV2: On close of file > 15761056997240226 evicted 344 block(s) > 2011-09-23 14:21:25,650 DEBUG > org.apache.hadoop.hbase.regionserver.Store: closed data > 2011-09-23 14:21:25,650 INFO > org.apache.hadoop.hbase.regionserver.HRegion: Closed > Test,serv3-1315575650141,1316682342728.b3fe66994b0f2fb0643c88f6897474a1. > 2011-09-23 14:21:25,650 DEBUG > org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: > Closed region Test,serv3-1315575650141,1316682342728.b3fe66994b0f2fb0643c88f6897474a1. > 2011-09-23 14:21:25,651 WARN org.apache.hadoop.ipc.HBaseServer: IPC > Server Responder, call execCoprocessor([B@6b6ab732, > getAvg(org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter@2b216ab6, > {"timeRange":[0,9223372036854775807],"batch":-1,"startRow":"serv1-1315571539741","stopRow":"serv5-1315575834916","totalColumns":0,"cacheBlocks":true,"families":{"data":[]},"maxVersions":1,"caching":-1})) > from 137.72.240.177:47563: output error > 2011-09-23 14:21:25,651 WARN org.apache.hadoop.ipc.HBaseServer: IPC > Server handler 9 on 60020 caught: > java.nio.channels.ClosedChannelException > at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133) > at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324) > at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1501) > at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:876) > at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:955) > at org.apache.hadoop.hbase.ipc.HBaseServer$Call.sendResponseIfReady(HBaseServer.java:390) > at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1240) > > And the following in the zookeeper log: > 2011-09-23 14:19:50,978 INFO > org.apache.zookeeper.server.NIOServerCnxn: Established session > 0x1329577cad70004 with negotiated timeout 180000 for client > /137.72.240.177:45431 > 2011-09-23 14:21:11,028 WARN > org.apache.zookeeper.server.NIOServerCnxn: EndOfStreamException: > Unable to read additional data from client sessionid > 0x1329577cad70004, likely client has closed socket > 2011-09-23 14:21:11,029 INFO > org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection > for client /137.72.240.177:45431 which had sessionid 0x1329577cad70004
+
Stack 2011-09-23, 16:29
-
Re: Coprocessor execution - HBaseServer: (responseTooSlow) java.nio.channels.ClosedChannelException
Mayuresh 2011-09-26, 08:56
Hi Ted, Yes I am aware that this isn't a good setup. I am working on understanding the coprocessors before I can use a bigger setup. I am talking about the prez at https://hbase.s3.amazonaws.com/hbase/HBase-CP-HUG10.pdf by Andrew Purtell, slide 5 : <quote> For long running jobs the client must periodically poll status to keep it alive; jobs without interest will be cancelled </quote> On Mon, Sep 26, 2011 at 2:08 PM, Ted Yu <[EMAIL PROTECTED]> wrote: > Looks like my response to user@ got bounced. > > Himanshu is continuing the work on HBASE-3607 > >>> Currently all the regions are on a single region server. > I don't think the above is a good setup. > >>> It was mentioned to poll the job status periodically to avoid timeouts. > Can you tell us more about the presentation ? Normally you can include the > author in the cc list of this email chain. > > Cheers > > On Mon, Sep 26, 2011 at 12:44 AM, Mayuresh <[EMAIL PROTECTED]> > wrote: >> >> Yes the computation takes longer than 60 secs. Will setting lease >> timeout in hbase-site.xml help? >> >> In one of the coprocessor presentations. It was mentioned to poll the >> job status periodically to avoid timeouts. Any idea how to do that? >> >> On Sep 23, 2011 9:59 PM, "Stack" <[EMAIL PROTECTED]> wrote: >> > Is it doing a long GC? Is the aggregation taking longer than the >> > client 60 second timeout (You are setting lease below but I don't >> > think that will have an effect since it server-side)? >> > >> > St.Ack >> > >> > >> > On Fri, Sep 23, 2011 at 2:52 AM, Mayuresh >> > <[EMAIL PROTECTED]> wrote: >> >> Hi, >> >> >> >> I am using the AggregationClient to perform aggregate calculation on a >> >> 10 M row hbase table which is spread across around 27 regions. >> >> Currently all the regions are on a single region server. >> >> >> >> Here is briefly what the code is doing: >> >> Leases leases = new Leases(40 * 60 * 1000, 5 * 1000); >> >> leases.start(); >> >> long startTime = System.currentTimeMillis(); >> >> System.out.println("Start time: " + startTime); >> >> Configuration conf = new Configuration(true); >> >> AggregationClient aClient = new AggregationClient(conf); >> >> >> >> Scan scan = new Scan(); >> >> scan.addColumn(TEST_FAMILY, TEST_QUALIFIER); >> >> final ColumnInterpreter<Long, Long> ci = new >> >> LongColumnInterpreter(); >> >> >> >> double avg = aClient.avg(TEST_TABLE, ci, >> >> scan); >> >> leases.close(); >> >> long endTime = System.currentTimeMillis(); >> >> System.out.println("End time: " + endTime); >> >> System.out.println("Avg is: " + avg); >> >> System.out.println("Time taken is: " + (endTime - startTime)); >> >> >> >> However it is failing after some time with the following log messages >> >> in the region server log: >> >> 2011-09-23 14:21:25,649 DEBUG >> >> org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for >> >> region >> >> Test,serv3-1315575650141,1316682342728.b3fe66994b0f2fb0643c88f6897474a1. >> >> 2011-09-23 14:21:25,650 DEBUG >> >> org.apache.hadoop.hbase.io.hfile.HFileReaderV2: On close of file >> >> 15761056997240226 evicted 344 block(s) >> >> 2011-09-23 14:21:25,650 DEBUG >> >> org.apache.hadoop.hbase.regionserver.Store: closed data >> >> 2011-09-23 14:21:25,650 INFO >> >> org.apache.hadoop.hbase.regionserver.HRegion: Closed >> >> >> >> Test,serv3-1315575650141,1316682342728.b3fe66994b0f2fb0643c88f6897474a1. >> >> 2011-09-23 14:21:25,650 DEBUG >> >> org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: >> >> Closed region >> >> Test,serv3-1315575650141,1316682342728.b3fe66994b0f2fb0643c88f6897474a1. >> >> 2011-09-23 14:21:25,651 WARN org.apache.hadoop.ipc.HBaseServer: IPC >> >> Server Responder, call execCoprocessor([B@6b6ab732, >> >> >> >> getAvg(org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter@2b216ab6, >> >> >> >> {"timeRange":[0,9223372036854775807],"batch":-1,"startRow":"serv1-1315571539741","stopRow":"serv5-1315575834916","totalColumns":0,"cacheBlocks":true,"families":{"data":[]},"maxVersions":1,"caching":-1})) -Mayuresh
+
Mayuresh 2011-09-26, 08:56
-
Re: Coprocessor execution - HBaseServer: (responseTooSlow) java.nio.channels.ClosedChannelException
Mayuresh 2011-09-26, 10:22
I increased the leases in hbase-site.xml to around 50 minutes: <property> <name>hbase.regionserver.lease.period</name> <!--value>60000</value--> <value>3000000</value> <description>HRegion server lease period in milliseconds. Default is 60 seconds. Clients must report in within this period else they are considered dead.</description> </property> However I still fail with the same error: 2011-09-26 15:50:28,857 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":118696,"call":"execCoprocessor([B@5b0e6f59, getAvg(org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter@10b06ac3, {\"timeRange\":[0,9223372036854775807],\"batch\":-1,\"startRow\":\"\",\"stopRow\":\"\",\"totalColumns\":0,\"cacheBlocks\":true,\"families\":{\"data\":[]},\"maxVersions\":1,\"caching\":-1}))","client":"137.72.240.180:54843","starttimems":1317032310155,"queuetimems":9,"class":"HRegionServer","responsesize":0,"method":"execCoprocessor"} 2011-09-26 15:50:28,872 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call execCoprocessor([B@5b0e6f59, getAvg(org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter@10b06ac3, {"timeRange":[0,9223372036854775807],"batch":-1,"startRow":"","stopRow":"","totalColumns":0,"cacheBlocks":true,"families":{"data":[]},"maxVersions":1,"caching":-1})) from 137.72.240.180:54843: output error 2011-09-26 15:50:28,873 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 60020 caught: java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133) at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324) at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1501) at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:876) at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:955) at org.apache.hadoop.hbase.ipc.HBaseServer$Call.sendResponseIfReady(HBaseServer.java:390) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1240) What does this responseTooSlow and "output error" mean? Any clues to find whats causing this? Thanks. On Mon, Sep 26, 2011 at 2:26 PM, Mayuresh <[EMAIL PROTECTED]> wrote: > Hi Ted, > > Yes I am aware that this isn't a good setup. I am working on > understanding the coprocessors before I can use a bigger setup. > > I am talking about the prez at > https://hbase.s3.amazonaws.com/hbase/HBase-CP-HUG10.pdf by Andrew > Purtell, slide 5 : > > <quote> > For long running jobs the client must periodically poll status to keep it > alive; jobs without interest will be cancelled > </quote> > > > On Mon, Sep 26, 2011 at 2:08 PM, Ted Yu <[EMAIL PROTECTED]> wrote: >> Looks like my response to user@ got bounced. >> >> Himanshu is continuing the work on HBASE-3607 >> >>>> Currently all the regions are on a single region server. >> I don't think the above is a good setup. >> >>>> It was mentioned to poll the job status periodically to avoid timeouts. >> Can you tell us more about the presentation ? Normally you can include the >> author in the cc list of this email chain. >> >> Cheers >> >> On Mon, Sep 26, 2011 at 12:44 AM, Mayuresh <[EMAIL PROTECTED]> >> wrote: >>> >>> Yes the computation takes longer than 60 secs. Will setting lease >>> timeout in hbase-site.xml help? >>> >>> In one of the coprocessor presentations. It was mentioned to poll the >>> job status periodically to avoid timeouts. Any idea how to do that? >>> >>> On Sep 23, 2011 9:59 PM, "Stack" <[EMAIL PROTECTED]> wrote: >>> > Is it doing a long GC? Is the aggregation taking longer than the >>> > client 60 second timeout (You are setting lease below but I don't >>> > think that will have an effect since it server-side)? >>> > >>> > St.Ack >>> > >>> > >>> > On Fri, Sep 23, 2011 at 2:52 AM, Mayuresh >>> > <[EMAIL PROTECTED]> wrote: >>> >> Hi, >>> >> >>> >> I am using the AggregationClient to perform aggregate calculation on a -Mayuresh
+
Mayuresh 2011-09-26, 10:22
-
Re: Coprocessor execution - HBaseServer: (responseTooSlow) java.nio.channels.ClosedChannelException
Mayuresh 2011-09-26, 14:41
Guys any guidance? Am not getting any clue onto how to solve this problem. On Mon, Sep 26, 2011 at 3:52 PM, Mayuresh <[EMAIL PROTECTED]> wrote: > I increased the leases in hbase-site.xml to around 50 minutes: > > <property> > <name>hbase.regionserver.lease.period</name> > <!--value>60000</value--> > <value>3000000</value> > <description>HRegion server lease period in milliseconds. Default is > 60 seconds. Clients must report in within this period else they are > considered dead.</description> > </property> > > However I still fail with the same error: > > 2011-09-26 15:50:28,857 WARN org.apache.hadoop.ipc.HBaseServer: > (responseTooSlow): > {"processingtimems":118696,"call":"execCoprocessor([B@5b0e6f59, > getAvg(org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter@10b06ac3, > {\"timeRange\":[0,9223372036854775807],\"batch\":-1,\"startRow\":\"\",\"stopRow\":\"\",\"totalColumns\":0,\"cacheBlocks\":true,\"families\":{\"data\":[]},\"maxVersions\":1,\"caching\":-1}))","client":"137.72.240.180:54843","starttimems":1317032310155,"queuetimems":9,"class":"HRegionServer","responsesize":0,"method":"execCoprocessor"} > 2011-09-26 15:50:28,872 WARN org.apache.hadoop.ipc.HBaseServer: IPC > Server Responder, call execCoprocessor([B@5b0e6f59, > getAvg(org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter@10b06ac3, > {"timeRange":[0,9223372036854775807],"batch":-1,"startRow":"","stopRow":"","totalColumns":0,"cacheBlocks":true,"families":{"data":[]},"maxVersions":1,"caching":-1})) > from 137.72.240.180:54843: output error > 2011-09-26 15:50:28,873 WARN org.apache.hadoop.ipc.HBaseServer: IPC > Server handler 4 on 60020 caught: > java.nio.channels.ClosedChannelException > at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133) > at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324) > at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1501) > at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:876) > at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:955) > at org.apache.hadoop.hbase.ipc.HBaseServer$Call.sendResponseIfReady(HBaseServer.java:390) > at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1240) > > What does this responseTooSlow and "output error" mean? Any clues to > find whats causing this? > > Thanks. > > > On Mon, Sep 26, 2011 at 2:26 PM, Mayuresh <[EMAIL PROTECTED]> wrote: >> Hi Ted, >> >> Yes I am aware that this isn't a good setup. I am working on >> understanding the coprocessors before I can use a bigger setup. >> >> I am talking about the prez at >> https://hbase.s3.amazonaws.com/hbase/HBase-CP-HUG10.pdf by Andrew >> Purtell, slide 5 : >> >> <quote> >> For long running jobs the client must periodically poll status to keep it >> alive; jobs without interest will be cancelled >> </quote> >> >> >> On Mon, Sep 26, 2011 at 2:08 PM, Ted Yu <[EMAIL PROTECTED]> wrote: >>> Looks like my response to user@ got bounced. >>> >>> Himanshu is continuing the work on HBASE-3607 >>> >>>>> Currently all the regions are on a single region server. >>> I don't think the above is a good setup. >>> >>>>> It was mentioned to poll the job status periodically to avoid timeouts. >>> Can you tell us more about the presentation ? Normally you can include the >>> author in the cc list of this email chain. >>> >>> Cheers >>> >>> On Mon, Sep 26, 2011 at 12:44 AM, Mayuresh <[EMAIL PROTECTED]> >>> wrote: >>>> >>>> Yes the computation takes longer than 60 secs. Will setting lease >>>> timeout in hbase-site.xml help? >>>> >>>> In one of the coprocessor presentations. It was mentioned to poll the >>>> job status periodically to avoid timeouts. Any idea how to do that? >>>> >>>> On Sep 23, 2011 9:59 PM, "Stack" <[EMAIL PROTECTED]> wrote: >>>> > Is it doing a long GC? Is the aggregation taking longer than the -Mayuresh
+
Mayuresh 2011-09-26, 14:41
-
Re: Coprocessor execution - HBaseServer: (responseTooSlow) java.nio.channels.ClosedChannelException
Stack 2011-09-26, 16:25
responseTooSlow is a new feature in TRUNK. Slow responses get logged with detail. I suppose your long-running CP qualifies. We should make it so there can be exceptions. The ClosedChannelException usually means the client has given up waiting, it timedout. Try setting "ipc.socket.timeout" to something large. St.Ack On Mon, Sep 26, 2011 at 3:22 AM, Mayuresh <[EMAIL PROTECTED]> wrote: > I increased the leases in hbase-site.xml to around 50 minutes: > > <property> > <name>hbase.regionserver.lease.period</name> > <!--value>60000</value--> > <value>3000000</value> > <description>HRegion server lease period in milliseconds. Default is > 60 seconds. Clients must report in within this period else they are > considered dead.</description> > </property> > > However I still fail with the same error: > > 2011-09-26 15:50:28,857 WARN org.apache.hadoop.ipc.HBaseServer: > (responseTooSlow): > {"processingtimems":118696,"call":"execCoprocessor([B@5b0e6f59, > getAvg(org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter@10b06ac3, > {\"timeRange\":[0,9223372036854775807],\"batch\":-1,\"startRow\":\"\",\"stopRow\":\"\",\"totalColumns\":0,\"cacheBlocks\":true,\"families\":{\"data\":[]},\"maxVersions\":1,\"caching\":-1}))","client":"137.72.240.180:54843","starttimems":1317032310155,"queuetimems":9,"class":"HRegionServer","responsesize":0,"method":"execCoprocessor"} > 2011-09-26 15:50:28,872 WARN org.apache.hadoop.ipc.HBaseServer: IPC > Server Responder, call execCoprocessor([B@5b0e6f59, > getAvg(org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter@10b06ac3, > {"timeRange":[0,9223372036854775807],"batch":-1,"startRow":"","stopRow":"","totalColumns":0,"cacheBlocks":true,"families":{"data":[]},"maxVersions":1,"caching":-1})) > from 137.72.240.180:54843: output error > 2011-09-26 15:50:28,873 WARN org.apache.hadoop.ipc.HBaseServer: IPC > Server handler 4 on 60020 caught: > java.nio.channels.ClosedChannelException > at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133) > at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324) > at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1501) > at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:876) > at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:955) > at org.apache.hadoop.hbase.ipc.HBaseServer$Call.sendResponseIfReady(HBaseServer.java:390) > at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1240) > > What does this responseTooSlow and "output error" mean? Any clues to > find whats causing this? > > Thanks. > > > On Mon, Sep 26, 2011 at 2:26 PM, Mayuresh <[EMAIL PROTECTED]> wrote: >> Hi Ted, >> >> Yes I am aware that this isn't a good setup. I am working on >> understanding the coprocessors before I can use a bigger setup. >> >> I am talking about the prez at >> https://hbase.s3.amazonaws.com/hbase/HBase-CP-HUG10.pdf by Andrew >> Purtell, slide 5 : >> >> <quote> >> For long running jobs the client must periodically poll status to keep it >> alive; jobs without interest will be cancelled >> </quote> >> >> >> On Mon, Sep 26, 2011 at 2:08 PM, Ted Yu <[EMAIL PROTECTED]> wrote: >>> Looks like my response to user@ got bounced. >>> >>> Himanshu is continuing the work on HBASE-3607 >>> >>>>> Currently all the regions are on a single region server. >>> I don't think the above is a good setup. >>> >>>>> It was mentioned to poll the job status periodically to avoid timeouts. >>> Can you tell us more about the presentation ? Normally you can include the >>> author in the cc list of this email chain. >>> >>> Cheers >>> >>> On Mon, Sep 26, 2011 at 12:44 AM, Mayuresh <[EMAIL PROTECTED]> >>> wrote: >>>> >>>> Yes the computation takes longer than 60 secs. Will setting lease >>>> timeout in hbase-site.xml help? >>>> >>>> In one of the coprocessor presentations. It was mentioned to poll the
+
Stack 2011-09-26, 16:25
-
Re: Coprocessor execution - HBaseServer: (responseTooSlow) java.nio.channels.ClosedChannelException
Himanshu Vashishtha 2011-09-26, 16:38
Hey Mayuresh! Can you try setting the Scan cache value to a good number (like a few 100 or more)? Cheers, Himanshu On Mon, Sep 26, 2011 at 9:01 AM, Himanshu Vashishtha <[EMAIL PROTECTED]> wrote: > > Hey Mayuresh! > Can you try setting the Scan cache value to a good number (like a few 100 or more)? > > Cheers, > Himanshu > > On Mon, Sep 26, 2011 at 8:41 AM, Mayuresh <[EMAIL PROTECTED]> wrote: >> >> Guys any guidance? Am not getting any clue onto how to solve this problem. >> >> On Mon, Sep 26, 2011 at 3:52 PM, Mayuresh <[EMAIL PROTECTED]> wrote: >> > I increased the leases in hbase-site.xml to around 50 minutes: >> > >> > <property> >> > <name>hbase.regionserver.lease.period</name> >> > <!--value>60000</value--> >> > <value>3000000</value> >> > <description>HRegion server lease period in milliseconds. Default is >> > 60 seconds. Clients must report in within this period else they are >> > considered dead.</description> >> > </property> >> > >> > However I still fail with the same error: >> > >> > 2011-09-26 15:50:28,857 WARN org.apache.hadoop.ipc.HBaseServer: >> > (responseTooSlow): >> > {"processingtimems":118696,"call":"execCoprocessor([B@5b0e6f59, >> > getAvg(org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter@10b06ac3, >> > {\"timeRange\":[0,9223372036854775807],\"batch\":-1,\"startRow\":\"\",\"stopRow\":\"\",\"totalColumns\":0,\"cacheBlocks\":true,\"families\":{\"data\":[]},\"maxVersions\":1,\"caching\":-1}))","client":"137.72.240.180:54843","starttimems":1317032310155,"queuetimems":9,"class":"HRegionServer","responsesize":0,"method":"execCoprocessor"} >> > 2011-09-26 15:50:28,872 WARN org.apache.hadoop.ipc.HBaseServer: IPC >> > Server Responder, call execCoprocessor([B@5b0e6f59, >> > getAvg(org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter@10b06ac3, >> > {"timeRange":[0,9223372036854775807],"batch":-1,"startRow":"","stopRow":"","totalColumns":0,"cacheBlocks":true,"families":{"data":[]},"maxVersions":1,"caching":-1})) >> > from 137.72.240.180:54843: output error >> > 2011-09-26 15:50:28,873 WARN org.apache.hadoop.ipc.HBaseServer: IPC >> > Server handler 4 on 60020 caught: >> > java.nio.channels.ClosedChannelException >> > at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133) >> > at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324) >> > at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1501) >> > at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:876) >> > at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:955) >> > at org.apache.hadoop.hbase.ipc.HBaseServer$Call.sendResponseIfReady(HBaseServer.java:390) >> > at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1240) >> > >> > What does this responseTooSlow and "output error" mean? Any clues to >> > find whats causing this? >> > >> > Thanks. >> > >> > >> > On Mon, Sep 26, 2011 at 2:26 PM, Mayuresh <[EMAIL PROTECTED]> wrote: >> >> Hi Ted, >> >> >> >> Yes I am aware that this isn't a good setup. I am working on >> >> understanding the coprocessors before I can use a bigger setup. >> >> >> >> I am talking about the prez at >> >> https://hbase.s3.amazonaws.com/hbase/HBase-CP-HUG10.pdf by Andrew >> >> Purtell, slide 5 : >> >> >> >> <quote> >> >> For long running jobs the client must periodically poll status to keep it >> >> alive; jobs without interest will be cancelled >> >> </quote> >> >> >> >> >> >> On Mon, Sep 26, 2011 at 2:08 PM, Ted Yu <[EMAIL PROTECTED]> wrote: >> >>> Looks like my response to user@ got bounced. >> >>> >> >>> Himanshu is continuing the work on HBASE-3607 >> >>> >> >>>>> Currently all the regions are on a single region server. >> >>> I don't think the above is a good setup. >> >>> >> >>>>> It was mentioned to poll the job status periodically to avoid timeouts. >> >>> Can you tell us more about the presentation ? Normally you can include the
+
Himanshu Vashishtha 2011-09-26, 16:38
-
Re: Coprocessor execution - HBaseServer: (responseTooSlow) java.nio.channels.ClosedChannelException
Mayuresh 2011-09-26, 17:55
Thanks everyone for guiding me on this!
Setting hbase.rpc.timeout played the trick!
Thanks again everyone for the guidance!
On Mon, Sep 26, 2011 at 11:22 PM, Mayuresh <[EMAIL PROTECTED]> wrote: > Thanks Gary! That did the trick. Its now working just fine!! > > Regards, > Mayuresh > > On Mon, Sep 26, 2011 at 9:50 PM, Mayuresh <[EMAIL PROTECTED]> wrote: >> Thanks Gary! I'll try that out. >> >> Rearding bouncing mails, I could get over it by setting the mail >> format to plain text., >> >> Thanks again! >> >> On Mon, Sep 26, 2011 at 8:37 PM, Gary Helmling <[EMAIL PROTECTED]> wrote: >>> Mayuresh, >>> >>> Apologies for replying off list. The user mailing list seems to be bouncing >>> all of my messages as spam. >>> >>> See my response below on the call timeout. >>> >>> Gary >>> >>> >>> ---------- Forwarded message ---------- >>> From: Gary Helmling <[EMAIL PROTECTED]> >>> Date: Mon, Sep 26, 2011 at 7:57 AM >>> Subject: Re: Coprocessor execution - HBaseServer: (responseTooSlow) >>> java.nio.channels.ClosedChannelException >>> To: [EMAIL PROTECTED] >>> >>> >>> >>>> > However I still fail with the same error: >>>> > >>>> > 2011-09-26 15:50:28,857 WARN org.apache.hadoop.ipc.HBaseServer: >>>> > (responseTooSlow): >>>> > {"processingtimems":118696,"call":"execCoprocessor([B@5b0e6f59, >>>> > >>>> > getAvg(org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter@10b06ac3, >>>> > >>>> > {\"timeRange\":[0,9223372036854775807],\"batch\":-1,\"startRow\":\"\",\"stopRow\":\"\",\"totalColumns\":0,\"cacheBlocks\":true,\"families\":{\"data\":[]},\"maxVersions\":1,\"caching\":-1}))","client":"137.72.240.180:54843","starttimems":1317032310155,"queuetimems":9,"class":"HRegionServer","responsesize":0,"method":"execCoprocessor"} >>> >>> This shows that the request took almost 2 minutes to process (118696 >>> milliseconds). >>> >>> I believe this is the client-side RPC timeout (default 60 seconds) from >>> HBASE-3154 kicking in. Try adding this to the hbase-site.xml you are using >>> on the clients: >>> >>> <property> >>> <name>hbase.rpc.timeout</name> >>> <value>300000</value> >>> </property> >>> >>> This will up the client side RPC timeout to 5 minutes. You can increase it >>> more if you think necessary or even disable it completely by setting to 0. >>> Note if you disable it though, and the server-side handler somehow gets hung >>> up (which would be a problem in itself), the client could keep waiting >>> indefinitely on the response. >>> >>> >> >> >> >> -- >> -Mayuresh >> > > > > -- > -Mayuresh >
-- -Mayuresh
+
Mayuresh 2011-09-26, 17:55
|
|