Eran Kutner
2012-05-10, 08:17
Igal Shilman
2012-05-10, 09:25
Eran Kutner
2012-05-10, 11:33
Michel Segel
2012-05-10, 11:53
Eran Kutner
2012-05-10, 12:22
Michael Segel
2012-05-10, 13:26
Dave Revell
2012-05-10, 17:31
Michael Segel
2012-05-10, 18:30
Dave Revell
2012-05-10, 18:41
Michael Segel
2012-05-10, 18:59
Eran Kutner
2012-05-10, 19:17
Michael Segel
2012-05-10, 19:50
Stack
2012-05-10, 21:57
Michael Segel
2012-05-11, 01:28
Michael Segel
2012-05-11, 02:46
Stack
2012-05-11, 03:28
Stack
2012-05-11, 03:34
Michael Segel
2012-05-11, 03:44
Stack
2012-05-11, 03:53
Stack
2012-05-11, 05:07
Stack
2012-05-11, 05:08
Stack
2012-05-11, 05:12
Michael Segel
2012-05-11, 11:36
Eran Kutner
2012-05-24, 11:15
Michael Segel
2012-05-24, 12:13
Stack
2012-05-24, 23:39
Dave Revell
2012-05-25, 19:52
dva
2012-08-30, 06:26
dva
2012-08-30, 06:26
Stack
2012-08-30, 22:36
-
Occasional regionserver crashes following socket errors writing to HDFS
Eran Kutner 2012-05-10, 08:17
Hi,
We're seeing occasional regionserver crashes during heavy write operations to HBase (at the reduce phase of large M/R jobs). I have increased the file descriptors, HDFS xceivers, and HDFS threads to the recommended settings, and actually way above.

Here is an example of the HBase log (showing only errors):

2012-05-10 03:34:54,291 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_-8928911185099340956_5189425java.io.IOException: Bad response 1 for block blk_-8928911185099340956_5189425 from datanode 10.1.104.6:50010
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2986)

2012-05-10 03:34:54,494 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/10.1.104.9:59642 remote=/10.1.104.9:50010]. 0 millis timeout left.
    at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:349)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
    at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
    at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2848)

2012-05-10 03:34:54,531 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-8928911185099340956_5189425 bad datanode[2] 10.1.104.6:50010
2012-05-10 03:34:54,531 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-8928911185099340956_5189425 in pipeline 10.1.104.9:50010, 10.1.104.8:50010, 10.1.104.6:50010: bad datanode 10.1.104.6:50010
2012-05-10 03:48:30,174 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=hadoop1-s09.farm-ny.gigya.com,60020,1336476100422, load=(requests=15741, regions=789, usedHeap=6822, maxHeap=7983): regionserver:60020-0x2372c0e8a2f0008 regionserver:60020-0x2372c0e8a2f0008 received expired from ZooKeeper, aborting
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:352)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:270)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:531)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:507)
java.io.InterruptedIOException: Aborting compaction of store properties in region gs_users,6155551|QoCW/euBIKuMat/nRC5Xtw==,1334983658004.878522ea91f41cd76b903ea06ccd17f9. because user requested stop.
    at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:998)
    at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:779)
    at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:776)
    at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:721)
    at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:81)

This is from 10.1.104.9 (same machine running the region server that crashed):

2012-05-10 03:31:16,785 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-8928911185099340956_5189425 src: /10.1.104.9:59642 dest: /10.1.104.9:50010
2012-05-10 03:35:39,000 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_-8928911185099340956_5189425 2 Exception java.net.SocketException: Connection reset
2012-05-10 03:35:39,052 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_-8928911185099340956_5189425 java.nio.channels.ClosedByInterruptException
2012-05-10 03:35:39,053 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-8928911185099340956_5189425 received exception java.io.IOException: Interrupted receiveBlock
2012-05-10 03:35:39,055 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.IOException: Block blk_-8928911185099340956_5189425 length is 24384000 does not match block file length 24449024
2012-05-10 03:35:39,055 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 50020, call startBlockRecovery(blk_-8928911185099340956_5189425) from 10.1.104.8:50251: error: java.io.IOException: Block blk_-8928911185099340956_5189425 length is 24384000 does not match block file length 24449024
java.io.IOException: Block blk_-8928911185099340956_5189425 length is 24384000 does not match block file length 24449024
2012-05-10 03:35:39,077 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_-8928911185099340956_5189425 2 Exception java.net.SocketException: Broken pipe
2012-05-10 03:35:39,077 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_-8928911185099340956_5189425 2 Exception java.net.SocketException: Socket closed
2012-05-10 03:35:39,108 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_-8928911185099340956_5189425 2 Exception java.net.SocketException: Socket closed
2012-05-10 03:35:39,136 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_-8928911185099340956_5189425 2 Exception java.net.SocketException: Socket closed
2012-05-10 03:35:39,165 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_-8928911185099340956_5189425 2 Exception java.net.SocketException: Socket closed
2012-05-10 03:35:39,196 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_-8928911185099340956_5189425 2 Exception java.net.SocketException: Socket closed
2012-05-10 03:35:39,221 INFO org.apache.hadoop.hdfs.server
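When reading logs like the ones above, it can help to measure how far apart the events are. The following is a minimal sketch, not part of the original post: it takes two timestamps and the two block lengths copied from the excerpts and computes the gap between the first DFSClient warning and the region server abort, plus the size of the block length mismatch. The helper names are made up for illustration.

```python
from datetime import datetime

# Timestamp layout used by the Hadoop/HBase log lines quoted in this thread.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_log_time(stamp: str) -> datetime:
    """Parse a log timestamp such as '2012-05-10 03:34:54,291'."""
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


def seconds_between(start: str, end: str) -> float:
    """Return the elapsed seconds between two log timestamps."""
    return (parse_log_time(end) - parse_log_time(start)).total_seconds()


# Values copied from the log excerpts in the post.
first_dfs_warning = "2012-05-10 03:34:54,291"
regionserver_abort = "2012-05-10 03:48:30,174"
gap_seconds = seconds_between(first_dfs_warning, regionserver_abort)

expected_block_length = 24384000
block_file_length = 24449024
length_mismatch = block_file_length - expected_block_length

print(f"Gap between first DFS warning and abort: {gap_seconds:.3f} s")
print(f"Block length mismatch: {length_mismatch} bytes")
```

The abort is logged roughly 13.5 minutes after the first pipeline error, which is one data point to weigh when deciding whether the two are directly related.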
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Igal Shilman 2012-05-10, 09:25
Hi Eran,
Do you have dfs.datanode.socket.write.timeout set in hdfs-site.xml? (We have set this to zero in our cluster, which means waiting as long as necessary for the write to complete.)

Igal.
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Eran Kutner 2012-05-10, 11:33
Thanks Igal, but we already have that setting. These are the relevant settings from hdfs-site.xml:

<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>65536</value>
</property>
<property>
  <name>dfs.datanode.handler.count</name>
  <value>10</value>
</property>
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>0</value>
</property>

Other ideas?

-eran
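To confirm which values a datanode would actually read from a file like this, the properties can be loaded and printed. This is a small sketch using only the Python standard library. The wrapping <configuration> element is an assumption (the post shows only the property elements), and read_properties is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The three properties quoted in the post, wrapped in a <configuration>
# root element (assumed here; the post shows only the <property> entries).
HDFS_SITE_FRAGMENT = """
<configuration>
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>65536</value>
  </property>
  <property>
    <name>dfs.datanode.handler.count</name>
    <value>10</value>
  </property>
  <property>
    <name>dfs.datanode.socket.write.timeout</name>
    <value>0</value>
  </property>
</configuration>
"""


def read_properties(xml_text: str) -> dict:
    """Return a {name: value} dict for every <property> in the XML text."""
    root = ET.fromstring(xml_text.strip())
    return {
        prop.findtext("name").strip(): prop.findtext("value").strip()
        for prop in root.iter("property")
    }


props = read_properties(HDFS_SITE_FRAGMENT)
for name, value in sorted(props.items()):
    print(f"{name} = {value}")
```

Running the same helper against the real file on each node is a quick way to spot a machine whose configuration differs from the rest.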
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Michel Segel 2012-05-10, 11:53
Silly question...
Why are you using a reducer when working with HBase?

Second silly question... What is the max file size of the table that you are writing to?

Third silly question... How many regions are on each of your region servers?

Fourth silly question... There is this bandwidth setting... Default is 10MB... Did you modify it?

Sent from a remote device. Please excuse any typos...

Mike Segel
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Eran Kutner 2012-05-10, 12:22
Hi Mike,
Not sure I understand the question about the reducer. I'm using a reducer because my M/R jobs require one and I want to write the result to HBase.

I have two tables I'm writing to: one is using the default file size (256MB if I remember correctly), the other one is 512MB.

There are ~700 regions on each server.

Didn't know there is a bandwidth limit. Is it on HDFS or HBase? How can it be configured?

-eran
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Michael Segel 2012-05-10, 13:26
Ok...
So the issue is that you have a lot of regions on each region server, where the max file size is the default. On your input to HBase, you have a couple of issues.

1) Your data is most likely sorted. (Not good on inserts.)
2) You will want to increase your region size from the default (256MB) to something like 1-2GB.
3) You probably don't have MSLAB set up or GC tuned.
4) Google dfs.balance.bandwidthPerSec. I believe it's also used by HBase when it needs to move regions.

Speaking of which, what happens when HBase decides to move a region? Does it make a copy on the new RS and then, after it's there, point to the new RS and remove the old region?

I'm assuming you're writing out of your reducer straight to HBase. Are you writing your job to 1 reducer or did you set up multiple reducers? You may want to play with having multiple reducers...

Again, here's the issue. You don't need a reducer when writing to HBase. You would be better served by refactoring your job to have the mapper write to HBase directly.

Think about it. (Really, think about it. If you really don't see it, face a white wall with a 6 pack of beer, start drinking, and focus on the question of why I would say you don't need a reducer on a map job.) ;-)
Note: if you don't drink, go to the gym, get on a treadmill and run at a good pace. Put your body into a zone and then work through the problem.

HTH

-Mike
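To get a feel for how much a larger region size could cut the region count, here is a rough back-of-the-envelope sketch using the figures quoted in this thread (~700 regions per server, 256MB default max file size, 1-2GB suggested). It assumes every region is full, so it is an upper bound rather than a measurement, and regions_needed is a hypothetical helper. In HBase the limit is normally set through hbase.hregion.max.filesize or a per-table MAX_FILESIZE attribute.

```python
import math

MB = 1024 * 1024
GB = 1024 * MB


def regions_needed(total_bytes: int, max_region_bytes: int) -> int:
    """Regions required to hold total_bytes if each region is filled to the max."""
    return math.ceil(total_bytes / max_region_bytes)


# Figures quoted in the thread; treating all 700 regions as full is an
# assumption, so the totals below are worst-case estimates.
regions_per_server = 700
current_max_region = 256 * MB
worst_case_bytes = regions_per_server * current_max_region

print(f"Worst-case data per server: {worst_case_bytes / GB:.0f} GB")
for size_gb in (1, 2):
    count = regions_needed(worst_case_bytes, size_gb * GB)
    print(f"Max region size {size_gb} GB -> about {count} regions per server")
```

Under these assumptions, 700 regions at 256MB hold at most 175GB, which would fit in about 175 regions at 1GB or 88 at 2GB. Existing regions do not merge on their own when the limit is raised, so the real count would only drift toward these numbers as tables are rebuilt or grow.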
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Dave Revell 2012-05-10, 17:31
This "you don't need a reducer" conversation is distracting from the real
problem and is false.

Many MapReduce algorithms require a reduce phase (e.g. sorting). The fact that the output is written to HBase or somewhere else is irrelevant.

-Dave
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Michael Segel 2012-05-10, 18:30
Dave, do you really want to go there?
OP has a couple of issues and he was going down a rabbit hole. (You can choose if that's a reference to The Matrix, Jefferson Starship, Alice in Wonderland... or all of the above.)

So to put him on the correct path, I recommended the following, not in any order...

1) Increase his region size for this table only.
2) Look to decreasing the number of regions managed by a RS (which is why you increase region size).
3) Up the dfs.balance.bandwidthPerSec. (How often does HBase move regions, and how exactly does it move them?)
4) Look at implementing MSLAB and GC tuning. This cuts down on the overhead.
5) Refactoring his job....

Oops. Ok, I didn't put that in the list. But that was the last thing I wrote as a separate statement. Clearly you didn't take my advice and think about the problem....

To prove a point.... you wrote: 'Many mapreduce algorithms require a reduce phase (e.g. sorting)'

Ok. So tell me why you would want to sort your input into HBase, and if that's really a good thing?

Oops!... :-)
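For item 3, dfs.balance.bandwidthPerSec is an HDFS property whose value is given in bytes per second. As far as I know, the stock hdfs-default.xml in Hadoop 1.x ships 1048576 (1 MB/s), so check the default for your own version. The sketch below only converts a MB/s figure into the property entry; the helper names are hypothetical, and whether this setting affects HBase region moves at all is not established in this thread.

```python
def mb_per_sec_to_bytes(mb_per_sec: int) -> int:
    """Convert a bandwidth in MB/s to the bytes-per-second value HDFS expects."""
    return mb_per_sec * 1024 * 1024


def bandwidth_property(mb_per_sec: int) -> str:
    """Render an hdfs-site.xml <property> entry for the balancer bandwidth."""
    value = mb_per_sec_to_bytes(mb_per_sec)
    return (
        "<property>\n"
        "  <name>dfs.balance.bandwidthPerSec</name>\n"
        f"  <value>{value}</value>\n"
        "</property>"
    )


# Example: the 10 MB/s figure mentioned earlier in the thread.
print(bandwidth_property(10))
```

The printed entry would go inside the <configuration> element of hdfs-site.xml on the datanodes.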
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Dave Revell 2012-05-10, 18:41
Some examples of when you'd want a reducer:
http://static.usenix.org/event/osdi04/tech/full_papers/dean/dean.pdf
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Michael Segel 2012-05-10, 18:59
Sigh.

Dave,
I really think you need to think more about the problem.

Think about what a reduce does and then think about what happens inside of HBase.

Then think about which runs faster... a job with two mappers writing the intermediate and final results in HBase, or a M/R job that writes its output to HBase.

If you really truly think about the problem, you will start to understand why I say you really don't want to use a reducer when you're working w HBase.
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Eran Kutner 2012-05-10, 19:17
Michael, I appreciate the feedback but I'd have to disagree.

In my case, for example, I need to look at a complete set of data produced by the map phase in order to make a decision and write it to HBase. So sure, I could write all the mappers' output to HBase, then have another map-only job scan the output of the previous one, do the calculation, and write the output to another table. I don't really see why that would be better than using a reducer.

As for the other tips, I agree the files are too large, so I increased the file size, but I don't really see why that is relevant to the error we're talking about. Why would having many regions cause timeouts on HDFS?
I do have MSLAB configured and GC tune-ups.
I do run multiple reducers; I suspect that's aggravating the problem, not helping it.
As far as I can tell, dfs.balance.bandwidthPerSec is relevant only for balancing done with the balancer, not for the initial replication.

-eran
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Michael Segel 2012-05-10, 19:50
Eran,
see my response inline...

On May 10, 2012, at 2:17 PM, Eran Kutner wrote:
> Michael, I appreciate the feedback but I'd have to disagree.
> In my case, for example, I need to look at a complete set of data produced
> by the map phase in order to make a decision and write it to HBase. So sure,
> I could write all the mappers' output to HBase, then have another map-only
> job scan the output of the previous one, do the calculation, and write
> the output to another table. I don't really see why that would be better
> than using a reducer.

You disagree without actually benchmarking the two?
That's pretty bold. :-)

Two things. First, reducers are expensive. Second, writing sorted records into HBase is also more expensive than writing records in random order.

Here's a caveat: I don't know what you're attempting to do, so I can only say that in general, I've found it faster to write 2 mappers and avoid using reducers.

> As for the other tips, I agree the files are too large, so I increased the
> file size, but I don't really see why that is relevant to the error we're
> talking about. Why would having many regions cause timeouts on HDFS?
> I do have MSLAB configured and GC tune-ups.
> I do run multiple reducers; I suspect that's aggravating the problem, not
> helping it.
> As far as I can tell, dfs.balance.bandwidthPerSec is relevant only for
> balancing done with the balancer, not for the initial replication.

With respect to the number of regions... you'd probably get a better answer from St.Ack or JD.

With respect to the bandwidth issue... We set it higher, to something like 10% of the available pipe. Not that it's going to be used all the time, but the smaller the pipe, the longer it takes to copy a file from one node to another. How much of an impact it has on your performance... not sure. But it's always something to check and think about.

BTW, I did a quick read on your problem. You didn't say which release/version of HBase you were running....
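The "smaller pipe, longer copy" point is plain division, sketched below. The file sizes match the 256MB and 512MB figures mentioned in the thread; the bandwidth values are illustrative assumptions, not measured or default settings. Whether dfs.balance.bandwidthPerSec throttles anything other than the HDFS balancer is disputed in this thread, so the sketch shows only the arithmetic, not which transfers the setting governs.

```python
MB = 1024 * 1024

def copy_seconds(file_bytes: int, bandwidth_bytes_per_sec: int) -> float:
    """Best-case time to move a file at a fixed throttle, ignoring latency and contention."""
    return file_bytes / bandwidth_bytes_per_sec

# Assumed throttle values in MB/s, chosen only to show the scaling.
for bandwidth_mb in (1, 10, 100):
    for file_mb in (256, 512):
        secs = copy_seconds(file_mb * MB, bandwidth_mb * MB)
        print(f"{file_mb} MB at {bandwidth_mb} MB/s: {secs:.1f} s")
```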
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Stack 2012-05-10, 21:57
On Thu, May 10, 2012 at 11:59 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
> Sigh.
>
> Dave,
> I really think you need to think more about the problem.
>
> Think about what a reduce does and then think about what happens inside of HBase.
>
> Then think about which runs faster... a job with two mappers writing the intermediate and final results in HBase,
> or a M/R job that writes its output to HBase.
>
> If you really truly think about the problem, you will start to understand why I say you really don't want to use a reducer when you're working w HBase.

We have a bit of doc saying that you usually might want to forgo the reduce phase: http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#sink. Do we need to add to it? That said, you can't make a hard and fast rule that the reduce is to be avoided absolutely. There will be cases where it makes sense (MR sort orthogonal to HBase's, or a fat aggregating reduce, etc.)

St.Ack

P.S. Hey Michael. Go easy on the 'sighs'. The participants in this thread have a clue. I can testify to that. Also, I know you don't mean it, but on occasion, both in this thread and in others I've seen you on, your tone can come across as condescending (and there is nothing like condescension for raising the hackles). We all have our styles, but you might want to review with this in mind before you hit send the next time. Just a suggestion.
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Michael Segel 2012-05-11, 01:28
Stack,
That section was written by Doug after he and I had the same debate many moons ago.

While I can't say with absolute certainty that you shouldn't use a reducer, what I can say is that in every situation where I have seen a M/R job writing to HBase, you end up not wanting to use a reducer. If you want a clear and concise statement, you can say that the rule of thumb is that you don't want to use a reducer, and that cases where you would need to first use a reducer are the rare exception.

The reason I ask people to think about this topic is that unless you have a really good foundation in databases, not relying on a reducer is a bit counterintuitive. (Which is why I said that you really need to clear your mind and focus on this issue.)

-Mike

PS. If you care to read the thread, I didn't become condescending until a certain individual piped up about how refactoring the M/R was a 'distraction' to the issue at hand.
Not to mention his flip response w the Google paper?
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Michael Segel 2012-05-11, 02:46
Stack,
Since you brought it up...

> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#sink.

"Writing, it may make sense to avoid the reduce step and write yourself back into HBase from inside your map. You'd do this when your job does not need the sort and collation that mapreduce does on the map emitted data; on insert, HBase 'sorts' so there is no point double-sorting (and shuffling data around your mapreduce cluster) unless you need to. If you do not need the reduce, you might just have your map emit counts of records processed just so the framework's report at the end of your job has meaning or set the number of reduces to zero and use TableOutputFormat. See example code below. If running the reduce step makes sense in your case, its usually better to have lots of reducers so load is spread across the HBase cluster."

This isn't 100% true.

I'd lose the quotes around 'sorts' because the data is sorted on key values. Period.

I'd ask that you reconsider the following phrase...
"You'd do this when your job does not need the sort and collation that mapreduce does on the map emitted data;"

I realize I went to this little midwestern school (tOSU), where ENG meant you were in the college of engineering and not an English major, so I'm not sure if I am parsing that statement correctly.

If you refactor your M/R, HBase can be used for the 'collation'. (If you make your Mapper's output a NullWritable and manually write the output to HBase within Mapper.map(), you can write to N tables without a problem. So you can write the record out, update a table where you are keeping counters, stats, etc...) So I am still at a loss to find an example of where you would need a reducer.

Don't get me wrong. I do believe that there are cases where you may need a reducer, just as I believe that there is intelligent life on other planets. I just haven't found it yet. Of course YMMV. Which is why I ask you to think really long and hard on this issue.

With respect to Eran's problem...
He's writing sorted output to HBase. He stated that this problem happens with heavy writes, and that it's worse when he has more reducers. (Something recommended in the paragraph...)

So one has to ask what would cause a write to be blocked.

GC? Eran says he's already tuned it.
MSLAB? Eran says that's covered.

Table splits?
Eran says that one table's region size is 256MB (default) and the other table's is 512MB.
If the table is constantly splitting, then you need to increase the region size. Again, we don't have enough information to diagnose if this is the issue.

We don't know things about his cluster like the number of nodes, how much memory is on each node, or which version of HBase he's running.

I realize that these are all pretty basic issues, but sometimes it's the little things that will trip you up.

HTH

-Mike
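The "HBase can be used for the collation" argument can be illustrated with a toy comparison. This is a plain-Python stand-in, not HBase or Hadoop API: `CounterTable` is a hypothetical class that mimics a table supporting per-row increments, and the records are made up. It only shows that, for a simple counting job, a map-side write into a keyed, sorted store yields the same result as a shuffle followed by a reduce.

```python
from collections import defaultdict

def count_with_reducer(records):
    """Classic M/R shape: group map output by key (the shuffle), then reduce each group."""
    grouped = defaultdict(list)
    for key, value in records:
        grouped[key].append(value)
    return {key: sum(values) for key, values in sorted(grouped.items())}

class CounterTable:
    """Hypothetical stand-in for a table keyed by row, with per-row increments."""
    def __init__(self):
        self.rows = defaultdict(int)

    def increment(self, row_key, amount):
        self.rows[row_key] += amount

def count_map_only(records, table):
    """Map-only shape: each map call writes straight to the table; the table keeps rows keyed and sorted."""
    for key, value in records:
        table.increment(key, value)
    return dict(sorted(table.rows.items()))

records = [("page-b", 1), ("page-a", 1), ("page-b", 1)]  # made-up map input
print(count_with_reducer(records))
print(count_map_only(records, CounterTable()))
```

The equivalence holds here because each record can be applied independently. Eran's case, where a decision needs the complete set of values for a key, is the pattern that requires either a reducer or a second map-only pass over the intermediate table, as he describes.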
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Stack 2012-05-11, 03:28
On Thu, May 10, 2012 at 6:28 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
> That section was written by Doug after he and I had the same debate many moons ago.

I'm not sure that is correct. If you git blame that section, you'll see that stack and andrew are the authors and that the edits were made in 2009 and 2010.

There is this section in the book, but it doesn't seem to have the benefit of your input:
http://hbase.apache.org/book.html#mapreduce.example.summary.noreducer

> While I can't say with absolute certainty that you shouldn't use a reducer, what I can say is that in every situation where I have seen a M/R job writing to HBase, you end up not wanting to use a reducer. If you want a clear and concise statement, you can say that the rule of thumb is that you don't want to use a reducer, and that cases where you would need to first use a reducer are the rare exception.

Please file an issue w/ a patch. It'd be good to get your experience into the doc.

> The reason I ask people to think about this topic is that unless you have a really good foundation in databases, not relying on a reducer is a bit counterintuitive. (Which is why I said that you really need to clear your mind and focus on this issue.)

Let's make it so that if you don't have a foundation in DBs but you read the doc, you won't need such a background to get the best out of HBase.

> PS. If you care to read the thread, I didn't become condescending until a certain individual piped up about how refactoring the M/R was a 'distraction' to the issue at hand.
> Not to mention his flip response w the Google paper?

There are a few problems w/ the above.

+ You presume I did not read the thread before responding
+ That the condescending tone started after Dave's intercessions (I was not referring to this thread only).

Michael, fellas like you help move the HBase story along. Generally, I see that you do a great job in this forum and in others. In my previous note, I was just trying to give a pointer that what you might consider jest, others can read as condescension or sarcasm.

St.Ack
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Stack 2012-05-11, 03:34
On Thu, May 10, 2012 at 7:46 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
> "Writing, it may make sense to avoid the reduce step and write yourself back into HBase from inside your map. You'd do this when your job does not need the sort and collation that mapreduce does on the map emitted data; on insert, HBase 'sorts' so there is no point double-sorting (and shuffling data around your mapreduce cluster) unless you need to. If you do not need the reduce, you might just have your map emit counts of records processed just so the framework's report at the end of your job has meaning or set the number of reduces to zero and use TableOutputFormat. See example code below. If running the reduce step makes sense in your case, its usually better to have lots of reducers so load is spread across the HBase cluster."
>
> This isn't 100% true.
>
> I'd lose the quotes around 'sorts' because the data is sorted on key values. Period.

Sounds good.

> I'd ask that you reconsider the following phrase...
> "You'd do this when your job does not need the sort and collation that mapreduce does on the map emitted data;"

What would you suggest instead?

> I realize I went to this little midwestern school (tOSU), where ENG meant you were in the college of engineering and not an English major, so I'm not sure if I am parsing that statement correctly.

Ditto.

The above phrase is mine. I'm bad at writing, so I need help.

> If you refactor your M/R, HBase can be used for the 'collation'. (If you make your Mapper's output a NullWritable and manually write the output to HBase within Mapper.map(), you can write to N tables without a problem. So you can write the record out, update a table where you are keeping counters, stats, etc...) So I am still at a loss to find an example of where you would need a reducer.

Can you make a patch? I'm for making a stronger statement about reduce: that it's rare, if ever, that it's needed. Let's get it in the doc.

> So one has to ask what would cause a write to be blocked.
>
> GC? Eran says he's already tuned it.
> MSLAB? Eran says that's covered.
>
> Table splits?
> Eran says that one table's region size is 256MB (default) and the other table's is 512MB.
> If the table is constantly splitting, then you need to increase the region size. Again, we don't have enough information to diagnose if this is the issue.
>
> We don't know things about his cluster like the number of nodes, how much memory is on each node, or which version of HBase he's running.
>
> I realize that these are all pretty basic issues, but sometimes it's the little things that will trip you up.

Above is generally good advice. Thanks Michael.

St.Ack
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Michael Segel 2012-05-11, 03:44
Hmmm.
That could be. I don't know what Doug wrote, except that I knew he mentioned he updated the docs on it.

This is really kind of a basic issue. It just makes sense. As you already point out, you and Andrew already noticed this back in 2009 and 2010. I just don't think you took it far enough.

Essentially, HBase can be used in place of a reducer. In terms of a M/R job, M/M using HBase is going to be more efficient. (Assuming that you are already running HBase.) I really can't see any reason to use a reducer when using HBase. Maybe I'm being stupid, but every example I've looked at, you can refactor it to not use a reducer.

I also think you may read a bit more into my posts than I intend. ;-)

-Mike
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Stack 2012-05-11, 03:53
On Thu, May 10, 2012 at 8:44 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
> That could be. I don't know what Doug wrote, except that I knew he mentioned he updated the docs on it.

No worries. Can you make an issue and a patch on how you think we should reword the section? We can be stronger in our wording around reducers (but not preclude their use).

> I also think you may read a bit more into my posts than I intend. ;-)

I don't think I am the only one guilty of this over-reading. That's why I was making a suggestion.

(Eran, sorry for hijacking your thread.)

Good stuff,
St.Ack
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Stack 2012-05-11, 05:07
On Thu, May 10, 2012 at 1:17 AM, Eran Kutner <[EMAIL PROTECTED]> wrote:
> Here is an example of the HBase log (showing only errors):
>
> 2012-05-10 03:34:54,291 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_-8928911185099340956_5189425
> java.io.IOException: Bad response 1 for block blk_-8928911185099340956_5189425 from datanode 10.1.104.6:50010
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2986)
>
> 2012-05-10 03:34:54,494 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/10.1.104.9:59642 remote=/10.1.104.9:50010]. 0 millis timeout left.
>     at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:349)
>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
>     at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
>     at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
>     at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>     at java.io.DataOutputStream.write(DataOutputStream.java:90)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2848)
>
> 2012-05-10 03:34:54,531 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-8928911185099340956_5189425 bad datanode[2] 10.1.104.6:50010
> 2012-05-10 03:34:54,531 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-8928911185099340956_5189425 in pipeline 10.1.104.9:50010, 10.1.104.8:50010, 10.1.104.6:50010: bad datanode 10.1.104.6:50010

Above is a complaint about a DN in a write pipeline. Anything else around the above logging? Are you sure the write didn't go through after the dfsclient purged the 'bad datanode'?

A few minutes pass and then you get the below....

> 2012-05-10 03:48:30,174 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=hadoop1-s09.farm-ny.gigya.com,60020,1336476100422, load=(requests=15741, regions=789, usedHeap=6822, maxHeap=7983): regionserver:60020-0x2372c0e8a2f0008 regionserver:60020-0x2372c0e8a2f0008 received expired from ZooKeeper, aborting
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired

Says your session expired with zk. Do you think there was a big GC pause here? Are you collecting GC logging? Can you check it?

> This is from 10.1.104.9 (same machine running the region server that crashed):

You probably want to look at .6 and see why it went sour. It was reported as the bad DN in the pipeline.

What version of HBase? Do you have ganglia or tsdb up and running on your cluster so you can dig in across these times of failure?

St.Ack
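The observation that a few minutes pass between the pipeline warnings and the abort can be checked directly from the log timestamps. The sketch below is a small helper for that, assuming the `YYYY-MM-DD HH:MM:SS,mmm` prefix seen in the quoted log lines; the function names are made up, and the two sample lines are copied from the log excerpt in this thread (the second is cut after "ABORTING region server", since only the timestamp prefix is parsed).

```python
from datetime import datetime

LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
LOG_TS_LENGTH = len("2012-05-10 03:34:54,531")  # 23 characters

def parse_log_timestamp(line: str) -> datetime:
    """Parse the leading timestamp of a log line in the format shown above."""
    return datetime.strptime(line[:LOG_TS_LENGTH], LOG_TS_FORMAT)

def gap_seconds(earlier_line: str, later_line: str) -> float:
    """Seconds elapsed between the timestamps of two log lines."""
    return (parse_log_timestamp(later_line) - parse_log_timestamp(earlier_line)).total_seconds()

pipeline_warning = (
    "2012-05-10 03:34:54,531 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block "
    "blk_-8928911185099340956_5189425 bad datanode[2] 10.1.104.6:50010"
)
abort_line = (
    "2012-05-10 03:48:30,174 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: "
    "ABORTING region server"
)

gap = gap_seconds(pipeline_warning, abort_line)
print(f"{gap:.3f} s between pipeline recovery and abort (~{gap / 60:.1f} min)")
```

A gap of this size, more than 13 minutes, is one reason to treat the datanode warning and the ZooKeeper session expiry as possibly separate events and to look at GC and other logs around the later timestamp.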
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Stack 2012-05-11, 05:08
On Thu, May 10, 2012 at 4:33 AM, Eran Kutner <[EMAIL PROTECTED]> wrote:
> <property>
>   <name>dfs.datanode.socket.write.timeout</name>
>   <value>0</value>
> </property>

Not timing out is probably not what you want. When there is a problem, you want the client to give up rather than hang till the end of time (there are multiple replicas; hopefully you don't time out on all of them).

St.Ack
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Stack 2012-05-11, 05:12
On Thu, May 10, 2012 at 6:26 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
> 4) google dfs.balance.bandwidthPerSec I believe its also used by HBase when they need to move regions.

Nah. This is an HDFS setting. HBase doesn't use it directly.

> Speaking of which what happens when HBase decides to move a region? Does it make a copy on the new RS and then after its there, point to the new RS and then remove the old region?

When one RS closes the region and another opens it, there is no copy to be done, since the region data is in the HDFS they both share.

St.Ack
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Michael Segel 2012-05-11, 11:36
So I see you're looking at Eran's problem.... ;-)

Since you say he's fairly capable, I'm assuming that when he said he had GC and MSLAB set up, he did it right, so a GC pause wouldn't cause the error.

Bad node? Possible. It could easily be a networking/hardware issue, and those are pain in the ass problems to track down and solve.

With respect to dfs.balance.bandwidthPerSec... yes, it's an HDFS setting. As you point out, it's an indirect issue. However, that doesn't mean it wouldn't have an impact on performance.

OP states that this occurs under heavy writes. What happens to the writes when a table is splitting?
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Eran Kutner 2012-05-24, 11:15
Thanks Stack for noticing the ZooKeeper timeout, don't know how I could have missed that.

After analyzing this for a while it is definitely unrelated to GC. In fact during the last 4 days no GC operation took more than 2 seconds, and those that got close were all concurrent mark sweeps, so they should not be stopping other threads.

These are the interesting log lines:
2012-05-22 01:25:11,502 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 23706ms for sessionid 0x1372aa57bee0308, closing socket connection and attempting reconnect
2012-05-22 01:25:11,502 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 24638ms for sessionid 0x3372bf3891304bf, closing socket connection and attempting reconnect
2012-05-22 01:25:12,047 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server hadoop1-zk1/10.1.104.201:2181
2012-05-22 01:25:12,048 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to hadoop1-zk1/10.1.104.201:2181, initiating session
2012-05-22 01:25:12,080 INFO org.apache.zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x3372bf3891304bf has expired, closing socket connection
2012-05-22 01:25:12,081 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=hadoop1-s05.farm-ny.gigya.com,60020,1336990798475, load=(requests=4015, regions=708, usedHeap=2342, maxHeap=7983): regionserver:60020-0x3372bf3891304bf regionserver:60020-0x3372bf3891304bf received expired from ZooKeeper, aborting
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:352)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:270)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:531)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:507)

This is what the zookeeper logs show at the same time:
2012-05-22 01:24:46,014 - WARN [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@634] - EndOfStreamException: Unable to read additional data from client sessionid 0x1372aa57bef6611, likely client has closed socket
2012-05-22 01:24:46,014 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket connection for client /10.1.104.4:57598 which had sessionid 0x1372aa57bef6611
2012-05-22 01:25:08,010 - ERROR [CommitProcessor:1:NIOServerCnxn@445] - Unexpected Exception:
2012-05-22 01:25:08,016 - INFO [CommitProcessor:1:NIOServerCnxn@1435] - Closed socket connection for client /10.1.104.5:33945 which had sessionid 0x1372aa57bee0308
2012-05-22 01:25:12,046 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - Accepted socket connection from /10.1.104.5:43070
2012-05-22 01:25:12,076 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client attempting to renew session 0x3372bf3891304bf at /10.1.104.5:43070
2012-05-22 01:25:12,076 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:Learner@103] - Revalidating client: 231702230809642175
2012-05-22 01:25:12,077 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:NIOServerCnxn@1573] - Invalid session 0x3372bf3891304bf for client /10.1.104.5:43070, probably expired
2012-05-22 01:25:12,078 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket connection for client /10.1.104.5:43070 which had sessionid 0x3372bf3891304bf

I have zookeeper.session.timeout set to 20 seconds because I wanted quick recovery in case of a failure.
Any idea why it would not respond in 20 seconds? Seems like quite a lot of time.
Don't know if it's related or not but major compaction was happening while this error occurred.

Thanks.

-eran

On Fri, May 11, 2012 at 2:36 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
> So I see you're looking at Eran's problem.... ;-)
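To compare the client-side silences in the region server log with the configured session timeout, the "have not heard from server in Nms" messages can be pulled out mechanically. The following is a minimal sketch, not anything HBase or ZooKeeper ships: the function names are made up for illustration, the regular expression is written only against the message format quoted above, and the 20000 ms threshold is the zookeeper.session.timeout value mentioned in this message.

```python
import re

# Written against the ZooKeeper ClientCnxn message quoted in this thread.
TIMEOUT_RE = re.compile(
    r"have not heard from server in (\d+)ms for sessionid (0x[0-9a-f]+)"
)


def silent_periods(log_lines):
    """Return (session_id, silent_ms) for every client timeout message."""
    found = []
    for line in log_lines:
        match = TIMEOUT_RE.search(line)
        if match:
            found.append((match.group(2), int(match.group(1))))
    return found


def exceeded(log_lines, session_timeout_ms):
    """Keep only the sessions whose silence was longer than the timeout."""
    return [
        (sid, ms)
        for sid, ms in silent_periods(log_lines)
        if ms > session_timeout_ms
    ]


# Two lines copied from the region server log above.
sample = [
    "2012-05-22 01:25:11,502 INFO org.apache.zookeeper.ClientCnxn: Client "
    "session timed out, have not heard from server in 23706ms for sessionid "
    "0x1372aa57bee0308, closing socket connection and attempting reconnect",
    "2012-05-22 01:25:11,502 INFO org.apache.zookeeper.ClientCnxn: Client "
    "session timed out, have not heard from server in 24638ms for sessionid "
    "0x3372bf3891304bf, closing socket connection and attempting reconnect",
]

for sid, ms in exceeded(sample, 20000):
    print(sid, ms)
```

Both sessions in the sample were silent for longer than 20 seconds, which matches the expiry that ZooKeeper reports for 0x3372bf3891304bf a moment later.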
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Michael Segel 2012-05-24, 12:13
http://wiki.apache.org/hadoop/Hbase/Troubleshooting#A8

<property>
  <name>zookeeper.session.timeout</name>
  <value>1200000</value>
</property>
<property>
  <name>hbase.zookeeper.property.tickTime</name>
  <value>6000</value>
</property>

The default is 60 seconds which you reduced to 20. (Assuming this is the right parameter)
As you said you were doing a major compaction at the time.

On May 24, 2012, at 6:15 AM, Eran Kutner wrote:
> Thanks Stack for noticing the ZooKeeper timeout, don't know how I could have missed that.
>
> After analyzing this for a while it is definitely unrelated to GC. In fact during the last 4 days no GC operation took more than 2 seconds, and those that got close were all concurrent mark sweeps, so they should not be stopping other threads.
>
> These are the interesting log lines:
> 2012-05-22 01:25:11,502 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 23706ms for sessionid 0x1372aa57bee0308, closing socket connection and attempting reconnect
> 2012-05-22 01:25:11,502 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 24638ms for sessionid 0x3372bf3891304bf, closing socket connection and attempting reconnect
> 2012-05-22 01:25:12,047 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server hadoop1-zk1/10.1.104.201:2181
> 2012-05-22 01:25:12,048 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to hadoop1-zk1/10.1.104.201:2181, initiating session
> 2012-05-22 01:25:12,080 INFO org.apache.zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x3372bf3891304bf has expired, closing socket connection
> 2012-05-22 01:25:12,081 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=hadoop1-s05.farm-ny.gigya.com,60020,1336990798475, load=(requests=4015, regions=708, usedHeap=2342, maxHeap=7983): regionserver:60020-0x3372bf3891304bf regionserver:60020-0x3372bf3891304bf received expired from ZooKeeper, aborting
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:352)
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:270)
>     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:531)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:507)
>
> This is what the zookeeper logs show at the same time:
> 2012-05-22 01:24:46,014 - WARN [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@634] - EndOfStreamException: Unable to read additional data from client sessionid 0x1372aa57bef6611, likely client has closed socket
> 2012-05-22 01:24:46,014 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket connection for client /10.1.104.4:57598 which had sessionid 0x1372aa57bef6611
> 2012-05-22 01:25:08,010 - ERROR [CommitProcessor:1:NIOServerCnxn@445] - Unexpected Exception:
> 2012-05-22 01:25:08,016 - INFO [CommitProcessor:1:NIOServerCnxn@1435] - Closed socket connection for client /10.1.104.5:33945 which had sessionid 0x1372aa57bee0308
> 2012-05-22 01:25:12,046 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - Accepted socket connection from /10.1.104.5:43070
> 2012-05-22 01:25:12,076 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client attempting to renew session 0x3372bf3891304bf at /10.1.104.5:43070
> 2012-05-22 01:25:12,076 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:Learner@103] - Revalidating client: 231702230809642175
> 2012-05-22 01:25:12,077 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:NIOServerCnxn@1573] - Invalid session 0x3372bf3891304bf for client /10.1.104.5:43070, probably expired
> 2012-05-22 01:25:12,078 - INFO [NIOServerCxn.Factory:
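One detail worth keeping in mind when reading the two properties at the top of this message: a ZooKeeper server does not necessarily grant the session timeout a client asks for. By ZooKeeper's documented defaults, the granted value is bounded below by minSessionTimeout (2 x tickTime) and above by maxSessionTimeout (20 x tickTime). The sketch below only reproduces that arithmetic; the function name is hypothetical, and it assumes the min/max settings have not been overridden on the quorum. The tickTime of the external quorum used in this thread is not stated, so the 2000 ms in the last call is just the value from ZooKeeper's sample configuration.

```python
def negotiated_timeout_ms(requested_ms, tick_time_ms, min_ms=None, max_ms=None):
    """Clamp a requested session timeout to the server-side bounds.

    Assumes ZooKeeper's documented defaults when the bounds are not given:
    minSessionTimeout = 2 * tickTime, maxSessionTimeout = 20 * tickTime.
    """
    if min_ms is None:
        min_ms = 2 * tick_time_ms
    if max_ms is None:
        max_ms = 20 * tick_time_ms
    return max(min_ms, min(requested_ms, max_ms))


# The wiki values quoted above: 1200000 ms requested with tickTime 6000 ms.
print(negotiated_timeout_ms(1200000, 6000))

# The 20-second setting from this thread, with the same tickTime.
print(negotiated_timeout_ms(20000, 6000))

# The same 20 seconds against a tickTime of 2000 ms (assumed, not from the thread).
print(negotiated_timeout_ms(20000, 2000))
```

Under these assumptions the first call yields 120000 rather than 1200000, because 20 x 6000 caps the request, while a 20-second request falls inside the allowed range for either tickTime.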
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Stack 2012-05-24, 23:39
On Thu, May 24, 2012 at 4:15 AM, Eran Kutner <[EMAIL PROTECTED]> wrote:
> Any idea why it would not respond in 20 seconds? Seems like quite a lot of time.

The only explanation I have is long GC. If you are not having these, I'm not sure what it is. Pastebin the log and .out from around a zk timeout. We might see something.

St.Ack
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Dave Revell 2012-05-25, 19:52
Have you verified that your nodes are not swapping? This has caused serious issues for many people, including me. Swapping can occur even if you have lots of available memory, for complicated reasons.

Best,
Dave

On Thu, May 24, 2012 at 4:39 PM, Stack <[EMAIL PROTECTED]> wrote:
> On Thu, May 24, 2012 at 4:15 AM, Eran Kutner <[EMAIL PROTECTED]> wrote:
> > Any idea why it would not respond in 20 seconds? Seems like quite a lot of time.
>
> The only explanation I have is long GC. If you are not having these, I'm not sure what it is. Pastebin the log and .out from around a zk timeout. We might see something.
>
> St.Ack
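One way to check for swapping on a Linux node is to sample the pswpin and pswpout counters in /proc/vmstat twice and see whether they moved in between. The sketch below assumes a Linux host; the function names are made up for illustration, and the 10-second interval in the usage comment is arbitrary.

```python
def parse_vmstat(text):
    """Parse '/proc/vmstat'-style text ('name value' per line) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_activity(before, after):
    """Return (pages swapped in, pages swapped out) between two samples."""
    swapped_in = after.get("pswpin", 0) - before.get("pswpin", 0)
    swapped_out = after.get("pswpout", 0) - before.get("pswpout", 0)
    return swapped_in, swapped_out


def read_vmstat(path="/proc/vmstat"):
    """Read and parse the kernel counters (Linux only)."""
    with open(path) as handle:
        return parse_vmstat(handle.read())


# Usage on a region server host, sampling around a period of heavy writes:
#   first = read_vmstat()
#   time.sleep(10)            # needs 'import time'
#   print(swap_activity(first, read_vmstat()))
```

Any non-zero pair during a write-heavy period means pages were moved to or from swap while the region server was busy, which is the situation described above.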
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
dva 2012-08-30, 06:26
Eran Kutner wrote:
>
> Hi,
> We're seeing occasional regionserver crashes during heavy write operations to Hbase (at the reduce phase of large M/R jobs). I have increased the file descriptors, HDFS xceivers, HDFS threads to the recommended settings and actually way above.
>
> Here is an example of the HBase log (showing only errors):
>
> 2012-05-10 03:34:54,291 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_-8928911185099340956_5189425
> java.io.IOException: Bad response 1 for block blk_-8928911185099340956_5189425 from datanode 10.1.104.6:50010
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2986)
>
> 2012-05-10 03:34:54,494 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/10.1.104.9:59642 remote=/10.1.104.9:50010]. 0 millis timeout left.
>     at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:349)
>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
>     at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
>     at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
>     at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>     at java.io.DataOutputStream.write(DataOutputStream.java:90)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2848)
>
> 2012-05-10 03:34:54,531 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-8928911185099340956_5189425 bad datanode[2] 10.1.104.6:50010
> 2012-05-10 03:34:54,531 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-8928911185099340956_5189425 in pipeline 10.1.104.9:50010, 10.1.104.8:50010, 10.1.104.6:50010: bad datanode 10.1.104.6:50010
> 2012-05-10 03:48:30,174 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=hadoop1-s09.farm-ny.gigya.com,60020,1336476100422, load=(requests=15741, regions=789, usedHeap=6822, maxHeap=7983): regionserver:60020-0x2372c0e8a2f0008 regionserver:60020-0x2372c0e8a2f0008 received expired from ZooKeeper, aborting
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:352)
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:270)
>     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:531)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:507)
> java.io.InterruptedIOException: Aborting compaction of store properties in region gs_users,6155551|QoCW/euBIKuMat/nRC5Xtw==,1334983658004.878522ea91f41cd76b903ea06ccd17f9. because user requested stop.
>     at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:998)
>     at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:779)
>     at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:776)
>     at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:721)
>     at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:81)
>
>
> This is from 10.1.104.9 (same machine running the region server that crashed):
> 2012-05-10 03:31:16,785 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-8928911185099340956_5189425 src: /10.1.104.9:59642 dest: /10.1.104.9:50010
> 2012-05-10 03:35:39,000 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_-8928911185099340956_5189425 2 Exception java.net.SocketException: Connection reset

I wrote the following program:

#include "hadoop/Pipes.hh"
#include "hadoop/TemplateFactory.hh"
#include "hadoop/StringUtils.hh"
#include "libpq-fe.h"
extern "C" {
#include "traverser.h"
}
using namespace std;

class IndexMap : public HadoopPipes::Mapper {
public:
    IndexMap(HadoopPipes::TaskContext & context) { }

    void map(HadoopPipes::MapContext & context) {
        std::vector<std::string> paths = HadoopUtils::splitString(context.getInputValue(), "/r/n");
        unsigned int k = 4;
        unsigned int l = 0;
        string concatpaths[k];
        if (paths.size() % k == 0) {
            for (unsigned int i = 0; i < k; ++i) {
                concatpaths[i] = paths[l];
                l = l + paths.size() / k;
            }
            for (unsigned int i = 0; i < k; ++i) {
                for (unsigned int j = 1; j < paths.size() / k; ++j) {
                    concatpaths[i] += " " + paths[i * paths.size() / k + j];
                }
            }
        } else {
            l = 0;
            for (unsigned int i = 0; i < k; ++i) {
                concatpaths[i] = paths[l];
                l = l + paths.size() / (k - 1);
            }
            for (unsigned int i = 0; i < k - 1; ++i) {
                for (unsigned int j = 1; j < paths.size() / (k - 1); ++j) {
                    concatpaths[i] += " " + paths[i * paths.size() / (k - 1) + j];
                }
            }
            for (unsigned int j = 1; j < paths.size() - paths.size() / (k - 1) * (k - 1); ++j) {
                concatpaths[k - 1] += " " + paths[(k - 1) * paths.size() / (k - 1) + j];
-
Re: Occasional regionserver crashes following socket errors writing to HDFS
Stack 2012-08-30, 22:36
On Wed, Aug 29, 2012 at 11:26 PM, dva <[EMAIL PROTECTED]> wrote:
> 12/08/29 08:02:10 ERROR security.UserGroupInformation: PriviledgedActionException as:root cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory file:/export/hadoop-1.0.1/bin/out.txt already exists

Your message is hard to read. Your problem seems pretty basic. See the above. Fix that (move aside the file) and retry?

St.Ack
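The "move aside" step can be scripted so that a rerun never trips over a leftover output directory. A minimal sketch, assuming the output lives on the local filesystem as the file: prefix in the quoted error suggests; the helper name and the timestamp suffix are illustrative choices, not anything Hadoop provides. For output that lives in HDFS, the equivalent would be a rename with hadoop fs -mv.

```python
import os
import time


def move_aside(path, suffix=None):
    """Rename an existing output path out of the way before a rerun.

    Returns the new name, or None if there was nothing at 'path'.
    """
    if not os.path.exists(path):
        return None
    if suffix is None:
        # Timestamp suffix so repeated runs do not overwrite older backups.
        suffix = time.strftime("%Y%m%d-%H%M%S")
    backup = "{}.{}".format(path, suffix)
    os.rename(path, backup)
    return backup


# Example, using the path from the error message above:
#   move_aside("/export/hadoop-1.0.1/bin/out.txt")
```

Renaming rather than deleting keeps the previous run's output around in case it is still needed.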