-
dynamic partition import
Nimra Choudhary 2012-05-29, 09:31
We are using dynamic partitioning and facing a similar problem. Below is the jobtracker error log. We have a Hadoop cluster of 6 nodes with 1.16 TB capacity, of which over 700 GB is still free.
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hive-nimrac/hive_2012-05-29_10-32-06_332_4238693577104368640/_tmp.-ext-10000/createddttm=2011-04-24/_tmp.000001_2 could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1421)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:596)
    at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)

    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:576)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:744)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:744)
    at org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
    at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:247)
    ...
Is there any workaround or fix for this?
Regards, Nimra
-
Re: dynamic partition import
Nitin Pawar 2012-05-29, 09:37
Can you check that at least one datanode is running and is not among the blacklisted nodes?
-- Nitin Pawar
-
RE: dynamic partition import
Nimra Choudhary 2012-05-29, 09:41
All my data nodes are up and running with none blacklisted.
Regards, Nimra
-
Re: dynamic partition import
Philip Tromans 2012-05-29, 09:46
Is there anything interesting in the datanode logs?
Phil.
-
RE: dynamic partition import
Nimra Choudhary 2012-05-29, 09:57
The only exception I can see is:
12/05/29 15:03:22 WARN datanode.DataNode: DatanodeRegistration(172.23.106.80:50010, storageID=DS-1416163861-172.23.106.80-50010-1335859555961, infoPort=50075, ipcPort=50020):Failed to transfer blk_3137767359939041043_493855 to 172.23.108.105:50010 got java.net.SocketException: Original Exception : java.io.IOException: An established connection was aborted by the software in your host machine
    at sun.nio.ch.SocketDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:51)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:89)
    at sun.nio.ch.IOUtil.write(IOUtil.java:60)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
    at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:55)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
    at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
    at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
    at java.io.DataOutputStream.write(DataOutputStream.java:107)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:319)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:401)
    at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1319)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.IOException: An established connection was aborted by the software in your host machine
Apart from this, all the log entries are INFO messages like:
12/05/29 15:05:14 INFO datanode.DataNode: PacketResponder 2 for block blk_-1151833161097637022_493865 terminating
12/05/29 15:05:14 INFO datanode.DataNode: Receiving block blk_1737057988729853067_493866 src: /172.23.106.80:30093 dest: /172.23.106.80:50010
12/05/29 15:05:15 INFO DataNode.clienttrace: src: /172.23.106.80:30093, dest: /172.23.106.80:50010, bytes: 37572, op: HDFS_WRITE, cliID: DFSClient_141562960, offset: 0, srvID: DS-1416163861-172.23.106.80-50010-1335859555961, blockid: blk_1737057988729853067_493866, duration: 26623450
12/05/29 15:05:15 INFO datanode.DataNode: PacketResponder 2 for block blk_1737057988729853067_493866 terminating
12/05/29 15:05:15 INFO DataNode.clienttrace: src: /172.23.106.80:50010, dest: /172.23.106.80:30095, bytes: 37868, op: HDFS_READ, cliID: DFSClient_1094357381, offset: 0, srvID: DS-1416163861-172.23.106.80-50010-1335859555961, blockid: blk_1737057988729853067_493866, duration: 3302117
12/05/29 15:05:15 INFO datanode.DataNode: Receiving block blk_-7108535084399259969_493867 src: /172.23.106.80:30096 dest: /172.23.106.80:50010
12/05/29 15:05:15 INFO DataNode.clienttrace: src: /172.23.106.80:30096, dest: /172.23.106.80:50010, bytes: 106, op: HDFS_WRITE, cliID: DFSClient_1094357381, offset: 0, srvID: DS-1416163861-172.23.106.80-50010-1335859555961, blockid: blk_-7108535084399259969_493867, duration: 10612496
12/05/29 15:05:15 INFO datanode.DataNode: PacketResponder 2 for block blk_-7108535084399259969_493867 terminating
12/05/29 15:05:15 INFO DataNode.clienttrace: src: /172.23.106.80:50010, dest: /172.23.106.80:30100, bytes: 258, op: HDFS_READ, cliID: DFSClient_1094357381, offset: 0, srvID: DS-1416163861-172.23.106.80-50010-1335859555961, blockid: blk_-1151833161097637022_493865, duration: 349632
12/05/29 15:05:38 INFO DataNode.clienttrace: src: /172.23.106.80:50010, dest: /172.23.106.80:30102, bytes: 110, op: HDFS_READ, cliID: DFSClient_2021532316, offset: 0, srvID: DS-1416163861-172.23.106.80-50010-1335859555961, blockid: blk_-7108535084399259969_493867, duration: 246735
12/05/29 15:05:38 INFO DataNode.clienttrace: src: /172.23.106.80:50010, dest: /172.23.106.80:30103, bytes: 37868, op: HDFS_READ, cliID: DFSClient_2021532316, offset: 0, srvID: DS-1416163861-172.23.106.80-50010-1335859555961, blockid: blk_1737057988729853067_493866, duration: 738755
12/05/29 15:05:38 INFO DataNode.clienttrace: src: /172.23.106.80:50010, dest: /172.23.106.80:30104, bytes: 355404, op: HDFS_READ, cliID: DFSClient_2021532316, offset: 0, srvID: DS-1416163861-172.23.106.80-50010-1335859555961, blockid: blk_-2857180803703989524_493862, duration: 2556839
12/05/29 15:05:39 INFO DataNode.clienttrace: src: /172.23.106.80:50010, dest: /172.23.106.80:30105, bytes: 3000905, op: HDFS_READ, cliID: DFSClient_2021532316, offset: 0, srvID: DS-1416163861-172.23.106.80-50010-1335859555961, blockid: blk_3306366393069395828_493863, duration: 31513945
12/05/29 15:05:39 INFO DataNode.clienttrace: src: /172.23.106.80:50010, dest: /172.23.108.158:57662, bytes: 37868, op: HDFS_READ, cliID: DFSClient_996881783, offset: 0, srvID: DS-1416163861-172.23.106.80-50010-1335859555961, blockid: blk_1737057988729853067_493866, duration: 10630974
12/05/29 15:07:23 INFO datanode.DataNode: Receiving block blk_810161396283096631_497851 src: /172.23.108.63:41818 dest: /172.23.106.80:50010
12/05/29 15:07:23 INFO DataNode.clienttrace: src: /172.23.108.63:41818, dest: /172.23.106.80:50010, bytes: 7978, op: HDFS_WRITE, cliID: DFSClient_attempt_201205261626_0011_r_000001_0, offset: 0, srvID: DS-1416163861-172.23.106.80-50010-1335859555961, blockid: blk_810161396283096631_497851, duration: 14656991
12/05/29 15:07:23 INFO datanode.DataNode: PacketResponder 0 for block blk_810161396283096631_497851 terminating
12/05/29 15:07:24 INFO datanode.DataNode: Receiving block blk_7308942874350730353_497859 src: /172.23.106.63:51188 dest: /172.23.106.80:50010
12/05/29 15:07:24 INFO DataNode.clienttrace: src: /172.23.106.63:51188, dest: /172.23.106.80:50010, bytes: 12154, op: HDFS_WRITE, cliID: DFSClient_attempt_201205261626_0011_r_000001_0, offset: 0, srvID: DS-1416163861-172.23.106.80-50010-1335859555961, blockid: blk_7308942874350730353_497859, duration: 15148287
12/05/29 15:07:24 INFO datanode.DataNode: PacketResponder 0
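
When a datanode log is mostly INFO traffic like the above, it can help to filter it down to the other levels. The following is only a rough sketch, assuming every entry starts with a "yy/mm/dd hh:mm:ss LEVEL logger:" prefix as in the lines quoted in this thread; the function name non_info_entries is made up for illustration and is not part of Hadoop.

```python
import re

# Assumed entry prefix, modelled on the log lines quoted in this thread:
#   12/05/29 15:03:22 WARN datanode.DataNode: <message>
LOG_LINE = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (\w+) (\S+): (.*)$"
)


def non_info_entries(lines):
    """Yield (timestamp, level, logger, message) for entries that are not INFO.

    Lines that do not match the assumed prefix (for example stack-trace
    frames starting with "at ...") are skipped.
    """
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group(2) != "INFO":
            yield match.groups()
```

Passing an open log file (or any iterable of lines) to non_info_entries would yield only the WARN, ERROR and similar entries, which makes a single exception easier to spot among thousands of clienttrace lines.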
-
Re: dynamic partition import
Nitin Pawar 2012-05-29, 10:03
Can you check whether a firewall or iptables is running on the system?
This error usually occurs because a firewall or iptables is blocking the connection.
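
One quick way to test this from each node is to try opening a TCP connection to the datanode port that appears in the log (50010). This is a minimal sketch using only the Python standard library; the helper name can_connect is hypothetical, and a successful connection only shows that the port is reachable at that moment, not that HDFS transfers will succeed.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, running can_connect("172.23.108.105", 50010) on 172.23.106.80, and the reverse check on the other node, would show whether either direction is blocked. The addresses are the ones from the WARN line quoted in this thread.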
Nitin Pawar