Hadoop, mail # user - Problem in copyFromLocal


Re: Problem in copyFromLocal
Jeff Zhang 2010-09-10, 01:08
Check the datanode's log to see whether it started correctly.
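For example, something along these lines (assuming the stock 0.20 layout; the
exact log path may differ on your install):

  # on each slave, check the datanode log for bind or "Incompatible
  # namespaceIDs" errors
  $ tail -n 100 $HADOOP_HOME/logs/hadoop-*-datanode-*.log

  # from the master, see how many datanodes the namenode has registered
  $ $HADOOP_HOME/bin/hadoop dfsadmin -report
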
On Thu, Sep 9, 2010 at 8:51 AM, Medha Atre <[EMAIL PROTECTED]> wrote:
> Sorry for the typo in the earlier message:
> --------------------------------------------------------
>
> Hi,
>
> I am a new Hadoop user. I followed the tutorial by Michael Noll at
> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Multi-Node_Cluster%29
> (as well as the single-node version) with Hadoop-0.20 and Hadoop-0.21.
> I keep facing one problem intermittently:
>
> My NameNode, JobTracker, DataNode, and TaskTrackers get started without any
> problem and "jps" shows them running too. I can format the DFS space without
> any problems. But when I try to use the -copyFromLocal command, it fails with
> the following exception:
>
> 2010-09-09 05:54:04,216 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 54310, call addBlock(/user/hadoop/multinode/advsh12.txt,
> DFSClient_2010062748, null, null) from
> 9.59.225.190:53125: error: java.io.IOException: File
> /user/hadoop/multinode/advsh12.txt could only be replicated to 0 nodes,
> instead of 1
> java.io.IOException: File /user/hadoop/multinode/advsh12.txt could only be
> replicated to 0 nodes, instead of 1
>       at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1448)
>       at
> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:690)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at
> org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:342)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1350)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1346)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1344)
>
> The notable thing is: if I let a sufficiently long time pass between a failure
> of the command and repeating it, it executes successfully the next time.
>
> But if I try to execute the same command without waiting much in between, it
> fails with the same exception. (Between repeat executions of -copyFromLocal I
> shut down all servers/java processes, delete the DFS space manually with
> "rm -rf", and reformat it with "namenode -format".)
>
> I checked the mailing list archives for this problem. One thread,
> http://www.mail-archive.com/[EMAIL PROTECTED]/msg00851.html,
> suggested checking and increasing the allowed number of open file
> descriptors. So I checked that on my system.
>
> $ cat /proc/sys/fs/file-max
> 1977900
> $
>
> This is a pretty large number.
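>
> A related check: the open-file limit that a running daemon actually got can
> be read from /proc, using the pid that jps prints, for example:
>
> $ grep "open files" /proc/<datanode-pid>/limits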
>
> I also updated the shell's open file limit through /etc/security/limits.conf
> (the entries themselves are sketched after the output below). Now it looks
> like this:
>
> $ ulimit -a
> <snip>
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 172032
> max locked memory       (kbytes, -l) 32
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) *65535*
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 172032
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
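>
> The limits.conf entries behind that were along these lines ("hadoop" standing
> in for whichever user runs the Hadoop daemons):
>
> hadoop   soft   nofile   65535
> hadoop   hard   nofile   65535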
>
> So I was wondering what the root cause of the problem might be, and how I can
> fix it (either in Hadoop or on my system).
>
> Could someone please help me?
>
> Thanks.
>

--
Best Regards

Jeff Zhang