-
Too many fetch failures. Help! Abdelrahman Kamel 2011-09-26, 15:16
Hi,
This is my first post here. I'm new to Hadoop. I've installed Hadoop on 2 Ubuntu boxes (one is both master and slave, the other is slave only). When I run the WordCount example on 5 small txt files, the job never completes and I get a "Too many fetch failures" error in my terminal. If it helps, I can post my terminal output and any log files needed. Many thanks. -- Abdelrahman Kamel
-
Re: Too many fetch failures. Help! bharath vissapragada 2011-09-26, 15:24
Hey,
Try configuring your cluster with hostnames instead of IPs, add those entries to /etc/hosts, and sync the file across all the nodes in the cluster. You need to restart the cluster after making these changes. Hope this helps. -- Regards, Bharath .V w:http://researchweb.iiit.ac.in/~bharath.v
-
Re: Too many fetch failures. Help! Uma Maheswara Rao G 72686... 2011-09-26, 15:32
Hello Abdelrahman,
Are you able to ping from one machine to the other using the configured hostname? Configure both hostnames properly in the /etc/hosts file and try again. Regards, Uma
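The hostname check suggested above can also be scripted. The following is a minimal sketch, not part of the original thread: `check_hostnames` is a hypothetical helper that resolves each configured hostname and reports names that do not resolve or that resolve to a loopback address, which other nodes cannot reach. The `resolve` parameter is only there so the function can be exercised without real DNS.

```python
import ipaddress
import socket


def check_hostnames(hostnames, resolve=socket.gethostbyname):
    """Return {hostname: problem} for names that fail to resolve
    or that resolve to a loopback address (127.0.0.0/8)."""
    problems = {}
    for name in hostnames:
        try:
            address = resolve(name)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            problems[name] = f"does not resolve: {exc}"
            continue
        if ipaddress.ip_address(address).is_loopback:
            problems[name] = f"resolves to loopback address {address}"
    return problems
```

Running `check_hostnames(["hdmaster", "hdslave"])` on each node (the hostnames that appear in this thread) returns an empty dict when every name maps to a routable address as seen from that node.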
-
RE: Too many fetch failures. Help! Devaraj k 2011-09-26, 15:39
Hi Bharath,
There are a few possible causes of this problem. I have listed some of them below with solutions, which might help you solve it. If you post the logs, the problem can be figured out.

Reason 1: The hostnames cannot be resolved. It could be that the mapping in the /etc/hosts file is missing, that the DNS server is down, or that the DNS server is incorrectly configured.
Solution: Setting the slave.host.name property can be one solution. Appropriate changes need to be made depending on which of these is the actual problem.

Reason 2: If the map outputs are large, we may get java.lang.OutOfMemoryError: Java heap space, and because of this there are too many fetch failures.
Solution: The java.lang.OutOfMemoryError: Java heap space error in the TaskTracker logs can be solved in either of the following ways: by decreasing the value configured for mapred.job.shuffle.input.buffer.percent, or by increasing the heap size in the child JVM options set through the mapred.child.java.opts property.

Thanks
Devaraj
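To make the two tuning properties above concrete, here is a small sketch that renders them in the `<configuration>`/`<property>` layout Hadoop reads from its site configuration files. The helper name `render_properties` and both values are assumptions chosen for illustration; they are not recommendations from the thread and should be tuned for the actual cluster.

```python
import xml.etree.ElementTree as ET


def render_properties(properties):
    """Render a dict of name/value pairs as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Illustrative values only (assumptions, not taken from the thread):
# a smaller shuffle buffer fraction and a larger child JVM heap.
SHUFFLE_TUNING = {
    "mapred.job.shuffle.input.buffer.percent": "0.50",
    "mapred.child.java.opts": "-Xmx512m",
}
```

`render_properties(SHUFFLE_TUNING)` returns the XML as a single string; the properties would normally be merged into the existing mapred-site.xml rather than replacing it, and the cluster restarted afterwards.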
-
Re: Too many fetch failures. Help! Abdelrahman Kamel 2011-09-28, 06:59
Thanks very much for all your fast replies.

*Here is my terminal output:*

hduser@hdmaster:/usr/local/hadoop$ bin/hadoop jar hadoop-0.20.2-examples.jar wordcount /user/hduser/gutenberg /user/hduser/gutenberg-output
11/09/28 07:38:23 INFO input.FileInputFormat: Total input paths to process : 5
11/09/28 07:38:23 INFO mapred.JobClient: Running job: job_201109280735_0002
11/09/28 07:38:24 INFO mapred.JobClient: map 0% reduce 0%
11/09/28 07:38:42 INFO mapred.JobClient: map 20% reduce 0%
11/09/28 07:38:44 INFO mapred.JobClient: map 40% reduce 0%
11/09/28 07:38:45 INFO mapred.JobClient: map 60% reduce 0%
11/09/28 07:38:47 INFO mapred.JobClient: map 80% reduce 0%
11/09/28 07:38:51 INFO mapred.JobClient: map 100% reduce 0%
11/09/28 07:38:54 INFO mapred.JobClient: map 100% reduce 13%
11/09/28 07:39:01 INFO mapred.JobClient: map 100% reduce 20%

*The terminal is stuck here.*

*And here is my JobTracker log:*

2011-09-28 07:35:43,185 INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting JobTracker
STARTUP_MSG: host = hdmaster/127.0.1.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2011-09-28 07:35:43,256 INFO org.apache.hadoop.mapred.JobTracker: Scheduler configured with (memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT, limitMaxMemForMapTasks, limitMaxMemForReduceTasks) (-1, -1, -1, -1)
2011-09-28 07:35:43,310 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=JobTracker, port=54311
2011-09-28 07:35:53,431 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2011-09-28 07:35:53,510 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50030
2011-09-28 07:35:53,511 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50030 webServer.getConnectors()[0].getLocalPort() returned 50030
2011-09-28 07:35:53,511 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50030
2011-09-28 07:35:53,511 INFO org.mortbay.log: jetty-6.1.14
2011-09-28 07:35:53,816 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50030
2011-09-28 07:35:53,817 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
2011-09-28 07:35:53,818 INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 54311
2011-09-28 07:35:53,818 INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
2011-09-28 07:35:53,926 INFO org.apache.hadoop.mapred.JobTracker: Cleaning up the system directory
2011-09-28 07:35:53,930 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: hdfs://master:54310/app/hadoop/tmp/mapred/system
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /app/hadoop/tmp/mapred/system. Name node is in safe mode.
The ratio of reported blocks 0.0000 has not reached the threshold 0.9990. Safe mode will be turned off automatically.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:1700)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:1680)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.delete(NameNode.java:517)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
    at org.apache.hadoop.ipc.Client.call(Client.java:740)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
    at $Proxy4.delete(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy4.delete(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:582)
    at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:227)
    at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1695)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:183)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:175)
    at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3702)
2011-09-28 07:36:03,934 INFO org.apache.hadoop.mapred.JobTracker: Cleaning up the system directory
2011-09-28 07:36:03,935 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: hdfs://master:54310/app/hadoop/tmp/mapred/system
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /app/hadoop/tmp/mapred/system. Name node is in safe mode.
The ratio of reported blocks 0.0000 has not reached the threshold 0.9990. Safe mode will be turned off automatically.
    at org.apache.hadoop.hdfs.server.name
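One detail in the log above is easy to miss: the startup banner reports the host as hdmaster/127.0.1.1, meaning the daemon's own hostname resolved to a loopback address. As a rough sketch (the function name `startup_host` and the regular expression are assumptions written for this log layout, not a Hadoop tool), the banner can be checked automatically:

```python
import ipaddress
import re

# Matches banner lines such as "STARTUP_MSG: host = hdmaster/127.0.1.1".
HOST_LINE = re.compile(r"STARTUP_MSG:\s+host\s*=\s*(\S+)/(\S+)")


def startup_host(log_text):
    """Return (hostname, address, is_loopback) from the first
    STARTUP_MSG host line in log_text, or None if there is none.
    Raises ValueError if the address part is not a valid IP address."""
    match = HOST_LINE.search(log_text)
    if match is None:
        return None
    hostname, address = match.groups()
    return hostname, address, ipaddress.ip_address(address).is_loopback
```

A `True` in the third position suggests that the node may be advertising an address other machines cannot reach, which is worth checking against /etc/hosts before looking at memory settings.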
-
Re: Too many fetch failures. Help! praveenesh kumar 2011-09-28, 07:15
Try commenting out the "127.0.0.1 localhost" line in /etc/hosts on all your systems.
-
Re: Too many fetch failures. Help! Abdelrahman Kamel 2011-09-28, 08:57
Thanks again.
I have solved my problem by commenting out the "127.0.1.1 hdslave" line in /etc/hosts on all nodes. -- Abdelrahman Kamel
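For anyone applying the same fix on several nodes, the edit can be expressed as a small text transformation. This is a sketch under assumptions: `comment_out_loopback_alias` is a hypothetical helper that works on the file contents as a string, and it leaves reading and writing /etc/hosts (which needs root) to the caller.

```python
def comment_out_loopback_alias(hosts_text, address="127.0.1.1"):
    """Return hosts_text with every active entry for `address` commented out.
    Lines that are already comments, and all other entries, are left unchanged."""
    result = []
    for line in hosts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == address:
            line = "# " + line
        result.append(line)
    return "\n".join(result) + "\n"
```

Applying the function twice gives the same result as applying it once, so it is safe to rerun. After the change, each hostname should still have an entry that maps it to the machine's real network address.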
(memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT, > > limitMaxMemForMapTasks, limitMaxMemForReduceTasks) (-1, -1, -1, -1) > > 2011-09-28 07:35:43,310 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: > > Initializing RPC Metrics with hostName=JobTracker, port=54311 > > 2011-09-28 07:35:53,431 INFO org.mortbay.log: Logging to > > org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via > > org.mortbay.log.Slf4jLog > > 2011-09-28 07:35:53,510 INFO org.apache.hadoop.http.HttpServer: Port > > returned by webServer.getConnectors()[0].getLocalPort() before open() is > > -1. > > Opening the listener on 50030 > > 2011-09-28 07:35:53,511 INFO org.apache.hadoop.http.HttpServer: > > listener.getLocalPort() returned 50030 > > webServer.getConnectors()[0].getLocalPort() returned 50030 > > 2011-09-28 07:35:53,511 INFO org.apache.hadoop.http.HttpServer: Jetty > bound > > to port 50030 > > 2011-09-28 07:35:53,511 INFO org.mortbay.log: jetty-6.1.14 > > 2011-09-28 07:35:53,816 INFO org.mortbay.log: Started > > SelectChannelConnector@0.0.0.0:50030 > > 2011-09-28 07:35:53,817 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: > > Initializing JVM Metrics with processName=JobTracker, sessionId> > 2011-09-28 07:35:53,818 INFO org.apache.hadoop.mapred.JobTracker: > > JobTracker > > up at: 54311 > > 2011-09-28 07:35:53,818 INFO org.apache.hadoop.mapred.JobTracker: > > JobTracker > > webserver: 50030 > > 2011-09-28 07:35:53,926 INFO org.apache.hadoop.mapred.JobTracker: > Cleaning > > up the system directory > > 2011-09-28 07:35:53,930 INFO org.apache.hadoop.mapred.JobTracker: problem > > cleaning system directory: > hdfs://master:54310/app/hadoop/tmp/mapred/system > > org.apache.hadoop.ipc.RemoteException: > > org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete > > /app/hadoop/tmp/mapred/system. Name node is in safe mode. > > The ratio of reported blocks 0.0000 has not reached the threshold 0.9990. > > Safe mode will be turned off automatically. > > at > > > > Abdelrahman Kamel |