Hadoop, mail # user - A problem with using 0 reducers in hadoop 0.20.2

Re: A problem with using 0 reducers in hadoop 0.20.2
madhu phatak 2011-02-12, 06:37
If you don't specify any reducers, things work fine, so there is no need to
specify the number of reducers.

On Friday, February 11, 2011, Sina Samangooei <[EMAIL PROTECTED]> wrote:
> Hi,
> I have a job that benefits from many mappers, but the output of each mapper needs no further work and can be written directly to HDFS as sequence files. I've set up a job to do this in Java, specifying my mapper and setting the number of reducers to 0 using:
> job.setNumReduceTasks(0);
> The mapper I have written works correctly when run locally through Eclipse. However, when I submit my job to my Hadoop cluster using:
> hadoop jar <some memory increase arguments> my.jar
> I run into problems. The following exception is thrown whenever I emit from one of my map tasks using the call:
> context.write(key, new BytesWritable(baos.toByteArray()));
> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /data/quantised_features/ascii-sift-ukbench/_temporary/_attempt_201010211037_0140_m_000000_0/part-m-00000 File does not exist. Holder DFSClient_attempt_201010211037_0140_m_000000_0 does not have any open files.
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1378)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1369)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1290)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:469)
>         at sun.reflect.GeneratedMethodAccessor549.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:968)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:964)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:962)
>         at org.apache.hadoop.ipc.Client.call(Client.java:817)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:221)
>         at $Proxy1.addBlock(Unknown Source)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at $Proxy1.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3000)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2881)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1900(DFSClient.java:2139)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2329)
> This seemed quite strange in itself, so I did some testing. In the setup() phase of the mapper, I can confirm that the output directory does not exist on HDFS using:
> Path destination = FileOutputFormat.getWorkOutputPath(context);
> destination.getFileSystem(context.getConfiguration()).exists(destination)
> Therefore I create the output directory (for test purposes) in the setup phase using the following call:
> destination.getFileSystem(context.getConfiguration()).mkdirs(destination);
> The output location then does exist, but only until the end of the setup() call. When the map() function is reached the output directory is gone again!
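
For readers following the paths quoted above: the file named in the exception sits under _temporary/_attempt_.../part-m-00000 inside the job output directory, which is the per-attempt work location that FileOutputFormat.getWorkOutputPath() refers to. The sketch below mimics that layout on the local filesystem with java.nio.file, not the Hadoop API. The class AttemptDirSketch and its methods are hypothetical names made up for illustration, and the promote-then-clean-up step is a simplified assumption about how attempt output reaches the final directory. It is not offered as a diagnosis of the LeaseExpiredException.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical local-filesystem stand-in for the attempt directory layout.
// Nothing here calls Hadoop; names are illustrative only.
class AttemptDirSketch {

    // Work directory for one task attempt: <output>/_temporary/_<attemptId>
    static Path workDir(Path output, String attemptId) {
        return output.resolve("_temporary").resolve("_" + attemptId);
    }

    // Write the attempt's part file under its work directory.
    static Path writePart(Path output, String attemptId, byte[] data) throws IOException {
        Path dir = workDir(output, attemptId);
        Files.createDirectories(dir);
        Path part = dir.resolve("part-m-00000");
        Files.write(part, data);
        return part;
    }

    // Move the part file into the output directory, then remove _temporary.
    static Path commit(Path output, String attemptId) throws IOException {
        Path part = workDir(output, attemptId).resolve("part-m-00000");
        Path finalPart = output.resolve("part-m-00000");
        Files.move(part, finalPart, StandardCopyOption.REPLACE_EXISTING);
        deleteTree(output.resolve("_temporary"));
        return finalPart;
    }

    // Delete a directory tree, children before parents.
    static void deleteTree(Path root) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        Path output = Files.createTempDirectory("job-output");
        String attempt = "attempt_201010211037_0140_m_000000_0";
        Path part = writePart(output, attempt, new byte[] {1, 2, 3});
        System.out.println("work file:  " + output.relativize(part));
        Path done = commit(output, attempt);
        System.out.println("final file: " + output.relativize(done));
        System.out.println("_temporary exists: " + Files.exists(output.resolve("_temporary")));
    }
}
```

In this sketch the work directory only comes into being when the first file is written, and the whole _temporary tree is gone after commit, so an exists() check on the work path can return false both before the first write and after cleanup.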