Kevin Burton 2013-04-30, 16:40
Mohammad Tariq 2013-04-30, 17:32
RE: Can't initialize cluster
I am making progress. Now I get this error:


13/04/30 12:59:40 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/04/30 12:59:40 INFO mapred.JobClient: Cleaning up the staging area
13/04/30 12:59:40 ERROR security.UserGroupInformation: PriviledgedActionException as:kevin (auth:SIMPLE) cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://devubuntu05:9000/user/kevin/input
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://devubuntu05:9000/user/kevin/input


When I run it with java -jar, the input and output are local folders. When
I run it with hadoop jar, it seems to expect the input and output folders to
be on the HDFS file system. I am not sure why these two methods of
invocation don't make the same file system assumptions.
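
One way to picture the difference, as a rough sketch rather than Hadoop's actual code: a relative path such as input is resolved against the working directory of whichever file system is the default. With an HDFS default that is normally /user/<username>, which matches the path in the exception above; with the local file system it is the directory the process was started in. The class and method names below are hypothetical, and the sketch only uses java.net.URI, not the Hadoop API.

```java
import java.net.URI;

// Hypothetical model of how a relative path is qualified; not Hadoop source code.
final class RelativePathSketch {

    // Working directory assumed for the given default file system:
    // /user/<user> on HDFS, the process's current directory otherwise.
    static URI workingDir(URI defaultFs, String user, String localCwd) {
        if ("hdfs".equals(defaultFs.getScheme())) {
            return defaultFs.resolve("/user/" + user + "/");
        }
        return URI.create("file://" + localCwd + "/");
    }

    // Resolves a relative path against that working directory.
    static URI qualify(URI defaultFs, String user, String localCwd, String relative) {
        return workingDir(defaultFs, user, localCwd).resolve(relative);
    }

    public static void main(String[] args) {
        String cwd = "/home/kevin/WordCount";
        // Default file system taken from the exception message above.
        System.out.println(qualify(URI.create("hdfs://devubuntu05:9000"), "kevin", cwd, "input"));
        // Local default file system, as when no cluster configuration is loaded.
        System.out.println(qualify(URI.create("file:///"), "kevin", cwd, "input"));
    }
}
```

Under this model, the same argument input names /user/kevin/input on HDFS in one case and /home/kevin/WordCount/input on local disk in the other, depending only on which configuration the launcher put on the classpath.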


The two invocations are:

    hadoop jar WordCount.jar input output
    (gives the above exception)

    java -jar WordCount.jar input output
    (writes the word count statistics to the output folder)

This is run in the local /home/kevin/WordCount folder.




Set "HADOOP_MAPRED_HOME" in your hadoop-env.sh file and re-run the job. See
if it helps.

To be clear, when this code is run with 'java -jar' it runs without
exception. The exception occurs when I run it with 'hadoop jar'.


I have a simple MapReduce job that I am trying to get to run on my cluster.
When I run it I get:


13/04/30 11:27:45 INFO mapreduce.Cluster: Failed to use org.apache.hadoop.mapred.LocalClientProtocolProvider due to error: Invalid "mapreduce.jobtracker.address" configuration value for LocalJobRunner :
13/04/30 11:27:45 ERROR security.UserGroupInformation: PriviledgedActionException as:kevin (auth:SIMPLE) cause:java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.


My core-site.xml looks like:





  <description>The name of the default file system. A URI whose scheme and
authority determine the FileSystem implementation. </description>



So I am unclear as to why it is looking at devubuntu05:9001?
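
For what it is worth, the two log messages are consistent with a simple selection rule, sketched below as a guess and not as Hadoop's source: the local provider is tried when mapreduce.framework.name is unset or "local", it refuses to run if mapreduce.jobtracker.address is anything other than "local", and the client fails if no provider accepts the configuration. The class and method names are hypothetical, other providers are left out, and the value devubuntu05:9001 is assumed to come from a job tracker setting in some other configuration file, not from core-site.xml.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of provider selection, inferred from the log messages only.
final class ClusterInitSketch {

    // Models a local-only provider: declines other frameworks, rejects a non-local tracker.
    static String tryLocal(Map<String, String> conf) {
        String framework = conf.getOrDefault("mapreduce.framework.name", "local");
        if (!"local".equals(framework)) {
            return null; // another provider would have to handle this framework
        }
        String tracker = conf.getOrDefault("mapreduce.jobtracker.address", "local");
        if (!"local".equals(tracker)) {
            throw new IllegalStateException(
                "Invalid \"mapreduce.jobtracker.address\" configuration value for LocalJobRunner : \""
                    + tracker + "\"");
        }
        return "LocalJobRunner";
    }

    // Returns the selected runner, or throws when no provider accepts the configuration.
    static String initialize(Map<String, String> conf) {
        try {
            String runner = tryLocal(conf);
            if (runner != null) {
                return runner;
            }
        } catch (IllegalStateException e) {
            System.out.println("Failed to use local provider due to error: " + e.getMessage());
        }
        throw new IllegalStateException("Cannot initialize Cluster.");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(initialize(conf)); // all defaults: the local runner is selected

        conf.put("mapreduce.jobtracker.address", "devubuntu05:9001"); // assumed setting
        try {
            initialize(conf);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this model is roughly right, the job tracker address and the framework name disagree with each other, which would explain why the port differs from the one in core-site.xml.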


Here is the code:


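    public static void WordCount( String[] args )  throws Exception {
        Configuration conf = new Configuration();
        // Strip Hadoop's generic options; what remains are the job's own arguments.
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }

        Job job = new Job(conf, "word count");
        // ... (mapper, reducer, and output type setup not preserved in this message) ...

        // Relative paths here are resolved against the default file system.
        org.apache.hadoop.mapreduce.lib.input.FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }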
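
As a rough illustration of what the GenericOptionsParser call in this code does with the command line, here is a toy model that handles only two of the generic options, -D key=value and -fs uri, and assumes they appear before the job's own arguments. The class name is hypothetical, the real parser supports more options, and the key fs.defaultFS is an assumption (older releases use fs.default.name).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Toy model of generic option handling; not the real GenericOptionsParser.
final class GenericOptionsSketch {

    // Consumes leading "-D key=value" and "-fs uri" pairs into conf
    // and returns the remaining, application-specific arguments.
    static String[] parse(Map<String, String> conf, String[] args) {
        int i = 0;
        while (i + 1 < args.length && ("-D".equals(args[i]) || "-fs".equals(args[i]))) {
            String value = args[i + 1];
            if ("-D".equals(args[i])) {
                String[] kv = value.split("=", 2);
                conf.put(kv[0], kv.length > 1 ? kv[1] : "");
            } else {
                conf.put("fs.defaultFS", value); // assumed key name
            }
            i += 2;
        }
        return Arrays.copyOfRange(args, i, args.length);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        String[] rest = parse(conf, new String[] {"-fs", "file:///", "input", "output"});
        System.out.println(Arrays.toString(rest));
        System.out.println(conf);
    }
}
```

In this model, otherArgs still has exactly two entries when a generic option is supplied, so the usage check in the code above continues to pass while the default file system is overridden for that run.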



Harsh J 2013-05-01, 06:02
Kevin Burton 2013-04-30, 16:36
rkevinburton@... 2013-05-01, 13:42