Re: Re: How do I set the intermediate output path when I use 2 mapreduce jobs?
Hi Jun Tan,

Yes, I use version 0.21.0, so I have used those classes. The Hadoop Definitive
Guide has job-dependency examples for 0.20.x.
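
If ControlledJob is not available on 0.20.2, one minimal alternative (just a sketch, reusing the WordMapper, WordReducer and SortWordMapper classes from the snippet quoted below) is to run the two jobs back to back with Job.waitForCompletion(), so the second job only starts after the first has written its output:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SequentialJobs {
  public static void main(String[] args) throws Exception {
    // job1: word count, written to a temporary directory
    Configuration conf1 = new Configuration();
    Job job1 = new Job(conf1, "job1");
    job1.setJarByClass(SequentialJobs.class);
    job1.setMapperClass(WordMapper.class);     // your mapper
    job1.setReducerClass(WordReducer.class);   // your reducer
    job1.setOutputKeyClass(Text.class);
    job1.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job1, new Path(args[0]));
    Path intermediate = new Path(args[1] + "_tmp");
    FileOutputFormat.setOutputPath(job1, intermediate);
    if (!job1.waitForCompletion(true)) {       // blocks until job1 finishes
      System.exit(1);
    }

    // job2: reads the intermediate directory produced by job1
    Configuration conf2 = new Configuration();
    Job job2 = new Job(conf2, "job2");
    job2.setJarByClass(SequentialJobs.class);
    job2.setMapperClass(SortWordMapper.class); // your mapper
    job2.setOutputKeyClass(IntWritable.class);
    job2.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job2, intermediate);
    FileOutputFormat.setOutputPath(job2, new Path(args[1]));
    System.exit(job2.waitForCompletion(true) ? 0 : 1);
  }
}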

Thank You,

2011/9/23 谭军 <[EMAIL PROTECTED]>

> Swathi.V.,
> ControlledJob cannot be resolved in my Eclipse.
> My Hadoop version is 0.20.2.
> Can ControlledJob only be resolved in Hadoop 0.21.0 (+)?
> Or do I need certain plugins?
> Thanks
>
> --
>
> Regards!
>
> Jun Tan
>
> At 2011-09-22 00:56:54, "Swathi V" <[EMAIL PROTECTED]> wrote:
>
>
> Hi,
>
> This code might help you:
>
> // JobDependancies.java - chains the two jobs with JobControl/ControlledJob (0.21.0+)
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.io.IntWritable;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.mapreduce.Job;
> import org.apache.hadoop.mapreduce.Reducer;
> import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
> import org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob;
> import org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl;
> import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
>
> public class JobDependancies {
>
>   public static void main(String[] args) throws Exception {
>     // WordMapper, WordReducer and SortWordMapper are your own classes (not shown here)
>
>     // job1: word count, writing to a unique intermediate directory
>     Configuration conf = new Configuration();
>     Job job1 = new Job(conf, "job1");
>     job1.setJarByClass(JobDependancies.class);
>     job1.setMapperClass(WordMapper.class);
>     job1.setReducerClass(WordReducer.class);
>     job1.setOutputKeyClass(Text.class);
>     job1.setOutputValueClass(IntWritable.class);
>     FileInputFormat.addInputPath(job1, new Path(args[0]));
>     String out = args[1] + System.nanoTime();
>     FileOutputFormat.setOutputPath(job1, new Path(out));
>
>     // job2: sorts job1's output; the base Reducer acts as an identity reducer
>     Configuration conf2 = new Configuration();
>     Job job2 = new Job(conf2, "job2");
>     job2.setJarByClass(JobDependancies.class);
>     job2.setOutputKeyClass(IntWritable.class);
>     job2.setOutputValueClass(Text.class);
>     job2.setMapperClass(SortWordMapper.class);
>     job2.setReducerClass(Reducer.class);
>     // read job1's output file (pointing at the directory "out" also works)
>     FileInputFormat.addInputPath(job2, new Path(out + "/part-r-00000"));
>     FileOutputFormat.setOutputPath(job2, new Path(args[1]));
>
>     // wrap both jobs and declare that job2 depends on job1
>     ControlledJob controlledJob1 = new ControlledJob(job1.getConfiguration());
>     ControlledJob controlledJob2 = new ControlledJob(job2.getConfiguration());
>     controlledJob2.addDependingJob(controlledJob1);
>
>     JobControl jobControl = new JobControl("control");
>     jobControl.addJob(controlledJob1);
>     jobControl.addJob(controlledJob2);
>
>     // JobControl runs in its own thread; poll until both jobs have finished
>     Thread thread = new Thread(jobControl);
>     thread.start();
>     while (!jobControl.allFinished()) {
>       try {
>         Thread.sleep(10000);
>       } catch (InterruptedException e) {
>         e.printStackTrace();
>       }
>     }
>     jobControl.stop();
>   }
> }
>
>
> The wordcount output of job1 is given to the sort job, job2.
> Irrespective of the mappers and reducers involved, the above is the way to
> handle many chained jobs.
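>
> If you also want to remove the intermediate directory once both jobs have
> finished, a small addition after jobControl.stop() could look like this
> (a sketch; it needs an import of org.apache.hadoop.fs.FileSystem):
>
>     // delete the temporary output of job1 once job2 has consumed it
>     FileSystem fs = FileSystem.get(conf2);
>     fs.delete(new Path(out), true);   // true = recursive delete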
>
> 2011/9/21 谭军 <[EMAIL PROTECTED]>
>
>> Hi,
>> I want to run 2 MR jobs sequentially.
>> The first job writes an intermediate result to a temp file.
>> The second job reads the result from that temp file, not from the FileInputPath.
>> I tried it, but a FileNotFoundException was reported.
>> Then I checked the datanodes: the temp file had been created,
>> and the first job had executed correctly.
>> Why can't the second job find the file? The file was created before the
>> second job was executed.
>> Thanks!
>>
>> --
>>
>> Regards!
>>
>> Jun Tan
>>
>>
>>
>
>
> --
> Regards,
> Swathi.V.
>
>
>
>
--
Regards,
Swathi.V.