1- Does hadoop automatically use the content of the files written by
reducers?
No. If Job1 and Job2 are run in sequence, then the output of Job1 can be the
input to Job2, but this has to be done programmatically; hadoop does not
detect repeated jobs or reuse earlier results on its own.
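For example, here is a minimal sketch of chaining two jobs with the classic
org.apache.hadoop.mapred API; the driver class and path names are
hypothetical:

JobConf conf1 = new JobConf(MyDriver.class);
// ... set mapper/reducer and key/value classes for Job1 as usual
FileInputFormat.addInputPath(conf1, new Path("input"));
FileOutputFormat.setOutputPath(conf1, new Path("job1-out"));
JobClient.runJob(conf1); // blocks until Job1 finishes

JobConf conf2 = new JobConf(MyDriver.class);
// ... set mapper/reducer and key/value classes for Job2 as usual
// Job2 reads exactly what Job1's reducers left in HDFS
FileInputFormat.addInputPath(conf2, new Path("job1-out"));
FileOutputFormat.setOutputPath(conf2, new Path("job2-out"));
JobClient.runJob(conf2);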
2- Are these files (files written by reducers) discarded? If so, when and
how?
No. If the output of the reducers were discarded, there would be no purpose
in running the job. The output stays in HDFS until the user deletes it.
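For instance, it can be removed from the shell with "hadoop fs -rmr
job1-out", or programmatically (assuming the job1-out path from the sketch
above):

FileSystem fs = FileSystem.get(conf);
fs.delete(new Path("job1-out"), true); // true = delete recursively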
3- How can hadoop users know the address of these files (files written by
reducers)?
The source/destination are set on the InputFormat/OutputFormat while
defining the Job.
FileInputFormat.addInputPath(conf, new Path(args[0]));
FileOutputFormat.setOutputPath(conf, new Path(args[1]));
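Once the job finishes, the reducers' files appear under that output path as
part files, one per reducer (part-00000, part-00001, ... with the classic
mapred API; part-r-00000 and so on with the newer mapreduce API).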
On Sun, Jan 1, 2012 at 8:04 PM, aliyeh saeedi <[EMAIL PROTECTED]> wrote:
> I have some questions and I would be really grateful to know the answers.
> As I read in the hadoop tutorial, "the output files written by the Reducers
> are then left in HDFS for user use, either by another MapReduce job, a
> separate program, or for human inspection."
> 1- Does hadoop automatically use the content of the files written by
> reducers? I mean, if 3 jobs are assigned to hadoop, for example, and the
> 1st and 3rd jobs are the same, does hadoop do the 3rd job again or
> automatically use the results of the first job? A more complicated scenario
> is as follows:
> A) 3 MapReduce jobs are assigned to hadoop
> B) hadoop, after doing the 3 MapReduce jobs, returns the final result
> C) in the next step, 2 other jobs are assigned to hadoop. Both are
> repetitive (hadoop has done them in step B)
> Now, does hadoop automatically reuse the results of step A or do them
> again?
> 2- Are these files (files written by reducers) discarded? If so, when and
> how?
> 3- How can hadoop users know the address of these files (files written by
> reducers)?
> Regards :-)