MapReduce >> mail # user >> output files written by reducers


Re: output files written by reducers
1- Does Hadoop automatically use the content of the files written by the
reducers?

No. If Job1 and Job2 are run in sequence, the output of Job1 can be the
input of Job2, but this has to be wired up programmatically; Hadoop does
not detect repeated jobs or reuse earlier results on its own.
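To make that chaining concrete, here is a minimal sketch. It assumes the
org.apache.hadoop.mapreduce API with the Hadoop client libraries on the
classpath; the class name ChainedJobs and the argument layout are
hypothetical, and no mapper/reducer classes are set (the identity defaults
apply), since only the wiring of Job1's output into Job2 is the point here:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ChainedJobs {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path input = new Path(args[0]);
        Path intermediate = new Path(args[1]);  // output of Job1, input of Job2
        Path output = new Path(args[2]);

        Job job1 = Job.getInstance(conf, "job1");
        job1.setJarByClass(ChainedJobs.class);
        FileInputFormat.addInputPath(job1, input);
        FileOutputFormat.setOutputPath(job1, intermediate);
        if (!job1.waitForCompletion(true)) {
            System.exit(1);                     // stop the chain if Job1 failed
        }

        Job job2 = Job.getInstance(conf, "job2");
        job2.setJarByClass(ChainedJobs.class);
        // Explicitly feed Job1's output directory to Job2 -- this is the
        // "programmatic" step; Hadoop does not do it for you.
        FileInputFormat.addInputPath(job2, intermediate);
        FileOutputFormat.setOutputPath(job2, output);
        System.exit(job2.waitForCompletion(true) ? 0 : 1);
    }
}
```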

2- Are these files (the files written by the reducers) discarded? If so,
when and how?

No. The reducer output stays in HDFS until you delete it; if it were
discarded, there would be no point in running the job.

3- How can Hadoop users know the location of these files (the files
written by the reducers)?

The input source and output destination are set on the
InputFormat/OutputFormat when defining the job:

        FileInputFormat.addInputPath(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
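The output files can also be discovered at runtime. This is a hypothetical
sketch using the HDFS FileSystem API (it assumes a reachable HDFS and the
Hadoop client libraries on the classpath); each reducer writes one file
into the job's output directory, named part-r-00000, part-r-00001, and so on:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListReducerOutput {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // The directory that was passed to FileOutputFormat.setOutputPath(...)
        Path outputDir = new Path(args[0]);
        for (FileStatus status : fs.listStatus(outputDir)) {
            // Prints each part-r-NNNNN file with its size in bytes
            System.out.println(status.getPath() + "\t" + status.getLen());
        }
    }
}
```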

Praveen

On Sun, Jan 1, 2012 at 8:04 PM, aliyeh saeedi <[EMAIL PROTECTED]> wrote:

>
>
> Hi
> I have some questions and I would be really grateful to know the answer.
> As I read in the Hadoop tutorial, "the output files written by the Reducers
> are then left in HDFS for user use, either by another MapReduce job, a
> separate program, or for human inspection."
>
> 1- Does Hadoop automatically use the content of the files written by the
> reducers? I mean, if 3 jobs are assigned to Hadoop, for example, and the
> 1st and 3rd jobs are the same, does Hadoop do the 3rd job again or
> automatically use the results of the first job? A more complicated scenario
> is as follows:
>     A) 3 MapReduce jobs are assigned to Hadoop
>     B) after doing the 3 MapReduce jobs, Hadoop returns the final result
>     C) in the next step, 2 other jobs are assigned to Hadoop. Both are
> repetitive (Hadoop has already done them in step B)
> Now, does Hadoop automatically reuse the results of step A or do them
> again?
>
> 2- Are these files (the files written by the reducers) discarded? If so,
> when and how?
>
> 3- How can Hadoop users know the location of these files (the files
> written by the reducers)?
>
> Regards :-)
>
>
>