Re: Is it always called part-00000?
Jeff Zhang 2010-01-18, 02:15
1. With the old API, the output file is named part-00000; with the new API,
it is named part-r-00000. There is usually more than one output file: the
number of output files is determined by the number of reducers in your
map-reduce job.
2. To consume the output of the first job, you just need to set the output
folder of the first job as the input of the second job.
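
If you ever do need to pick up the part files yourself rather than hand the
folder to the second job, here is a minimal sketch of the naming rule from
point 1. It uses only the plain Java file API on a local directory, not the
Hadoop FileSystem API, so treat it as an illustration of the pattern. The
class and method names (PartFiles, isPartFile, listPartFiles) are made up for
this example, and the part-m- prefix for map-only jobs is included on the
assumption that you may run one.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: collects part-NNNNN / part-r-NNNNN files from a local directory.
public class PartFiles {
    // part-00000 (old API), part-r-00000 (new API, reducer output),
    // part-m-00000 (new API, map-only job).
    private static final Pattern PART_NAME =
            Pattern.compile("part-(?:[mr]-)?\\d{5,}");

    // True if the bare file name looks like a job output part file.
    static boolean isPartFile(String name) {
        return PART_NAME.matcher(name).matches();
    }

    // Returns all part files directly under outputDir, sorted by path.
    // Anything else in the directory (for example _logs) is skipped.
    static List<Path> listPartFiles(Path outputDir) throws IOException {
        List<Path> parts = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(outputDir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p) && isPartFile(p.getFileName().toString())) {
                    parts.add(p);
                }
            }
        }
        Collections.sort(parts);
        return parts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: PartFiles <output-dir>");
            return;
        }
        for (Path p : listPartFiles(Paths.get(args[0]))) {
            System.out.println(p);
        }
    }
}
```

The point of matching on the name pattern instead of hard-coding part-00000
is that the same code keeps working when the number of reducers changes.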
On Mon, Jan 18, 2010 at 9:11 AM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
> I am writing a second step to run after my first Hadoop job step finished.
> It is to pick up the results of the previous step and to do further
> processing on it. Therefore, I have two questions please.
> 1. Is the output file always called part-00000?
> 2. Am I perhaps better off reading all files in the output directory and
> how do I do it?
> Thank you,
> PS. Thank you guys for answering my questions - that's a tremendous help
> and a great resource.