Re: How to make a MapReduce job with no input?
Harsh J 2013-03-01, 04:15
The default # of map tasks is set to 2 (via mapred.map.tasks from
mapred-default.xml), which explains your 2-map run even for a one-line
input file.
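To make the arithmetic concrete, here is a standalone Java sketch of how I understand the old-API FileInputFormat to size its splits. It is a simplification under assumptions, not the Hadoop source: the class name SplitMath, the method computeSplits and its parameters are made up for illustration, and the 1.1 slop factor, the 64 MB block size and the minimum split size of 1 are the values I believe Hadoop 1.0 uses by default.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical standalone sketch; it uses no Hadoop classes.
class SplitMath {
    // Assumed slop factor: a trailing chunk up to 10% larger than the
    // split size is kept in one split instead of being cut again.
    static final double SPLIT_SLOP = 1.1;

    // Returns {offset, length} pairs for a single file.
    // numMapTasksHint plays the role of mapred.map.tasks.
    static List<long[]> computeSplits(long fileLength, int numMapTasksHint,
                                      long blockSize, long minSize) {
        List<long[]> splits = new ArrayList<long[]>();
        // Target size per split if the hint were honored exactly.
        long goalSize = fileLength / (numMapTasksHint == 0 ? 1 : numMapTasksHint);
        // Clamp the target between the minimum size and the block size.
        long splitSize = Math.max(minSize, Math.min(goalSize, blockSize));

        long remaining = fileLength;
        while (((double) remaining) / splitSize > SPLIT_SLOP) {
            splits.add(new long[] {fileLength - remaining, splitSize});
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits.add(new long[] {fileLength - remaining, remaining});
        }
        return splits;
    }

    public static void main(String[] args) {
        long block = 64L * 1024 * 1024;  // assumed default block size
        long fileLength = 4;             // "one" plus a newline
        // Hint of 2 (the default): splits of bytes 0-1 and 2-3.
        System.out.println(computeSplits(fileLength, 2, block, 1).size()); // prints 2
        // Hint of 1: a single split covering the whole file.
        System.out.println(computeSplits(fileLength, 1, block, 1).size()); // prints 1
    }
}
```

If this model is right, calling JobConf.setNumMapTasks(1) in your driver (which sets mapred.map.tasks) should give you a single map task for a small file like yours.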
For running with no inputs, take a look at Sleep Job's EmptySplits
technique on trunk:
On Fri, Mar 1, 2013 at 2:46 AM, Mike Spreitzer <[EMAIL PROTECTED]> wrote:
> I am using the mapred API of Hadoop 1.0. I want to make a job that does not
> really depend on any input (the job conf supplies all the info needed in
> Mapper). What is a good way to do this?
> What I have done so far is write a job in which MyMapper.configure(..) reads
> all the real input from the JobConf, and MyMapper.map(..) ignores the given
> key and value, writing the output implied by the JobConf. I set the
> InputFormat to TextInputFormat and the input paths to be a list of one
> filename; the named file contains one line of text (the word "one"),
> terminated by a newline. When I run this job (on Linux, hadoop-1.0.0), I
> find it has two map tasks --- one reads the first two bytes of my non-input
> file, and the other reads the last two bytes of my non-input file! How can I
> make a job with just one map task?