Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
MapReduce, mail # user - How to make a MapReduce job with no input?


Copy link to this message
-
Re: How to make a MapReduce job with no input?
Harsh J 2013-03-01, 04:15
The default # of map tasks is set to 2 (via mapred.map.tasks from
mapred-default.xml) - which explains your 2-map run for even one line
of text.

For running with no inputs, take a look at Sleep Job's EmptySplits
technique on trunk:
http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/SleepJob.java?view=markup
(~line 70)

On Fri, Mar 1, 2013 at 2:46 AM, Mike Spreitzer <[EMAIL PROTECTED]> wrote:
> I am using the mapred API of Hadoop 1.0.  I want to make a job that does not
> really depend on any input (the job conf supplies all the info needed in
> Mapper).  What is a good way to do this?
>
> What I have done so far is write a job in which MyMapper.configure(..) reads
> all the real input from the JobConf, and MyMapper.map(..) ignores the given
> key and value, writing the output implied by the JobConf.  I set the
> InputFormat to TextInputFormat and the input paths to be a list of one
> filename; the named file contains one line of text (the word "one"),
> terminated by a newline.  When I run this job (on Linux, hadoop-1.0.0), I
> find it has two map tasks --- one reads the first two bytes of my non-input
> file, and other reads the last two bytes of my non-input file!  How can I
> make a job with just one map task?
>
> Thanks,
> Mike

--
Harsh J