Re: Set variables in mapper
Hi,

It is also worth looking at the Tool interface
(http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Tool),
which the bundled MapReduce example programs use as well.
Running the job through it lets you pass arbitrary configuration values
with the -Dvar.name=var.value convention on the command line.
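
For illustration, here is a minimal sketch of a driver built on Tool and ToolRunner, assuming the old org.apache.hadoop.mapred API used elsewhere in this thread. The class name DistanceCalcTool, the property name distcalc.n and the helper buildJob are hypothetical, and the mapper and format settings from the original driver are left out for brevity.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class DistanceCalcTool extends Configured implements Tool {

    // Hypothetical property name; any key not used by Hadoop itself would do.
    static final String N_KEY = "distcalc.n";

    // Builds the job on top of the configuration that ToolRunner populated,
    // so values given with -D on the command line are carried into the job.
    static JobConf buildJob(Configuration base, String in, String out) {
        JobConf conf = new JobConf(base, DistanceCalcTool.class);
        conf.setJobName("Calculate Distances");
        // Mapper, input/output formats and key/value classes would be set
        // here exactly as in the original driver.
        FileInputFormat.setInputPaths(conf, new Path(in));
        FileOutputFormat.setOutputPath(conf, new Path(out));
        return conf;
    }

    @Override
    public int run(String[] args) throws Exception {
        // Generic options such as -D have already been consumed by
        // ToolRunner; args holds only the remaining arguments.
        if (args.length != 2) {
            System.err.println("Usage: DistanceCalcTool [-Dname=value] <input> <output>");
            return 2;
        }
        JobClient.runJob(buildJob(getConf(), args[0], args[1]));
        return 0;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new DistanceCalcTool(), args));
    }
}
```

With a driver like this, the value can be supplied as -Ddistcalc.n=4 ahead of the input and output paths, and the tasks can read it back from their JobConf.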

Thanks
Hemanth

On Mon, Aug 2, 2010 at 10:33 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> And since it is an integer you're looking for, use the utility methods
> setInt and getInt on your JobConf instance (they are not static):
>
> int n = Integer.parseInt(args[2]);
> conf.setInt("your.pack.some.name", n);
>
> And in the Mapper's "@Override public void configure(JobConf conf)", do:
> N = conf.getInt("your.pack.some.name", 1 /* or another default value */);
>
> On Mon, Aug 2, 2010 at 9:53 PM, Edward Capriolo <[EMAIL PROTECTED]> wrote:
>> On Mon, Aug 2, 2010 at 12:17 PM, Erik Test <[EMAIL PROTECTED]> wrote:
>>> Hi,
>>>
>>> I'm trying to set a variable in my mapper class by reading an argument from
>>> the command line and then passing the entry to the mapper from main. Is this
>>> possible?
>>>
>>>  public static void main(String[] args) throws Exception
>>>  {
>>>    JobConf conf = new JobConf(DistanceCalc2.class);
>>>    conf.setJobName("Calculate Distances");
>>>
>>>    conf.setOutputKeyClass(Text.class);
>>>    conf.setOutputValueClass(DoubleWritable.class);
>>>
>>>    conf.setMapperClass(Map.class);
>>>    //conf.setReducerClass(Reduce.class);
>>>
>>>    conf.setInputFormat(TextInputFormat.class);
>>>    conf.setOutputFormat(TextOutputFormat.class);
>>>
>>>    FileInputFormat.setInputPaths(conf, new Path(args[0]));
>>>    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
>>>
>>>    Map.setN(args[2]);
>>>
>>>    JobClient.runJob(conf);
>>>  }//main
>>>
>>>
>>>  public static class Map extends MapReduceBase
>>>    implements Mapper<LongWritable, Text,
>>>      Text, DoubleWritable>
>>>        {
>>>               ...
>>>               private static int N;
>>>
>>>               ...
>>>
>>>               public void map(LongWritable key, Text value,
>>>                 OutputCollector<Text, DoubleWritable> output,
>>>                  Reporter reporter) throws IOException
>>>                {
>>>                    ....
>>>                    dim = tokens.length / N;
>>>                    ...
>>>                }
>>>
>>>               public static void setN(String newN)
>>>               {
>>>                  N = Integer.parseInt(newN);
>>>               }
>>>        }
>>>
>>> I've tried the code above but I get an error saying that I'm dividing by
>>> zero. Obviously, the argument I enter for N isn't being set as specified.
>>> Erik
>>>
>>
>> You can pass variables to the Job using the JobConf class.
>>
>> In your Driver class:
>> jobConf.set("clone_path", clonePath);
>>
>> Then in your mapper / reducer override configure:
>>
>>  private JobConf jobConf;
>>
>>  @Override
>>  public void configure(JobConf jobConf) {
>>        super.configure(jobConf);
>>        this.jobConf = jobConf;  // read later with jobConf.get("clone_path")
>>  }
>>
>
>
>
> --
> Harsh J
> www.harshj.com
>
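
To tie the replies together: the static setter in the original post runs only in the client JVM that executes main, while map tasks run in separate JVMs where the static field keeps its default of 0, hence the division by zero. Below is a minimal sketch of a mapper that reads the value from the job configuration instead. The class name DistanceMapper, the property name distcalc.n and the body of map are assumptions for illustration; the original map logic is elided in the post and is not reconstructed here.

```java
import java.io.IOException;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class DistanceMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, DoubleWritable> {

    // Hypothetical property name; the driver must set the same key.
    static final String N_KEY = "distcalc.n";

    // Instance field, filled in per task from the job configuration.
    private int n = 1;

    @Override
    public void configure(JobConf conf) {
        // Called once per task before any call to map().
        n = conf.getInt(N_KEY, 1 /* default if the key is absent */);
        if (n <= 0) {
            throw new IllegalArgumentException(N_KEY + " must be positive, got " + n);
        }
    }

    // Package-private accessor, only here to make the sketch easy to check.
    int getN() {
        return n;
    }

    @Override
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, DoubleWritable> output,
                    Reporter reporter) throws IOException {
        // Placeholder body: splits the line on whitespace and emits
        // tokens.length / n, standing in for the elided computation.
        String[] tokens = value.toString().trim().split("\\s+");
        int dim = tokens.length / n;
        output.collect(value, new DoubleWritable(dim));
    }
}
```

On the driver side, the matching call would be conf.setInt(DistanceMapper.N_KEY, Integer.parseInt(args[2])) before JobClient.runJob(conf).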