Re: passing arguments to hadoop job
Hi,
  The driver code is actually the same as the old Java WordCount example;
copying from the site:
public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(WordCount.class);
    conf.setJobName("wordcount");

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);
    conf.setInt("basecount", 200000); // added this line
    conf.setMapperClass(Map.class);
    conf.setCombinerClass(Reduce.class);
    conf.setReducerClass(Reduce.class);

    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);

    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    JobClient.runJob(conf);
}
Reducer class
public static class Reduce extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, IntWritable> {
    private static int baseSum;

    public void configure(JobConf job) {
        baseSum = Integer.parseInt(job.get("basecount"));
    }

    public void reduce(Text key, Iterator<IntWritable> values,
        OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {
        int sum = baseSum;
        while (values.hasNext()) {
            sum += values.next().get();
        }
        output.collect(key, new IntWritable(sum));
    }
}
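
For comparison, below is a minimal sketch of the same parameter passing written against the newer org.apache.hadoop.mapreduce API that Hemanth mentions in his reply. This is not code from the thread: the class name WordCountNewApi and the tokenizing Map class are made up for illustration, and only the "basecount" key and the 200000 offset are taken from the messages above.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountNewApi {

    public static class Map extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // Emit <token, 1> for every whitespace-separated token in the line.
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);
                }
            }
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        private int baseSum;

        @Override
        protected void setup(Context context) {
            // setup() plays the role of the old configure(); read the value from
            // the job configuration, falling back to 0 if "basecount" was never set.
            baseSum = context.getConfiguration().getInt("basecount", 0);
        }

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = baseSum;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setInt("basecount", 200000); // set before the Job copies the Configuration

        Job job = new Job(conf, "wordcount");
        job.setJarByClass(WordCountNewApi.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Note that no combiner is set in this sketch: because the reducer adds baseSum for every key, reusing it as a combiner (as the original driver does with setCombinerClass) would add the offset more than once per key.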

On Mon, Jan 21, 2013 at 8:29 PM, Hemanth Yamijala <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> Please note that you are referring to a very old version of Hadoop. The
> current stable release is Hadoop 1.x, and the API has changed in 1.x. Take a
> look at the wordcount example here:
> http://hadoop.apache.org/docs/r1.0.4/mapred_tutorial.html#Example%3A+WordCount+v2.0
>
> But, in principle, your method should work. I wrote it using the new API
> in a similar fashion and it worked fine. Can you show the code of your
> driver program (i.e. where you have main)?
>
> Thanks
> hemanth
>
>
>
> On Tue, Jan 22, 2013 at 5:22 AM, jamal sasha <[EMAIL PROTECTED]> wrote:
>>
>> Hi,
>>   Let's say I have the standard helloworld program:
>> http://hadoop.apache.org/docs/r0.17.0/mapred_tutorial.html#Example%3A+WordCount+v2.0
>>
>> Now, let's say I want to start the counting not from zero but from
>> 200000. So my reference line is 200000.
>>
>> I modified the Reduce code as follows:
>>     public static class Reduce extends MapReduceBase
>>         implements Reducer<Text, IntWritable, Text, IntWritable> {
>>         private static int baseSum;
>>
>>         public void configure(JobConf job) {
>>             baseSum = Integer.parseInt(job.get("basecount"));
>>         }
>>
>>         public void reduce(Text key, Iterator<IntWritable> values,
>>             OutputCollector<Text, IntWritable> output, Reporter reporter)
>>             throws IOException {
>>             int sum = baseSum;
>>             while (values.hasNext()) {
>>                 sum += values.next().get();
>>             }
>>             output.collect(key, new IntWritable(sum));
>>         }
>>     }
>>
>>
>> And in main I added:
>>    conf.setInt("basecount",200000);
>>
>>
>>
>> So my hope was that this would do the trick...
>> But it's not working; the code is running as before :(
>> How do I resolve this?
>> Thanks
>
>
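
As an aside not covered in the thread: a common way to pass such values without hard-coding them in the driver is to implement Tool and let ToolRunner/GenericOptionsParser copy -D options from the command line into the job configuration. The sketch below uses the old mapred API to match the code above; it assumes the Map and Reduce classes shown earlier are available as public static nested classes of WordCount, and the class name WordCountDriver and the jar name are made up for illustration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCountDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // getConf() already contains any -D options that ToolRunner parsed
        // from the command line, so "basecount" need not be hard-coded here.
        JobConf conf = new JobConf(getConf(), WordCountDriver.class);
        conf.setJobName("wordcount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        // Assumes the Map and Reduce classes from the messages above are
        // nested in a WordCount class, as in the original example.
        conf.setMapperClass(WordCount.Map.class);
        conf.setReducerClass(WordCount.Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
        return 0;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new WordCountDriver(), args));
    }
}

It would then be invoked along these lines (jar and path names are placeholders):

hadoop jar wordcount.jar WordCountDriver -D basecount=200000 /input/path /output/path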