Hadoop >> mail # user >> different input/output formats


Mark question 2012-05-29, 19:57
samir das mohapatra 2012-05-29, 20:55
Mark question 2012-05-29, 21:15
samir das mohapatra 2012-05-30, 13:27
Re: different input/output formats
Hi,
  I think the attachment did not go through to [EMAIL PROTECTED].

OK, please have a look below.

MAP
------------------------
package test;

import java.io.IOException;

import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class myMapper extends MapReduceBase implements
        Mapper<LongWritable,Text,FloatWritable,Text> {

    // Emits every input line unchanged under the constant key 1.0;
    // the byte offset supplied by TextInputFormat is ignored.
    public void map(LongWritable offset, Text val,
            OutputCollector<FloatWritable,Text> output, Reporter reporter)
            throws IOException {
        output.collect(new FloatWritable(1), val);
    }
}

REDUCER
------------------------------
Write the reducer to do whatever you need.
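
As an illustration only, here is a minimal sketch of what such a reducer could look like, assuming the same old org.apache.hadoop.mapred API as the mapper above and assuming all you want is to pass every value through unchanged. The class name PassThroughReducer is hypothetical and does not appear in the thread.

```java
import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical pass-through reducer (assumption, not from the thread):
// writes each value out again under the key it arrived with.
class PassThroughReducer extends MapReduceBase implements
        Reducer<FloatWritable,Text,FloatWritable,Text> {

    public void reduce(FloatWritable key, Iterator<Text> values,
            OutputCollector<FloatWritable,Text> output, Reporter reporter)
            throws IOException {
        // Values for one key arrive as an iterator; emit them one by one.
        while (values.hasNext()) {
            output.collect(key, values.next());
        }
    }
}
```

To use a reducer like this, the job would need conf.setReducerClass(PassThroughReducer.class) and a non-zero conf.setNumReduceTasks(...); with zero reduce tasks the reducer is never invoked and the mapper output is written directly.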

JOB
------------------------

package test;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileOutputFormat;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
public class TestDemo extends Configured implements Tool {

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new Configuration(), new TestDemo(), args);
        System.exit(res);
    }

    @Override
    public int run(String[] args) throws Exception {
        // ToolRunner has already applied the generic options to getConf();
        // args holds only the remaining arguments: <inputDir> <outputDir>.
        JobConf conf = new JobConf(getConf(), TestDemo.class);
        conf.setJobName("TestCustomInputOutput");
        conf.setMapperClass(myMapper.class);
        conf.setMapOutputKeyClass(FloatWritable.class);
        conf.setMapOutputValueClass(Text.class);
        conf.setNumReduceTasks(0); // map-only job: mapper output is written directly
        conf.setOutputKeyClass(FloatWritable.class);
        conf.setOutputValueClass(Text.class);

        // Read plain text, write a SequenceFile.
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(SequenceFileOutputFormat.class);

        FileInputFormat.addInputPath(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
        return 0;
    }
}

On Wed, May 30, 2012 at 6:57 PM, samir das mohapatra <[EMAIL PROTECTED]> wrote:

> PFA.
>
>
> On Wed, May 30, 2012 at 2:45 AM, Mark question <[EMAIL PROTECTED]> wrote:
>
>> Hi Samir, can you email me your main class.. or if you can check mine, it
>> is as follows:
>>
>> public class SortByNorm1 extends Configured implements Tool {
>>
>>    @Override public int run(String[] args) throws Exception {
>>
>>        if (args.length != 2) {
>>            System.err.printf("Usage:bin/hadoop jar norm1.jar <inputDir> <outputDir>\n");
>>            ToolRunner.printGenericCommandUsage(System.err);
>>            return -1;
>>        }
>>        JobConf conf = new JobConf(new Configuration(),SortByNorm1.class);
>>        conf.setJobName("SortDocByNorm1");
>>        conf.setMapperClass(Norm1Mapper.class);
>>        conf.setMapOutputKeyClass(FloatWritable.class);
>>        conf.setMapOutputValueClass(Text.class);
>>        conf.setNumReduceTasks(0);
>>        conf.setReducerClass(Norm1Reducer.class);
>>        conf.setOutputKeyClass(FloatWritable.class);
>>        conf.setOutputValueClass(Text.class);
>>
>>        conf.setInputFormat(TextInputFormat.class);
>>        conf.setOutputFormat(SequenceFileOutputFormat.class);
samir das mohapatra 2012-05-29, 20:05
Mark question 2012-05-29, 20:30