Hadoop >> mail # user >> map.input.file in 20.1


Re: map.input.file in 20.1
How about:
FileSplit fileSplit = (FileSplit) context.getInputSplit();
String sFileName = fileSplit.getPath().getName();
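
For the cast above to work with the new API, FileSplit probably needs to come from org.apache.hadoop.mapreduce.lib.input rather than org.apache.hadoop.mapred. A minimal sketch of a complete mapper follows, assuming a plain file-based input format. The class name FileNameMapper and the helper fileNameOf are made up for illustration, and the sketch allocates the Text field up front, since calling set() on an uninitialized field would throw a NullPointerException on its own.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Mapper;
// New-API FileSplit; note the package differs from org.apache.hadoop.mapred.FileSplit.
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class FileNameMapper extends Mapper<Object, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);

    // Allocated here so that set() in setup() never runs against a null field.
    private final Text inputFile = new Text();

    // Hypothetical helper: returns the last path component for file-based
    // splits and falls back to the split's string form for anything else.
    static String fileNameOf(InputSplit split) {
        if (split instanceof FileSplit) {
            return ((FileSplit) split).getPath().getName();
        }
        return split.toString();
    }

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Resolve the file name once per task instead of once per record.
        inputFile.set(fileNameOf(context.getInputSplit()));
    }

    @Override
    protected void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        // Emit (file name, 1) for every input record.
        context.write(inputFile, ONE);
    }
}
```

getPath().getName() yields only the last path component; use getPath().toString() if you want the full URI, as in your original attempt.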

On Mon, Jul 12, 2010 at 2:56 PM, David Hawthorne <[EMAIL PROTECTED]> wrote:

> I'm trying to get the name of the file that the map job is operating on out
> of the Context passed to the setup function.  It's proving harder than seems
> proper.
>
> I've found several links via Google on this topic, but I've seen no
> responses to the previous questions.
>
> We have this from July 17, 2009:
>
> http://www.mail-archive.com/[EMAIL PROTECTED]/msg00535.html
>
> I attempted that solution and javac complained about using a deprecated
> API.
>
> It's very clearly spelled out in this doc:
>
> http://hadoop.apache.org/common/docs/r0.20.1/mapred_tutorial.html
>
> and yet the example source code for 20.1 is still using the mapred.*
> (deprecated) API that the prior link used as well.
>
> For the record, here's what I've tried, in the hopes that someone will just
> paste back a working solution:
>
> import java.io.IOException;
>
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.io.IntWritable;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.util.GenericOptionsParser;
> import org.apache.hadoop.mapreduce.Job;
> import org.apache.hadoop.mapreduce.Mapper;
> import org.apache.hadoop.mapreduce.Reducer;
> import org.apache.hadoop.mapreduce.RecordWriter;
> import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
> import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
> import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;
> import org.apache.hadoop.mapred.FileSplit;
>
> public class Foo {
>        public static class FooMapper extends Mapper<Object, Text, Text, IntWritable> {
>
>                private org.apache.hadoop.io.Text input_file;
>
>                public void setup (Context context) {
>                        Configuration conf = context.getConfiguration();
>
>                        //
>                        // fails to compile due to use of deprecated mapred API:
>                        //
>                        FileSplit fileSplit = (FileSplit)context.getInputSplit();
>                        String input_fname = fileSplit.getPath().toString();
>                        input_file.set(input_fname);
>
>                        //
>                        // results in null pointer exception because conf.get returns null:
>                        //
>                        // input_file.set(conf.get("map.input.file"));
>                }
>        }
> }
>
>