RE: Getting filename in case of MultipleInputs
Hi Subbu,

   I am not sure which input format you are using. If it is a FileInputFormat, you can read the file name from the input split in the mapper's setup method and then use it in the map function:

import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Mapper;

public class TestMapper extends Mapper<Object, Object, Object, Object> {

  // Name of the file backing this task's input split; set once in setup().
  private String fileName;

  @Override
  protected void setup(Context context)
      throws java.io.IOException, InterruptedException {
    InputSplit inputSplit = context.getInputSplit();
    // This cast only holds when the framework hands the mapper a FileSplit.
    fileName = ((FileSplit) inputSplit).getPath().getName();
  }

  @Override
  protected void map(Object key, Object value, Context context)
      throws java.io.IOException, InterruptedException {

    // you can use the fileName here
    System.out.println(fileName);
  }
}
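
For the MultipleInputs case in the subject line, the split handed to the mapper may be a non-public wrapper rather than a FileSplit, so the direct cast above could fail. One workaround people use is to unwrap it by reflection. The following is only a rough sketch of that idea, not something I have run against 0.20.2: it assumes the wrapper class is named TaggedInputSplit and has a no-argument getInputSplit() method, which you should check against your release. The nested TaggedInputSplit class and wrapForDemo helper are hypothetical stand-ins so the sketch runs without Hadoop on the classpath.

```java
import java.lang.reflect.Method;

public class SplitUnwrapper {

    /**
     * Returns the split held inside a TaggedInputSplit-style wrapper, or the
     * argument unchanged when it is not such a wrapper.
     */
    public static Object unwrap(Object split) {
        Class<?> cls = split.getClass();
        if (!"TaggedInputSplit".equals(cls.getSimpleName())) {
            return split; // already a plain split, e.g. a FileSplit
        }
        try {
            // Assumption: the wrapper exposes a no-arg getInputSplit() method.
            Method getter = cls.getDeclaredMethod("getInputSplit");
            getter.setAccessible(true); // the declaring class is not public
            return getter.invoke(split);
        } catch (Exception e) {
            throw new IllegalStateException("Could not unwrap " + cls.getName(), e);
        }
    }

    /** Hypothetical stand-in for the non-public Hadoop wrapper class. */
    private static final class TaggedInputSplit {
        private final Object inner;

        TaggedInputSplit(Object inner) {
            this.inner = inner;
        }

        public Object getInputSplit() {
            return inner;
        }
    }

    /** Hypothetical helper: wraps an object the way a tagged split wraps a real one. */
    public static Object wrapForDemo(Object inner) {
        return new TaggedInputSplit(inner);
    }

    public static void main(String[] args) {
        Object plain = "part-00000"; // stands in for a FileSplit
        Object wrapped = wrapForDemo(plain);
        System.out.println(unwrap(wrapped) == plain);
        System.out.println(unwrap(plain) == plain);
    }
}
```

In a real mapper you would pass context.getInputSplit() to unwrap(...) in setup, and cast the result to FileSplit only after an instanceof check.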
Thanks

Devaraj

________________________________
From: Kasi Subrahmanyam [[EMAIL PROTECTED]]
Sent: Thursday, May 03, 2012 6:25 PM
To: [EMAIL PROTECTED]
Subject: Getting filename in case of MultipleInputs

Hi,

Could anyone suggest how to get the filename in the mapper?
I have gone through the JIRA ticket saying that map.input.file doesn't work in the case of multiple inputs. TaggedInputSplit also doesn't work in version 0.20.2, as it is not a public class.
I tried to find another approach, but my search turned up none.
Could anyone suggest a solution other than these?

Thanks in advance,
Subbu.