MapReduce >> mail # user >> Getting filename in case of MultipleInputs


RE: Getting filename in case of MultipleInputs
Hi Subbu,

   I am not sure which input format you are using. If it is based on FileInputFormat, you can get the file name in the mapper like this:

import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Mapper;

public class TestMapper extends Mapper<Object, Object, Object, Object> {

  private String fileName;

  // Called once per task, before any map() call; the split is fixed for the task.
  @Override
  protected void setup(Context context)
      throws java.io.IOException, InterruptedException {
    InputSplit inputSplit = context.getInputSplit();
    // Assumes a FileInputFormat-based input, so the split is a FileSplit.
    fileName = ((FileSplit) inputSplit).getPath().getName();
  }

  @Override
  protected void map(Object key, Object value, Context context)
      throws java.io.IOException, InterruptedException {

    // fileName was set in setup(), so it is available for every record here
    System.out.println(fileName);
  }
}
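
One caveat for the MultipleInputs case in the subject line: there the split handed to the mapper may be a TaggedInputSplit that wraps the real FileSplit, so the direct cast above could fail. A workaround sometimes used is to call the wrapper's getInputSplit() method through reflection. The sketch below only illustrates that idea in plain Java. PathSplit, WrappedSplit, wrap and fileNameOf are hypothetical stand-ins (for FileSplit and TaggedInputSplit) so that it runs without Hadoop, and whether the TaggedInputSplit in your Hadoop version exposes getInputSplit() is an assumption you would need to check.

```java
import java.lang.reflect.Method;

public class SplitFileName {

    // Hypothetical stand-in for a file-backed split (plays the role of FileSplit).
    public static class PathSplit {
        private final String path;
        public PathSplit(String path) { this.path = path; }
        public String getPath() { return path; }
    }

    // Hypothetical stand-in for a non-public wrapper such as TaggedInputSplit.
    private static class WrappedSplit {
        private final Object inner;
        WrappedSplit(Object inner) { this.inner = inner; }
        public Object getInputSplit() { return inner; }
    }

    // Factory so callers can obtain a wrapper without naming the private class.
    public static Object wrap(Object split) {
        return new WrappedSplit(split);
    }

    // Returns the last path component, unwrapping the split via reflection if needed.
    public static String fileNameOf(Object split) {
        Object current = split;
        if (!(current instanceof PathSplit)) {
            try {
                // Assumption: the wrapper declares a no-arg getInputSplit() method.
                Method getter = current.getClass().getDeclaredMethod("getInputSplit");
                getter.setAccessible(true); // the declaring class is not public
                current = getter.invoke(current);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot unwrap split: " + current.getClass(), e);
            }
        }
        String path = ((PathSplit) current).getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }
}
```

In a real mapper the same pattern would be applied to the result of context.getInputSplit(), with the final cast going to FileSplit and the name taken from getPath().getName().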
Thanks

Devaraj

________________________________
From: Kasi Subrahmanyam [[EMAIL PROTECTED]]
Sent: Thursday, May 03, 2012 6:25 PM
To: [EMAIL PROTECTED]
Subject: Getting filename in case of MultipleInputs

Hi,

Could anyone suggest how to get the filename in the mapper?
I have gone through the JIRA ticket saying that map.input.file doesn't work with multiple inputs. TaggedInputSplit also doesn't work in version 0.20.2, as it is not a public class.
I tried to find another approach, but my search turned up none.
Could anyone suggest a solution other than these?

Thanks in advance,
Subbu.