MapReduce, mail # user - How to specify delimiters in MultipleInputPaths

Inder Pall 2013-10-31, 16:08
I want to use MultipleInputs and use multiple mappers to process different
inputs. Let's say in all mappers I want to use KeyValueTextInputFormat. The
challenge is that the separator for this input format seems to be set at the
job level.

So if I have two files, where one is comma-separated and the other is
tab-separated, can it be handled?

Example code for what I am trying to do:

        Configuration configuration = new Configuration();
        configuration.set("key.value.separator.in.input.line", ",");

        Job job = new Job(configuration, "multiple-inputs-mapper");

        //TODO: how to set different delimiters for KeyValueTextInputFormat for different Mappers
        MultipleInputs.addInputPath(job, new Path(...), KeyValueTextInputFormat.class, Mapper1.class);
        MultipleInputs.addInputPath(job, new Path(...), KeyValueTextInputFormat.class, Mapper2.class);
        //TODO: How to set delimiter between key and values in the

        //set the mapper output types for keys and values as we have used TextOutputFormat

        FileOutputFormat.setOutputPath(job, new Path(...));
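
A minimal sketch of one possible workaround, not a confirmed answer from this thread: if the separator really is read once from the job configuration, as the question suggests, each input could instead be read as plain lines (for example with TextInputFormat) and each mapper could split its own lines on its own delimiter. The plain-Java helper below only illustrates that split. The class name PerInputSeparator and the method splitKeyValue are hypothetical and are not part of Hadoop. The split rule (key is everything before the first separator, and a line without a separator becomes a key with an empty value) is assumed to mirror what KeyValueTextInputFormat does.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: each mapper would call it with
// the delimiter that belongs to its own input.
class PerInputSeparator {

    // Splits a line at the FIRST occurrence of the separator.
    // Assumption: a line without any separator yields the whole line as key
    // and an empty string as value.
    static Map.Entry<String, String> splitKeyValue(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new SimpleImmutableEntry<>(line, "");
        }
        return new SimpleImmutableEntry<>(line.substring(0, pos), line.substring(pos + 1));
    }

    public static void main(String[] args) {
        // Mapper1 (comma-separated input) would pass ',';
        // Mapper2 (tab-separated input) would pass '\t'.
        Map.Entry<String, String> comma = splitKeyValue("user1,a,b", ',');
        Map.Entry<String, String> tab = splitKeyValue("user2\tc d", '\t');
        System.out.println(comma.getKey() + " -> " + comma.getValue()); // user1 -> a,b
        System.out.println(tab.getKey() + " -> " + tab.getValue());     // user2 -> c d
    }
}
```

With this approach the two MultipleInputs.addInputPath calls would keep their separate mapper classes, and only the place where the line is split moves from the input format into the mapper.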
- Inder
"You are average of the 5 people you spend the most time with"