RE: Which InputFormat to use?
Otto Mok 2013-07-05, 03:28
A trainer at Hortonworks told me that org.apache.hadoop.mapred is the old package.
So for all intents and purposes, use the new one: org.apache.hadoop.mapreduce.
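To make the difference between the two packages concrete, here is a rough sketch of the two API shapes side by side. Everything in it is an assumption for illustration: the nested types are simplified stand-ins, not the real Hadoop classes, so that the file compiles without Hadoop on the classpath, and SingleSplitFormat is a hypothetical format. The method names and parameter shapes follow the real InputFormat declarations as I understand them, so check them against the Javadoc for your Hadoop version.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ApiShapes {

    // ---- Stand-ins mirroring the OLD API (org.apache.hadoop.mapred) ----
    // These are NOT the real Hadoop types, only placeholders with the same roles.
    static class JobConf {}
    interface Reporter {}
    interface OldInputSplit {}
    interface OldRecordReader<K, V> {}

    /** Old style: InputFormat is an interface configured through a JobConf. */
    interface OldInputFormat<K, V> {
        OldInputSplit[] getSplits(JobConf job, int numSplits) throws IOException;

        OldRecordReader<K, V> getRecordReader(OldInputSplit split, JobConf job,
                                              Reporter reporter) throws IOException;
    }

    // ---- Stand-ins mirroring the NEW API (org.apache.hadoop.mapreduce) ----
    interface JobContext {}
    interface TaskAttemptContext {}
    abstract static class NewInputSplit {}
    abstract static class NewRecordReader<K, V> {}

    /** New style: InputFormat is an abstract class driven by context objects. */
    abstract static class NewInputFormat<K, V> {
        public abstract List<NewInputSplit> getSplits(JobContext context)
                throws IOException, InterruptedException;

        public abstract NewRecordReader<K, V> createRecordReader(
                NewInputSplit split, TaskAttemptContext context)
                throws IOException, InterruptedException;
    }

    /** Hypothetical new-style format that always reports exactly one split. */
    static class SingleSplitFormat extends NewInputFormat<Long, String> {
        @Override
        public List<NewInputSplit> getSplits(JobContext context) {
            List<NewInputSplit> splits = new ArrayList<>();
            splits.add(new NewInputSplit() {});
            return splits;
        }

        @Override
        public NewRecordReader<Long, String> createRecordReader(
                NewInputSplit split, TaskAttemptContext context) {
            return new NewRecordReader<Long, String>() {};
        }
    }
}
```

The split, reader and context types differ between the two families, which is why a format written against one package cannot simply be handed to a job built with the other.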
From: Ahmed Eldawy [mailto:[EMAIL PROTECTED]]
Sent: July-04-13 2:30 PM
To: [EMAIL PROTECTED]
Subject: Which InputFormat to use?
Hi, I'm developing a new set of InputFormats for a project I'm working on. I found that there are two ways to create a new InputFormat:
1- Extend the abstract class org.apache.hadoop.mapreduce.InputFormat
2- Implement the interface org.apache.hadoop.mapred.InputFormat
I don't know why there are two versions which are incompatible. I found out that for each one, there is a whole set of interfaces for different classes such as InputSplit, RecordReader and MapReduce job. Unfortunately, each set of classes is not compatible with the other one. This means that I have to choose one of the interfaces and go with it till the end. I have two questions basically.
1- Which of these two interfaces should I go with? I didn't find any deprecation notice on either of them, so they both seem legitimate. Is there any plan to retire one of them?
2- I already have some classes implemented against one of the interfaces. Is it worth refactoring these classes to use the other interface, in case I used the old one?
Thanks in advance for your help.