RE: Which InputFormat to use?
A trainer at Hortonworks told me that org.apache.hadoop.mapred is the old package.

So for all intents and purposes, use the new one: org.apache.hadoop.mapreduce.
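
For what it's worth, a minimal skeleton against the new API looks roughly like this (MyInputFormat and the LongWritable/Text key/value types are placeholders, not anything from your project):

import java.io.IOException;
import java.util.List;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

// Placeholder skeleton: extend the new-API abstract class and override its two methods.
public class MyInputFormat extends InputFormat<LongWritable, Text> {

  @Override
  public List<InputSplit> getSplits(JobContext context)
      throws IOException, InterruptedException {
    // Build and return the list of splits covering the job's input.
    throw new UnsupportedOperationException("split logic goes here");
  }

  @Override
  public RecordReader<LongWritable, Text> createRecordReader(
      InputSplit split, TaskAttemptContext context)
      throws IOException, InterruptedException {
    // Return a RecordReader that turns the given split into key/value records.
    throw new UnsupportedOperationException("record reader goes here");
  }
}

If your formats read files, you can usually extend org.apache.hadoop.mapreduce.lib.input.FileInputFormat instead and only implement createRecordReader, since it already provides getSplits.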

Otto out!

From: Ahmed Eldawy [mailto:[EMAIL PROTECTED]]
Sent: July-04-13 2:30 PM
To: [EMAIL PROTECTED]
Subject: Which InputFormat to use?

Hi, I'm developing a new set of InputFormats for a project I'm working on. I found that there are two ways to create a new InputFormat:
1- Extend the abstract class org.apache.hadoop.mapreduce.InputFormat
2- Implement the interface org.apache.hadoop.mapred.InputFormat
I don't know why there are two incompatible versions. I found that each one comes with its own set of classes, such as InputSplit, RecordReader, and the MapReduce job itself, and unfortunately the two sets are not compatible with each other. This means I have to choose one of the interfaces and stick with it to the end.
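
To make the incompatibility concrete, a rough sketch against the old interface looks like this (OldApiInputFormat and the LongWritable/Text types are just placeholders):

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.InputFormat;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

// Placeholder sketch: the old API is an interface whose methods take JobConf/Reporter,
// and its InputSplit/RecordReader types live in org.apache.hadoop.mapred, so they are
// not interchangeable with the org.apache.hadoop.mapreduce versions.
public class OldApiInputFormat implements InputFormat<LongWritable, Text> {

  @Override
  public InputSplit[] getSplits(JobConf job, int numSplits) throws IOException {
    // The old API returns an array of splits and receives a split-count hint.
    throw new UnsupportedOperationException("split logic goes here");
  }

  @Override
  public RecordReader<LongWritable, Text> getRecordReader(InputSplit split,
      JobConf job, Reporter reporter) throws IOException {
    // The old API's RecordReader pulls records via next(key, value) rather than nextKeyValue().
    throw new UnsupportedOperationException("record reader goes here");
  }
}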
I basically have two questions:
1- Which of these two interfaces should I go with? Neither of them is marked as deprecated, so they both seem legitimate. Is there any plan to retire one of them?
2- I already have some classes implemented against one of the two APIs. Is it worth refactoring them to use the other interface, in case I used the old one?
Thanks in advance for your help.
Best regards,
Ahmed Eldawy