You don't have to convert the data in order to copy it into HDFS; HDFS
just stores the bytes as they are. But you will have to think about how
your MR jobs process these files, because they will still be in the
mainframe format (EBCDIC encoding, mainframe record layouts).
You could probably make use of Sqoop <http://sqoop.apache.org/>.
I also came across DMX-H a few days ago while browsing. I don't know
anything about its licensing or how good it is; I just thought I'd share
it with you. You can visit their page
<http://www.syncsort.com/en/Data-Integration/Home> to see more.
They also provide a VM (which includes CDH) to get started quickly.
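
To make the format point a bit more concrete: if the files land in HDFS
still in EBCDIC, whatever reads them has to split the records and decode
them itself. Below is a rough sketch of that decoding step in Python. The
code page (cp037), the 80-byte record length and the function names are
all assumptions of mine, not something from your datasets, and it only
handles plain text fields. Packed-decimal (COMP-3) or binary fields would
need a copybook-driven parser instead.

```python
# Rough sketch: split and decode fixed-length EBCDIC text records.
# Assumptions (hypothetical, adjust to the real dataset):
#   - code page cp037 (could just as well be cp500, cp1140, ...)
#   - fixed 80-byte records, text fields only (no COMP-3 / binary fields)

RECORD_LENGTH = 80     # assumed LRECL
CODE_PAGE = "cp037"    # assumed EBCDIC code page


def split_records(raw, record_length=RECORD_LENGTH):
    """Yield fixed-length records from a bytes object.

    A trailing partial record, if any, is silently dropped.
    """
    for start in range(0, len(raw) - record_length + 1, record_length):
        yield raw[start:start + record_length]


def decode_record(record, code_page=CODE_PAGE):
    """Decode one EBCDIC record to str and strip trailing padding blanks."""
    return record.decode(code_page).rstrip()


if __name__ == "__main__":
    # Build two fake EBCDIC records so the sketch runs without real data.
    sample = "HELLO HDFS".ljust(RECORD_LENGTH).encode(CODE_PAGE) * 2
    for rec in split_records(sample):
        print(decode_record(rec))
```

The same logic could live inside the map side of a job, or in a one-off
conversion step before the analysis; which is better depends on whether
you want to keep the original bytes in HDFS.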
On Tue, Jul 23, 2013 at 11:54 AM, Sandeep Nemuri <[EMAIL PROTECTED]> wrote:
> Hi,
> "How to copy datasets from Mainframe to HDFS directly? I know that we can
> NDM files to Linux box and then we can use simple put command to copy data
> to HDFS. But, how to copy data directly from mainframe to HDFS? I have
> PS, PDS and VSAM datasets to copy to HDFS for analysis using MapReduce.
> Also, Do we need to convert data from EBCDIC to ASCII before copy? "
> Sandeep Nemuri