If the data is straight EBCDIC, you have somewhat splittable data; however, it's really better to do this in a single stream.
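For the straight EBCDIC case, the JDK's own charset support is probably enough. A minimal sketch, assuming the file uses code page 037 (US/Canada EBCDIC; your mainframe may well use a different code page) and that the JDK ships the extended charsets. The class and method names are made up for illustration:

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of Hadoop, JRecord or cb2java.
final class EbcdicText {
    // Assumption: code page 037. Other shops use IBM500, IBM1047, etc.
    private static final Charset EBCDIC = Charset.forName("IBM037");

    // Decodes raw EBCDIC text bytes into a Java String.
    // Only valid for character fields, never for packed or binary fields.
    static String decode(byte[] raw) {
        return new String(raw, EBCDIC);
    }
}
```

Calling `EbcdicText.decode` on the bytes `C8 C5 D3 D3 D6` gives back `HELLO`. Since every EBCDIC character here is one byte, a pure text file can in principle be cut at any byte offset, which is why it is "somewhat" splittable.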
If the data contains zoned or packed decimal fields (COMP-3 is the packed format), you will be unable to split the file into pieces. You will also need to know the fixed-length format of the record.
In my personal experience, most of the data I have seen was COMP-3 records, which required knowing the data structures.
Of course, YMMV.
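To show why the record layout matters: a packed (COMP-3) field stores two decimal digits per byte, with the sign in the last half-byte, and the number of decimal places is not in the data at all. It comes from the copybook. A minimal sketch of unpacking one such field, assuming you already know its offset, length and scale from the record definition. `PackedDecimal` and `unpack` are hypothetical names, not the JRecord or cb2java API:

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper for illustration only.
final class PackedDecimal {
    // field: the raw bytes of one COMP-3 field.
    // scale: implied decimal places from the copybook, e.g. 2 for PIC S9(3)V99.
    static BigDecimal unpack(byte[] field, int scale) {
        if (field.length == 0) {
            throw new IllegalArgumentException("empty field");
        }
        StringBuilder digits = new StringBuilder();
        for (int i = 0; i < field.length; i++) {
            int high = (field[i] >> 4) & 0x0F;
            int low = field[i] & 0x0F;
            appendDigit(digits, high);
            // The low half-byte of the last byte is the sign, not a digit.
            if (i < field.length - 1) {
                appendDigit(digits, low);
            }
        }
        int sign = field[field.length - 1] & 0x0F;
        if (sign < 0x0A) {
            throw new IllegalArgumentException("invalid sign nibble: " + sign);
        }
        BigInteger unscaled = new BigInteger(digits.toString());
        // 0xD (and the rarer 0xB) mean negative; 0xC, 0xF, 0xA, 0xE are non-negative.
        if (sign == 0x0D || sign == 0x0B) {
            unscaled = unscaled.negate();
        }
        return new BigDecimal(unscaled, scale);
    }

    private static void appendDigit(StringBuilder out, int nibble) {
        if (nibble > 9) {
            throw new IllegalArgumentException("invalid digit nibble: " + nibble);
        }
        out.append((char) ('0' + nibble));
    }
}
```

For example, the three bytes `12 34 5C` with a scale of 2 unpack to 123.45. Run those same bytes through an EBCDIC-to-ASCII character conversion and you get garbage, which is why a blind charset conversion of the whole file does not work once packed fields are involved.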
On Feb 8, 2013, at 9:23 PM, Jagat Singh <[EMAIL PROTECTED]> wrote:
> I am thinking of writing a mapper to convert mainframe files to ASCII format and contributing it back.
> Before I do anything, I wanted to confirm the following with you:
> Do we already have a MapReduce library that does the same work?
> Is there anything in Hadoop that makes this kind of conversion impossible, so that I don't end up spending time on something that cannot be done?
> I am not a mainframe guy, so I wanted to ask up front.
> Here is what I have in mind so far:
> The Oracle JDK supports the encodings listed in [1]; I plan to use existing libraries such as [2] or [3] to do the conversion.
> Thank you for your time and guidance.
> Jagat Singh
> 1) http://docs.oracle.com/javase/6/docs/technotes/guides/intl/encoding.doc.html
> 2) http://sourceforge.net/projects/jrecord/
> 3) http://sourceforge.net/projects/cb2java/
Michael Segel | (m) 312.755.9623
Segel and Associates