On 28 August 2012 09:24, Siddharth Tiwari <[EMAIL PROTECTED]> wrote:
> Hi Users,
> We have flat files on mainframes with around a billion records. We need to
> sort them and then use them with different jobs on the mainframe for report
> generation. I was wondering whether there is any way I could integrate the
> mainframe with Hadoop, do the sorting there, and keep the file on the server
> itself (I do not want to FTP the file to a Hadoop cluster and then FTP the
> sorted file back to the mainframe, as that would waste MIPS and nullify the
> advantage). This way I could save on MIPS and ultimately improve
> profitability.
Can you NFS-mount the mainframe filesystem from the Hadoop cluster?
Otherwise, do you or your mainframe vendor have a custom Hadoop filesystem
binding for the mainframe?
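If NFS is an option, the setup would look something like this sketch; the
host name, export path, and mount point below are all placeholders, not
anything specific to your site:

```shell
# Mount the mainframe's NFS export on each Hadoop worker node
# (host, export path, and mount point are hypothetical placeholders).
sudo mount -t nfs mainframe.example.com:/exports/flatfiles /mnt/mainframe

# Hadoop can then read the data in place via file:// URLs, e.g.
hadoop fs -ls file:///mnt/mainframe
```

Whether the mainframe side can export its datasets over NFS in a byte-stream
form Hadoop can consume is something your mainframe vendor would have to
confirm.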
If not, you should be able to use ftp:// URLs as the source of data for the
initial MR job; at the end of the sequence of MR jobs, the result can be
written back to the mainframe the same way.
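As a rough sketch of the ftp:// approach, using the stock sort job from the
Hadoop examples jar -- the host, credentials, and paths here are placeholders,
and the input-format flags assume the data is line-oriented text:

```shell
# Read the unsorted file straight off the mainframe's FTP server,
# sort with MapReduce, and write the result back over FTP.
# Host, credentials, and paths are hypothetical placeholders.
hadoop jar $HADOOP_HOME/hadoop-examples.jar sort \
  -inFormat org.apache.hadoop.mapred.KeyValueTextInputFormat \
  -outFormat org.apache.hadoop.mapred.TextOutputFormat \
  ftp://user:password@mainframe.example.com/data/unsorted \
  ftp://user:password@mainframe.example.com/data/sorted
```

Note that the FTP transfer still happens, but it is driven from the Hadoop
side rather than as an explicit staging step, and the sorting work itself
burns cluster CPU instead of MIPS.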