Re: Handling files with unclear boundaries
Mohammad Tariq 2012-08-06, 19:22
Thank you, guys.
Syed: thank you for the pointer.

Regards,
Mohammad Tariq

On Mon, Aug 6, 2012 at 11:54 PM, syed kather <[EMAIL PROTECTED]> wrote:
> Hi Tariq,
>
> Have a look at this link, which may guide you.
> There was a previous discussion of the same type of issue:
>
> search-hadoop.com/m/ydCoSysmTd1
>
> Syed Abdul kather
> sent from Samsung S3
>
> On Aug 6, 2012 11:48 PM, "Manoj Khangaonkar" <[EMAIL PROTECTED]> wrote:
>>
>> Hi,
>>
>> I think you might need to extend FileInputFormat (or one of its
>> derived classes) as well as implement a RecordReader.
>>
>> regards
>>
>> On Mon, Aug 6, 2012 at 8:30 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>> > Hello list,
>> >
>> > I need some guidance on how to handle files that have no
>> > proper delimiters or record boundaries. I am trying to
>> > process a set of files that are totally alien to me (SAS XPT files)
>> > through MR. The one thing that is always fixed is that each time I
>> > have to read 107 bytes from the line. Is it possible to use this
>> > length as a delimiter for creating splits somehow? And if so, which
>> > InputFormat would be appropriate? Many thanks.
>> >
>> > Regards,
>> > Mohammad Tariq
>>
>>
>>
>> --
>> http://khangaonkar.blogspot.com/
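
For anyone following Manoj's suggestion, the core of a custom FileInputFormat and RecordReader for fixed-length records is two pieces of arithmetic: making every split start and end on a multiple of the record length, and reading exactly that many bytes per record. The sketch below shows only that logic in plain Java, with no Hadoop classes, so it can be run on its own. The class name FixedLengthRecords and the methods alignedSplits and readRecords are hypothetical names chosen for illustration, and the 107-byte length is taken from the original question. It is a minimal sketch under those assumptions, not a drop-in InputFormat.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class: illustrates the logic only, not a Hadoop API.
class FixedLengthRecords {
    // Record length taken from the question in this thread.
    static final int RECORD_LENGTH = 107;

    /**
     * Returns byte ranges [start, end) that cover the file and whose
     * boundaries all fall on multiples of recordLength (the last range
     * simply ends at the end of the file).
     */
    public static List<long[]> alignedSplits(long fileLength, long targetSplitSize, int recordLength) {
        // Round the target size down to a whole number of records, at least one.
        long recordsPerSplit = Math.max(1, targetSplitSize / recordLength);
        long splitSize = recordsPerSplit * recordLength;
        List<long[]> splits = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitSize) {
            splits.add(new long[] {start, Math.min(start + splitSize, fileLength)});
        }
        return splits;
    }

    /**
     * Reads every complete record from the stream. A trailing partial
     * record is treated as an error rather than silently dropped.
     */
    public static List<byte[]> readRecords(InputStream in, int recordLength) throws IOException {
        DataInputStream data = new DataInputStream(in);
        List<byte[]> records = new ArrayList<>();
        while (true) {
            int first = data.read();
            if (first < 0) {
                return records; // clean end of input on a record boundary
            }
            byte[] record = new byte[recordLength];
            record[0] = (byte) first;
            try {
                // readFully blocks until the rest of the record has been read.
                data.readFully(record, 1, recordLength - 1);
            } catch (EOFException e) {
                throw new IOException("Input ended inside a record", e);
            }
            records.add(record);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] file = new byte[3 * RECORD_LENGTH]; // three dummy records
        System.out.println("records: " + readRecords(new ByteArrayInputStream(file), RECORD_LENGTH).size());
        for (long[] split : alignedSplits(file.length, 250, RECORD_LENGTH)) {
            System.out.println("split: " + split[0] + " to " + split[1]);
        }
    }
}
```

In a real job, the split arithmetic would presumably live in the InputFormat's split computation and the read loop in the RecordReader, with each reader seeking to its split's start offset before reading. Later Hadoop releases also include a FixedLengthInputFormat in org.apache.hadoop.mapreduce.lib.input, which may make a custom class unnecessary if your version has it.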