Re: Handling files with unclear boundaries
Mohammad Tariq 2012-08-06, 19:22
Syed: thank you for the pointer.
On Mon, Aug 6, 2012 at 11:54 PM, syed kather <[EMAIL PROTECTED]> wrote:
> Hi Tariq,
> Have a look at this link, which can guide you.
> There was a previous discussion about the same type of issue.
> Syed Abdul kather
> On Aug 6, 2012 11:48 PM, "Manoj Khangaonkar" <[EMAIL PROTECTED]> wrote:
>> I think you might need to extend FileInputFormat (or one of its
>> derived classes) as well as
>> implement a RecordReader.
>> On Mon, Aug 6, 2012 at 8:30 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>> > Hello list,
>> > I need some guidance on how to handle files that have no
>> > proper delimiters or record boundaries. I am trying to
>> > process a set of files that are totally alien to me (SAS XPT files)
>> > through MR. The one thing that is always fixed is that each time I
>> > have to read 107 bytes from the line. Is it possible to use this
>> > length as a delimiter for creating splits somehow? And if so, which
>> > InputFormat would be appropriate? Many thanks.
>> > Regards,
>> > Mohammad Tariq
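
For anyone landing on this thread later: the suggestion above (extend FileInputFormat and implement a RecordReader) comes down to one rule for fixed-length data. Each split reads only the records that start inside its byte range, after rounding its start offset up to a multiple of the record length. Below is a minimal sketch of that rule in plain Java, without the Hadoop classes, so it can be run on its own. The class and method names (FixedLengthRecords, alignUp, readSplit, skipFully) are made up for illustration and are not Hadoop API. It also assumes the file is nothing but back-to-back 107-byte records starting at offset 0, which may not hold for real SAS XPT files if they carry headers.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: shows the boundary logic only.
final class FixedLengthRecords {
    // Record length taken from the question in this thread.
    static final int RECORD_LENGTH = 107;

    /** Rounds a byte offset up to the next multiple of recordLength. */
    static long alignUp(long offset, int recordLength) {
        long remainder = offset % recordLength;
        return remainder == 0 ? offset : offset + (recordLength - remainder);
    }

    /** Skips n bytes, or stops early if the stream ends first. */
    static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                if (in.read() < 0) {
                    return; // end of stream reached
                }
                skipped = 1; // the single byte consumed by read()
            }
            n -= skipped;
        }
    }

    /**
     * Reads every record that STARTS inside [splitStart, splitEnd).
     * A record that straddles splitEnd is read by this split; the next
     * split skips it because its aligned start lies past that record.
     * The stream is assumed to be positioned at offset 0 of the file.
     */
    static List<byte[]> readSplit(InputStream in, long splitStart,
                                  long splitEnd, int recordLength)
            throws IOException {
        List<byte[]> records = new ArrayList<>();
        long pos = alignUp(splitStart, recordLength);
        skipFully(in, pos);
        DataInputStream data = new DataInputStream(in);
        while (pos < splitEnd) {
            byte[] record = new byte[recordLength];
            try {
                data.readFully(record);
            } catch (EOFException e) {
                break; // trailing partial record is ignored in this sketch
            }
            records.add(record);
            pos += recordLength;
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Three synthetic records; every byte holds its record index.
        byte[] file = new byte[3 * RECORD_LENGTH];
        for (int i = 0; i < file.length; i++) {
            file[i] = (byte) (i / RECORD_LENGTH);
        }
        // Cut the "file" at byte 150, which is in the middle of record 1.
        List<byte[]> first = readSplit(
                new ByteArrayInputStream(file), 0, 150, RECORD_LENGTH);
        List<byte[]> second = readSplit(
                new ByteArrayInputStream(file), 150, file.length, RECORD_LENGTH);
        System.out.println("first split records:  " + first.size());
        System.out.println("second split records: " + second.size());
    }
}
```

In this example the first split returns records 0 and 1 and the second returns record 2, so no record is lost or read twice even though the cut at byte 150 is not on a record boundary. In an actual MapReduce job the same loop would presumably live in the nextKeyValue() method of a custom RecordReader, with a seek on the opened file in place of skipFully.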