MapReduce >> mail # user >> Handling files with unclear boundaries


Mohammad Tariq 2012-08-06, 15:30
Manoj Khangaonkar 2012-08-06, 18:18
syed kather 2012-08-06, 18:24
Re: Handling files with unclear boundaries
Thank you, guys.

Syed: thank you for the pointer.

Regards,
    Mohammad Tariq
On Mon, Aug 6, 2012 at 11:54 PM, syed kather <[EMAIL PROTECTED]> wrote:
> Hi Tariq,
>
>    Have a look at this link, which may guide you.
> The same type of issue was discussed previously:
>
> search-hadoop.com/m/ydCoSysmTd1
>
> Syed Abdul kather
> Sent from Samsung S3
>
> On Aug 6, 2012 11:48 PM, "Manoj Khangaonkar" <[EMAIL PROTECTED]> wrote:
>>
>> Hi,
>>
>> I think you might need to extend FileInputFormat (or one of its
>> derived classes) and implement a RecordReader.
>>
>> regards
>>
>> On Mon, Aug 6, 2012 at 8:30 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>> > Hello list,
>> >
>> >      I need some guidance on how to handle files that don't have
>> > any proper delimiters or record boundaries. I am trying to
>> > process a set of files that are totally alien to me (SAS XPT files)
>> > through MR. The one thing that is always fixed is that each time I
>> > have to read 107 bytes from the line. Is it possible to use this
>> > length as a delimiter for creating splits somehow? And if so, which
>> > InputFormat would be appropriate? Many thanks.
>> >
>> > Regards,
>> >     Mohammad Tariq
>>
>>
>>
>> --
>> http://khangaonkar.blogspot.com/
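
For readers following the suggestion above (extend FileInputFormat and implement a RecordReader), here is a rough sketch of the record arithmetic such a reader might need when every record is exactly 107 bytes. It deliberately uses only plain JDK classes so that it compiles on its own. The class and method names (FixedLengthRecords, firstRecordOffset, readSplit, skipFully) are hypothetical and are not part of Hadoop. It assumes the file has no header, so record k starts at byte k * 107; a real SAS XPT file may need extra handling that is not shown here. Later Hadoop releases also ship a FixedLengthInputFormat, so it may be worth checking whether your version already has one before writing your own.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: the split logic a custom fixed-length RecordReader could reuse. */
final class FixedLengthRecords {

    /** Record size taken from the question in this thread. */
    static final int RECORD_LENGTH = 107;

    /** Offset of the first record that starts at or after splitStart. */
    static long firstRecordOffset(long splitStart, int recordLength) {
        long remainder = splitStart % recordLength;
        return remainder == 0 ? splitStart : splitStart + (recordLength - remainder);
    }

    /**
     * Reads every record that STARTS inside [splitStart, splitStart + splitLength).
     * A record that begins in this split but ends in the next one is still read here,
     * and the next split skips ahead to its own first aligned offset, so each
     * complete record is returned by exactly one split.
     * Assumption: 'in' is positioned at byte 0 of the file.
     */
    static List<byte[]> readSplit(InputStream in, long splitStart,
                                  long splitLength, int recordLength) {
        long splitEnd = splitStart + splitLength;
        long pos = firstRecordOffset(splitStart, recordLength);
        List<byte[]> records = new ArrayList<>();
        try {
            DataInputStream data = new DataInputStream(in);
            skipFully(data, pos);
            while (pos < splitEnd) {
                byte[] record = new byte[recordLength];
                data.readFully(record); // throws EOFException on a short read
                records.add(record);
                pos += recordLength;
            }
        } catch (EOFException e) {
            // The file ended before a full record was available; the partial tail is dropped.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    /** Skips n bytes, tolerating InputStream.skip() returning less than requested. */
    static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                if (in.read() < 0) {
                    return; // reached end of stream
                }
                skipped = 1;
            }
            n -= skipped;
        }
    }
}
```

In an actual job, logic like this would presumably sit inside the RecordReader returned by a FileInputFormat subclass, with the split start and length taken from the FileSplit and the stream opened and seeked through the Hadoop FileSystem API instead of being skipped from byte 0.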
rahul p 2012-08-06, 15:45