Hadoop >> mail # user >> Which FileInputFormat to use for fixed length records?


Re: Which FileInputFormat to use for fixed length records?
I think these would be good to add to mapreduce in the
{{org.apache.hadoop.mapreduce.lib.input}} package. Please file a JIRA and
apply a patch!
- Aaron

On Wed, Oct 28, 2009 at 11:15 AM, yz5od2 <[EMAIL PROTECTED]> wrote:

> Hi all,
> I am working on writing a FixedLengthInputFormat class and a corresponding
> FixedLengthRecordReader.
>
> Would the Hadoop commons project have interest in these? Basically these
> are for reading inputs of textual record data, where each record is a fixed
> length (no carriage returns, separators, etc.).
>
> thanks
>
>
>
> On Oct 20, 2009, at 11:00 PM, Aaron Kimball wrote:
>
>> You'll need to write your own, I'm afraid. You should subclass
>> FileInputFormat and go from there. You may want to look at
>> TextInputFormat / LineRecordReader for an example of how an IF/RR gets
>> put together, but there isn't an existing fixed-len record reader.
>>
>> - Aaron
>>
>> On Tue, Oct 20, 2009 at 12:59 PM, yz5od2 <[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>> I have input files that contain NO carriage returns/line feeds. Each
>>> record is a fixed length (i.e. 202 bytes).
>>>
>>> Which FileInputFormat should I be using, so that each call to my Mapper
>>> receives one K,V pair, where the KEY is null or something (I don't care)
>>> and the VALUE is the 202 byte record?
>>>
>>> thanks!
>>>
>>>
>
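
For readers following this thread, here is a minimal sketch of the logic a fixed-length record reader needs, written in plain Java with no Hadoop dependencies so it can be run on its own. It is not the code discussed in the thread. The class FixedLengthRecords and its methods firstRecordStart, readSplit and skipFully are hypothetical names chosen for illustration. The sketch assumes records sit back to back from byte 0 of the file, that a reader handles every record that starts inside its split, and that a trailing partial record is simply dropped. In a real job this loop would presumably live in the RecordReader returned by a FileInputFormat subclass, as Aaron suggests, with the split boundaries taken from the input split.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a Hadoop class): reads fixed-length records from a byte range. */
class FixedLengthRecords {

    /** Offset of the first record boundary at or after splitStart. */
    static long firstRecordStart(long splitStart, int recordLength) {
        long remainder = splitStart % recordLength;
        return remainder == 0 ? splitStart : splitStart + (recordLength - remainder);
    }

    /**
     * Returns every record that starts inside [splitStart, splitEnd).
     * A record that starts before splitEnd is read in full even if it runs
     * past splitEnd, so adjacent splits neither drop nor duplicate records.
     * The stream must be positioned at byte 0 of the file.
     */
    static List<byte[]> readSplit(InputStream in, long splitStart, long splitEnd,
                                  int recordLength) throws IOException {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("recordLength must be positive");
        }
        DataInputStream data = new DataInputStream(in);
        long pos = firstRecordStart(splitStart, recordLength);
        skipFully(data, pos);
        List<byte[]> records = new ArrayList<>();
        while (pos < splitEnd) {
            byte[] record = new byte[recordLength];
            try {
                data.readFully(record);
            } catch (EOFException e) {
                break; // trailing partial record: dropped in this sketch
            }
            records.add(record);
            pos += recordLength;
        }
        return records;
    }

    /** Skips n bytes, or stops early at end of stream (InputStream.skip may skip fewer). */
    static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                if (in.read() < 0) {
                    return; // end of stream
                }
                skipped = 1;
            }
            n -= skipped;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] file = "AAAABBBBCCCCDD".getBytes(StandardCharsets.US_ASCII);
        // Two splits that cut the file at byte 6, in the middle of the second record.
        for (long[] split : new long[][] {{0, 6}, {6, 14}}) {
            List<String> texts = new ArrayList<>();
            for (byte[] r : readSplit(new ByteArrayInputStream(file), split[0], split[1], 4)) {
                texts.add(new String(r, StandardCharsets.US_ASCII));
            }
            System.out.println(texts); // prints [AAAA, BBBB] then [CCCC]
        }
    }
}
```

The example uses 4-byte records to keep the output short; for the 202-byte records in the original question only the recordLength argument changes. Whether a trailing partial record should be dropped or treated as an error is a design choice this sketch does not settle.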