Subject: Re: What if file format is dependent upon first few lines?
A mapper's record reader implementation is not strictly confined to
the boundaries of its input split. The relationship is a loose one:
you can always seek(0), read the header lines you need to prepare,
then seek(offset) to the start of your split and continue reading.
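
As a rough illustration of that seek(0) / seek(offset) pattern, here is a
minimal sketch in plain Python against an ordinary seekable byte stream. It
is not Hadoop API code: the function name read_split, the header_lines
parameter, and the sample data are all hypothetical, and it assumes
newline-delimited records with a fixed number of header lines. In a real
record reader the same steps would presumably be done on the stream opened
for the split's file (for example via FSDataInputStream.seek).

```python
import io


def read_split(stream, start, length, header_lines=2):
    """Return (header, records) for the byte range [start, start + length).

    Hypothetical helper: the header is always read from offset 0, no
    matter where the split begins.
    """
    # Step 1: seek(0) and read the lines needed to interpret the file.
    stream.seek(0)
    header = [stream.readline().rstrip(b"\n") for _ in range(header_lines)]
    header_end = stream.tell()

    # Step 2: seek to the split start, but never back into the header.
    end = start + length
    pos = max(start, header_end)
    if pos > header_end:
        # We may have landed mid-record. Step back one byte and discard
        # through the next newline so reading begins on a record boundary.
        stream.seek(pos - 1)
        stream.readline()
    else:
        stream.seek(header_end)

    # Step 3: a record belongs to this split if it starts before `end`,
    # even if its last bytes run past the boundary.
    records = []
    while stream.tell() < end:
        line = stream.readline()
        if not line:
            break
        records.append(line.rstrip(b"\n"))
    return header, records


# Made-up sample file: two header lines, then four records.
data = b"#fmt=v1\n#cols=a,b\n1,2\n3,4\n5,6\n7,8\n"
header_a, records_a = read_split(io.BytesIO(data), 0, 24)
header_b, records_b = read_split(io.BytesIO(data), 24, 10)
```

Both calls see the same header even though the second split starts in the
middle of a record, and each record is picked up by exactly one split.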

Apache Avro has a similar format: its file header contains the
schema a reader needs in order to work.

On Thu, Feb 27, 2014 at 1:59 AM, Fengyun RAO <[EMAIL PROTECTED]> wrote:

Harsh J
