A mapper's record reader need not confine itself strictly to the
boundaries of its input split. The relationship is a loose one:
you can always seek(0), read the lines you need to prepare, then
seek(offset) and continue reading.
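
To make the pattern concrete, here is a minimal sketch in plain Python against an ordinary seekable byte stream. It is not the Hadoop RecordReader API; the function name read_split, its parameters, and the one-line header are all assumptions made for illustration. It also assumes one convention for lines that straddle a split boundary: a line belongs to the split in which it starts.

```python
import io


def read_split(stream, start, length, header_lines=1):
    """Return (header, records) for the byte range [start, start + length).

    Hypothetical helper: the header is always read from the top of the
    file, no matter where the split begins.
    """
    # Step 1: seek(0) and read the lines needed to prepare.
    stream.seek(0)
    header = [stream.readline() for _ in range(header_lines)]
    header_end = stream.tell()

    # Step 2: seek to the split offset, but never start inside the header.
    pos = max(start, header_end)
    if pos > header_end:
        # We may have landed mid-line. If the previous byte is not a
        # newline, skip the partial line; the reader of the previous
        # split is responsible for it.
        stream.seek(pos - 1)
        if stream.read(1) != b"\n":
            stream.readline()
    else:
        stream.seek(pos)

    # Step 3: read every line that starts before the end of the split.
    end = start + length
    records = []
    while stream.tell() < end:
        line = stream.readline()
        if not line:  # end of file
            break
        records.append(line)
    return header, records


if __name__ == "__main__":
    data = b"id,name\n1,a\n2,b\n3,c\n"
    # Two 10-byte splits over a 20-byte file; both see the same header.
    for offset in (0, 10):
        print(read_split(io.BytesIO(data), offset, 10))
```

In a real job the same three steps would sit in the record reader's initialization and next-record logic, with the stream opened from the split's file path.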
Apache Avro (http://avro.apache.org) has a similar format: its file
header contains the schema a reader needs to work.
On Thu, Feb 27, 2014 at 1:59 AM, Fengyun RAO <[EMAIL PROTECTED]> wrote: