Yes, this is possible, and it already happens in a regular MR job anyway whenever the input is split across several locations, since every split after the first begins at a non-zero offset. You'll need a custom InputFormat#getSplits implementation to do this: create the input splits with the first split's start offset set to the known offset location instead of 0.
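As a rough illustration of the offset arithmetic such a getSplits override would perform, here is a minimal sketch in plain Java with no Hadoop dependency. The class and method names (OffsetSplits, computeSplits) and the parameters are hypothetical and not part of Hadoop. In a real job each (start, length) pair would presumably be wrapped in a FileSplit, and the record reader would still have to cope with a start offset that lands in the middle of a record.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not a Hadoop class. */
class OffsetSplits {

    /**
     * Returns {start, length} pairs covering [startOffset, fileLength),
     * each at most splitSize bytes long.
     */
    static List<long[]> computeSplits(long startOffset, long fileLength, long splitSize) {
        if (startOffset < 0 || splitSize <= 0) {
            throw new IllegalArgumentException("startOffset must be >= 0 and splitSize > 0");
        }
        List<long[]> splits = new ArrayList<>();
        // The first split begins at the known offset instead of byte 0.
        for (long start = startOffset; start < fileLength; start += splitSize) {
            // The last split may be shorter than splitSize.
            long length = Math.min(splitSize, fileLength - start);
            splits.add(new long[] {start, length});
        }
        return splits;
    }

    public static void main(String[] args) {
        // Assumed example values: skip the first 100 bytes of a 1000-byte file.
        for (long[] s : computeSplits(100L, 1000L, 400L)) {
            System.out.println("start=" + s[0] + " length=" + s[1]);
        }
    }
}
```

With this approach the bytes before the offset are never handed to any mapper, which is the main difference from filtering on the key inside the mapper.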
On Mon, Sep 10, 2012 at 5:01 PM, Anit Alexander <[EMAIL PROTECTED]> wrote:
> Hello list,
>
> Is it possible to start the mapper from a particular byte
> location in a file which is in hdfs?
>
> Regards,
> Anit
When the input to the mapper is a (key, value) pair, as with the default TextInputFormat, the key is the byte offset of the record within the file. So maybe you can check in the mapper whether that byte offset meets your criterion, and do the mapper's work only if it does.
With Regards,
Abhishek S

On Mon, Sep 10, 2012 at 5:04 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
> Maybe.
> It depends on what you're trying to do.
>
> On Sep 10, 2012, at 6:31 AM, Anit Alexander <[EMAIL PROTECTED]> wrote:
>
> > Hello list,
> >
> > Is it possible to start the mapper from a particular byte
> > location in a file which is in hdfs?
> >
> > Regards,
> > Anit
Abhishek Shivkumar 2012-09-10, 11:37
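The offset check suggested in the reply above can be pictured with a small plain-Java sketch (no Hadoop dependency) that mimics how line-start byte offsets act as keys and how records before a threshold are skipped. The names OffsetFilter, linesFrom, and minOffset are hypothetical, and the code assumes '\n'-terminated UTF-8 lines. Note that with this approach every record before the offset is still read and then discarded.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a mapper that filters on the byte-offset key. */
class OffsetFilter {

    /** Returns the lines whose starting byte offset is >= minOffset. */
    static List<String> linesFrom(byte[] data, long minOffset) {
        List<String> kept = new ArrayList<>();
        int lineStart = 0; // byte offset of the current line, i.e. the "key"
        for (int i = 0; i <= data.length; i++) {
            boolean atEnd = (i == data.length);
            if (atEnd || data[i] == '\n') {
                // Ignore the empty remainder after a trailing newline.
                boolean hasRecord = !atEnd || i > lineStart;
                if (hasRecord && lineStart >= minOffset) {
                    kept.add(new String(data, lineStart, i - lineStart, StandardCharsets.UTF_8));
                }
                lineStart = i + 1;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        byte[] data = "aaa\nbbb\nccc\n".getBytes(StandardCharsets.UTF_8);
        // Lines start at offsets 0, 4 and 8; only those at or past 4 are kept.
        System.out.println(linesFrom(data, 4));
    }
}
```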