InputSplit.getLength() and RecordReader.getProgress() are important for the
MR framework to be able to show progress, etc. It would be good to return
raw data sizes from getLength(), computed from the total size of the
region's store files, and to calculate progress from the amount of raw data
the scanner has seen.
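
To make the idea concrete, here is a rough sketch of the arithmetic only. It
is not wired into HBase or Hadoop, and the class and method names
(SplitProgressSketch, splitLength, progress) are made up for illustration.
It assumes the store file sizes and the scanner's raw byte count are already
available from somewhere. The clamp is there because the scanner might
report more bytes than the store files account for (for example, data still
in the memstore).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not an HBase or Hadoop class: it only shows the arithmetic.
final class SplitProgressSketch {

    // What a split could return from getLength(): the sum of the
    // region's store file sizes, in bytes.
    static long splitLength(List<Long> storeFileSizes) {
        long total = 0L;
        for (long size : storeFileSizes) {
            total += size;
        }
        return total;
    }

    // What a reader could return from getProgress(): the fraction of raw
    // bytes the scanner has seen so far, clamped to the range [0, 1].
    static float progress(long rawBytesSeen, long splitLength) {
        if (splitLength <= 0L) {
            return 0.0f; // unknown or empty split: report no progress
        }
        float fraction = (float) rawBytesSeen / (float) splitLength;
        return Math.max(0.0f, Math.min(1.0f, fraction));
    }

    public static void main(String[] args) {
        // Assumed sizes for three store files, in bytes.
        long length = splitLength(Arrays.asList(64L, 128L, 64L));
        System.out.println("length = " + length);
        System.out.println("progress = " + progress(64L, length));
    }
}
```

A real implementation would still need a way to get those two numbers out of
the region and the scanner, which this sketch leaves open.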
On Fri, Jan 24, 2014 at 10:57 AM, Nick Dimiduk <[EMAIL PROTECTED]> wrote: