If your files are smaller than the HDFS block size and you are using the
default TextInputFormat with the default split-size properties, each mapper
will already receive exactly one file.
If your files are larger than an HDFS block, take a look at the sample
implementation of 'WholeFileInputFormat' in 'Hadoop: The Definitive Guide'
by Tom White.
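The key idea behind that input format is to override isSplitable() to return
false, so each file becomes exactly one split (and hence one map task), and to
supply a RecordReader that emits the whole file as a single record. A rough
sketch along those lines follows (new MapReduce API; class and field names here
are illustrative, not necessarily the book's exact code):

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class WholeFileInputFormat
    extends FileInputFormat<NullWritable, BytesWritable> {

  // Returning false guarantees one split -- and hence one mapper -- per file.
  @Override
  protected boolean isSplitable(JobContext context, Path file) {
    return false;
  }

  @Override
  public RecordReader<NullWritable, BytesWritable> createRecordReader(
      InputSplit split, TaskAttemptContext context)
      throws IOException, InterruptedException {
    return new WholeFileRecordReader();
  }
}

// Delivers the entire file contents as a single key/value record.
class WholeFileRecordReader
    extends RecordReader<NullWritable, BytesWritable> {

  private FileSplit fileSplit;
  private Configuration conf;
  private final BytesWritable value = new BytesWritable();
  private boolean processed = false;

  @Override
  public void initialize(InputSplit split, TaskAttemptContext context) {
    this.fileSplit = (FileSplit) split;
    this.conf = context.getConfiguration();
  }

  @Override
  public boolean nextKeyValue() throws IOException {
    if (processed) {
      return false;
    }
    // Read the whole file into memory -- fine for ~1.5 MB files,
    // but be careful with anything much larger.
    byte[] contents = new byte[(int) fileSplit.getLength()];
    Path file = fileSplit.getPath();
    FileSystem fs = file.getFileSystem(conf);
    FSDataInputStream in = null;
    try {
      in = fs.open(file);
      IOUtils.readFully(in, contents, 0, contents.length);
      value.set(contents, 0, contents.length);
    } finally {
      IOUtils.closeStream(in);
    }
    processed = true;
    return true;
  }

  @Override
  public NullWritable getCurrentKey() { return NullWritable.get(); }

  @Override
  public BytesWritable getCurrentValue() { return value; }

  @Override
  public float getProgress() { return processed ? 1.0f : 0.0f; }

  @Override
  public void close() { /* nothing to close */ }
}
```

Wire it in with job.setInputFormatClass(WholeFileInputFormat.class); your
mapper then sees one call per file, with the file bytes in the value.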
On Mon, Oct 8, 2012 at 7:51 PM, Terry Healy <[EMAIL PROTECTED]> wrote:
> I know that it is contrary to normal Hadoop operation, but how can I
> configure my M/R job to send one complete file to each mapper task? This
> is intended to be used on many files in the 1.5 MB range as the first
> step in a chain of processes.