Thanks, Bejoy.
...Feeling a bit foolish as Tom White's book was 2 feet away....
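
For the archives, the small-file case can be sanity-checked with a bit of arithmetic. The sketch below does not call Hadoop at all. It only mirrors, as far as I understand it, how FileInputFormat sizes splits: split size = max(minSize, min(maxSize, blockSize)), with about 10% slop allowed on the last split. The class and method names are made up for illustration, and the 64 MB block size is an assumption about the cluster.

```java
public class SplitCountSketch {

    // Split size formula: max(minSize, min(maxSize, blockSize)).
    public static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Number of splits for ONE file; files are never merged into a shared split here.
    // Assumes fileLength > 0.
    public static int countSplits(long fileLength, long splitSize) {
        final double splitSlop = 1.1; // last split may be up to 10% larger than splitSize
        int splits = 0;
        long remaining = fileLength;
        while ((double) remaining / splitSize > splitSlop) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits++; // whatever is left becomes the final split
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 64 * mb; // assumed HDFS block size
        long splitSize = computeSplitSize(blockSize, 1L, Long.MAX_VALUE);

        // A 1.5 MB file fits in a single split, so a single mapper sees the whole file.
        System.out.println("1.5 MB file: " + countSplits(3 * mb / 2, splitSize) + " split(s)");
        // A 200 MB file is cut into several splits, which is the case that
        // needs a non-splittable input format.
        System.out.println("200 MB file: " + countSplits(200 * mb, splitSize) + " split(s)");
    }
}
```

With those assumed numbers, each 1.5 MB file ends up as its own single split, so the default setup already gives one whole file per mapper. Only files larger than a block get cut up.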
On 10/08/2012 10:28 AM, Bejoy Ks wrote:
> Hi Terry
>
> If your files are smaller than the HDFS block size and you are using
> the default TextInputFormat with the default split-size properties,
> each mapper will get exactly one whole file.
>
> If your files are larger than an HDFS block, please take a look at the
> sample implementation of 'WholeFileInputFormat' in 'Hadoop: The
> Definitive Guide' by Tom White.
>
> <http://books.google.co.in/books?id=Nff49D7vnJcC&pg=PA206&lpg=PA206&dq=wholefileinputformat&source=bl&ots=IifzWlbwQs&sig=9CDmS45S8pGDOaCYl6xGXnyDFE8&hl=en&sa=X&ei=VeJyUKfEE4rMrQe654G4DA&ved=0CCsQ6AEwAg#v=onepage&q=wholefileinputformat&f=false>
>
>
> On Mon, Oct 8, 2012 at 7:51 PM, Terry Healy <[EMAIL PROTECTED]
> <mailto:[EMAIL PROTECTED]>> wrote:
>
> Hello-
>
> I know that it is contrary to normal Hadoop operation, but how can I
> configure my M/R job to send one complete file to each mapper task? This
> is intended to be used on many files in the 1.5 MB range as the first
> step in a chain of processes.
>
> thanks.
>
>
--
Terry Healy / [EMAIL PROTECTED]
Cyber Security Operations
Brookhaven National Laboratory
Building 515, Upton N.Y. 11973