Re: Big split file to Partitioner
It is a good idea to ask what a split means here. Typically a split is a
chunk of the input file that is read one record per line, but I have written
input formats that return the entire file as a single record for a small
file, say an XML document.

Combiners are special: they merge the map outputs produced from the splits
before those outputs reach the reducers. They are used only occasionally,
when the output is combinable, which usually means summable.
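
As a rough illustration of where the partitioner enters in the question quoted below, here is a minimal plain-Java sketch with no Hadoop classes involved. The class name PartitionSketch and its methods combine and shuffle are hypothetical names made up for this example. Only the arithmetic in partitionFor mirrors what Hadoop's default HashPartitioner does; everything else is a simplification that assumes a single map task and word-count style output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: plain Java standing in for the map, combine,
// and partition steps. None of these names are Hadoop APIs.
class PartitionSketch {

    // Same arithmetic as Hadoop's default HashPartitioner: clear the sign
    // bit of the key's hash, then take the remainder by the reducer count.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Combiner-style step: sum the counts for each word locally, which
    // works because word counts are summable.
    static Map<String, Integer> combine(List<String> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Route each (word, count) pair to the reducer chosen by its key.
    static List<Map<String, Integer>> shuffle(Map<String, Integer> combined,
                                              int numReduceTasks) {
        List<Map<String, Integer>> reducers = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            reducers.add(new TreeMap<>());
        }
        for (Map.Entry<String, Integer> e : combined.entrySet()) {
            int target = partitionFor(e.getKey(), numReduceTasks);
            reducers.get(target).merge(e.getKey(), e.getValue(), Integer::sum);
        }
        return reducers;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "b", "a");
        List<Map<String, Integer>> reducers = shuffle(combine(words), 2);
        for (int i = 0; i < reducers.size(); i++) {
            System.out.println("reducer " + i + " gets " + reducers.get(i));
        }
    }
}
```

The point of the sketch is that the two output files are divided by key, not by position in the input file: every occurrence of a given word lands at the same reducer, whichever split it came from.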

On Sat, Aug 21, 2010 at 10:29 AM, Pedro Costa <[EMAIL PROTECTED]> wrote:

> Hi,
> 1 - I'm running the wordcount example with one input file with size
> of 50Mb and with 2 reduces defined. At the end of the execution of
> wordcount, each of the 2 reduces deals with one part of the input file. For
> example, one reduce gets one part of the split file, and the other
> reduce gets the other part, producing two output files. Why is the split
> file content divided? Is this where the partitioner
> concept enters?
> 2 - Does it also mean that 2 map outputs are produced, where each map
> output is sent to one reducer?
> Thanks
> --

Steven M. Lewis PhD
Institute for Systems Biology
Seattle WA