Re: Big split file to Partitioner
It is a good idea to ask what a split means here. Typically a split
is one per line, but I have written splits
that return the entire file for a small file, say an XML document.

Combiners are special and add up the outputs of splits. They are used only
occasionally, and only when the output is combinable, usually summable.
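
To make "summable" concrete, here is a minimal plain-Java sketch of what a
word-count combiner does: it pre-sums the (word, 1) pairs emitted by one map
task before anything is sent to the reducers. It does not use the Hadoop API;
the class and method names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: this is not Hadoop's combiner API.
class CombinerSketch {
    // Sums the counts for each word emitted by a single map task.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> mapOutput) {
        Map<String, Integer> summed = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> pair : mapOutput) {
            // Add this pair's count to the running total for its word.
            summed.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return summed;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapOutput = List.of(
            Map.entry("the", 1), Map.entry("cat", 1), Map.entry("the", 1));
        System.out.println(combine(mapOutput)); // prints {the=2, cat=1}
    }
}
```

This only works because addition can be applied to partial results in any
grouping; an operation without that property (an exact median, for example)
could not be pre-combined this way.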

On Sat, Aug 21, 2010 at 10:29 AM, Pedro Costa <[EMAIL PROTECTED]> wrote:

> Hi,
>
> 1 - I'm running the wordcount example with one 50 MB input file and
> 2 reduces defined. At the end of the execution, each of the 2 reduces
> has dealt with one part of the input file: one reduce gets one part of
> the split file and the other reduce gets the other part, producing two
> output files. Why is the split file content divided? Is this where the
> partitioner concept comes in?
>
> 2 - Does it also mean that 2 map outputs are produced, with each map
> output sent to one reducer?
>
> Thanks
> --
> PSC
>
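
On the partitioner question above, a minimal plain-Java sketch of how map
output keys get assigned to reducers may help. It mirrors the formula used by
Hadoop's default HashPartitioner, (key.hashCode() & Integer.MAX_VALUE) %
numReduceTasks. The class and method names are hypothetical, and a real job
hashes its Text keys rather than Java Strings, so the actual assignments in a
wordcount run will differ from the ones shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: plain Java, no Hadoop dependency.
class PartitionSketch {
    // Same formula as Hadoop's default HashPartitioner; the mask keeps the result non-negative.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Groups keys into one bucket per reducer, preserving input order within a bucket.
    static List<List<String>> assign(List<String> keys, int numReduceTasks) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String key : keys) {
            buckets.get(partitionFor(key, numReduceTasks)).add(key);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<List<String>> buckets = assign(List.of("a", "b", "c", "a"), 2);
        System.out.println("reducer 0: " + buckets.get(0)); // prints reducer 0: [b]
        System.out.println("reducer 1: " + buckets.get(1)); // prints reducer 1: [a, c, a]
    }
}
```

Every occurrence of a given key lands on the same reducer, so with 2 reduces
each output file holds a disjoint set of words. The number of map outputs
follows from the number of input splits, not the number of reducers; each map
output is itself partitioned into one piece per reducer.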

--
Steven M. Lewis PhD
Institute for Systems Biology
Seattle WA