Re: split locations
On Fri, Jan 14, 2011 at 3:09 AM, Pedro Costa <[EMAIL PROTECTED]> wrote:

> Hi,
> If a split location contains more than one location, it means that
> this split file is replicated through all locations, or it means that
> a split is divided into several blocks, and each block is in one
> location?
It requests that the map run on one of those machines, or on the same rack
as one of those machines. Currently there is no way to indicate that one
machine in the list is "better" than another. If an input split covers
multiple blocks, the InputFormat is best served by picking the top N machines
that are close to a copy of most of the data, where N is roughly 3 to 5.

-- Owen
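
A rough sketch of the "top N machines" idea above, for a split that spans several blocks: sum the bytes each host stores locally and keep the N hosts with the largest totals. This is only an illustration under stated assumptions, not Hadoop's own code. The class SplitHosts, the Block type, and the topHosts method are hypothetical names, and the sketch ranks by host only (it does not take racks into account). The returned array is the kind of list an InputSplit would report from getLocations().

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop.
public class SplitHosts {

    /** One block of the split: its length in bytes and the hosts holding a replica. */
    public static final class Block {
        final long length;
        final String[] hosts;

        public Block(long length, String... hosts) {
            this.length = length;
            this.hosts = hosts;
        }
    }

    /** Returns up to n hosts, ordered by how many bytes of the split they hold locally. */
    public static String[] topHosts(List<Block> blocks, int n) {
        // Sum the bytes of the split that each host stores.
        Map<String, Long> bytesPerHost = new HashMap<>();
        for (Block b : blocks) {
            for (String host : b.hosts) {
                bytesPerHost.merge(host, b.length, Long::sum);
            }
        }

        // Rank hosts by local bytes, largest first; break ties by host name
        // so the result is deterministic.
        List<Map.Entry<String, Long>> ranked = new ArrayList<>(bytesPerHost.entrySet());
        ranked.sort((x, y) -> {
            int byBytes = Long.compare(y.getValue(), x.getValue());
            return byBytes != 0 ? byBytes : x.getKey().compareTo(y.getKey());
        });

        int k = Math.min(n, ranked.size());
        String[] result = new String[k];
        for (int i = 0; i < k; i++) {
            result[i] = ranked.get(i).getKey();
        }
        return result;
    }
}
```

With three blocks of 64, 64, and 32 bytes replicated on hosts {a, b, c}, {a, b, d}, and {a, d, e}, the per-host totals are a=160, b=128, d=96, c=64, e=32, so asking for N=3 yields a, b, d.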