Re: split locations
On Fri, Jan 14, 2011 at 3:09 AM, Pedro Costa <[EMAIL PROTECTED]> wrote:

> Hi,
>
> If a split location contains more than one location, does it mean that
> the split file is replicated across all of those locations, or that
> the split is divided into several blocks, with each block in one
> location?
It requests that the map run on one of those machines or on the same rack
as one of those machines. Currently there is no way to indicate that one
machine in the list is "better" than another. If an input split covers
multiple blocks, the InputFormat is best served by picking the top N machines
that are close to a copy of most of the data, where N is roughly 3 to 5.

-- Owen
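
The "top N machines" advice above could be sketched roughly as follows. This is only an illustration, not code from Hadoop: the SplitHosts class, its Block type, and the topHosts method are hypothetical names, and it assumes that "close to most of the data" is approximated by summing, per host, the bytes of every block that has a replica there. It ignores rack locality entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SplitHosts {
    /** Hypothetical stand-in for one block of a split: its length and the hosts holding a replica. */
    public static final class Block {
        final long length;
        final List<String> hosts;

        public Block(long length, List<String> hosts) {
            this.length = length;
            this.hosts = hosts;
        }
    }

    /**
     * Returns up to n hosts ordered by how many bytes of the split they hold locally.
     * Ties are broken by host name so the result is deterministic.
     */
    public static List<String> topHosts(List<Block> blocks, int n) {
        // Sum the bytes of every block that has a replica on each host.
        Map<String, Long> bytesPerHost = new HashMap<>();
        for (Block block : blocks) {
            for (String host : block.hosts) {
                bytesPerHost.merge(host, block.length, Long::sum);
            }
        }

        // Sort hosts by local bytes, largest first, then by name.
        List<String> hosts = new ArrayList<>(bytesPerHost.keySet());
        hosts.sort((x, y) -> {
            int byBytes = Long.compare(bytesPerHost.get(y), bytesPerHost.get(x));
            return byBytes != 0 ? byBytes : x.compareTo(y);
        });

        return new ArrayList<>(hosts.subList(0, Math.min(n, hosts.size())));
    }

    public static void main(String[] args) {
        // Made-up example: a split covering three blocks, each replicated on three hosts.
        List<Block> blocks = Arrays.asList(
            new Block(128, Arrays.asList("node1", "node2", "node3")),
            new Block(128, Arrays.asList("node2", "node3", "node4")),
            new Block(64, Arrays.asList("node3", "node5", "node6")));

        System.out.println(topHosts(blocks, 3)); // prints [node3, node2, node1]
    }
}
```

A custom InputSplit could report a list like this from getLocations(), which returns the host names as a String[]. The order carries no weight for the scheduler, as noted above; the ranking only decides which hosts make it into the list.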