Re: The location of the map execution
Mohit Anchlia 2012-03-04, 04:44
On Sat, Mar 3, 2012 at 7:41 PM, Joey Echeverria <[EMAIL PROTECTED]> wrote:
> Sorry, I meant have you set the mapred.jobtracker.taskScheduler
> property in your mapred-site.xml file. If not, you're using the
> standard, FIFO scheduler. The default scheduler doesn't do data-local
> scheduling, but the fair scheduler and capacity scheduler do. You want
> to set mapred.jobtracker.taskScheduler to either
> org.apache.hadoop.mapred.FairScheduler (for the fair scheduler) or
> org.apache.hadoop.mapred.CapacityTaskScheduler (for the capacity
> scheduler) and then restart the JobTracker. You can read about the two
> schedulers here:
I thought that, by default, tasks are scheduled on the nodes that hold the
relevant data blocks. I thought that was inherent. In the fair scheduler
link I don't see anything about data-local scheduling.
> On Sat, Mar 3, 2012 at 6:32 PM, Hassen Riahi <[EMAIL PROTECTED]> wrote:
> > The jobtracker is running on another machine (node C).
> > Hassen
> >> Which scheduler are you using?
> >> -Joey
> >> On Mar 3, 2012, at 18:52, Hassen Riahi <[EMAIL PROTECTED]> wrote:
> >>> Hi all,
> >>> We tried using MapReduce to run a simple map code that reads a txt
> >>> file stored in HDFS and then writes the output.
> >>> The input file is very small. It was not split, and it is stored
> >>> entirely and only on a single datanode (node A). This node is
> >>> also a tasktracker node.
> >>> We expected the map to execute on node A (since the input is stored
> >>> there), but the log files show that the map executed on another
> >>> tasktracker (node B) of the cluster.
> >>> Am I missing something?
> >>> Thanks for the help!
> >>> Hassen
> Joseph Echeverria
> Cloudera, Inc.
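
For reference, the scheduler setting described in Joey's message could look
roughly like this in mapred-site.xml. This is only a minimal sketch: the
property name and the two class names are taken from the message, while the
surrounding file layout is assumed and any other properties your cluster
already sets are omitted. As the message says, the JobTracker has to be
restarted afterwards.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Scheduler class loaded by the JobTracker. To use the capacity
       scheduler instead, set the value to
       org.apache.hadoop.mapred.CapacityTaskScheduler. -->
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.FairScheduler</value>
  </property>
</configuration>
```

If the property is absent from mapred-site.xml, the cluster falls back to the
default scheduler discussed in the thread.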