Re: Increase number of map slots
Hey Keith,

Sorry for the late response here (I had meant to reply but I believe I
got distracted and forgot all about it):

I agree with you on all counts. The config is indeed for service-level
slots. My reply was only meant to correct Kartheek's assumption.

Regarding documentation - I'd love to fix the docs myself, but I lack
the focused time at the moment to do so right away. I am willing to
review and commit a patch for you though, if you're willing to
contribute one! Please just let me know the JIRA number once you've
filed it and I will track it.

Note that if you use YARN to run the new MR2 code (the MR API is the
same; only the platform/submission-execution model has changed), the
concept of hard slots has gone away. At present the number of tasks a
node can run is determined by the job's per-task memory request
(mapreduce.{map/reduce}.memory.mb) measured against the total memory
a NodeManager offers for service. There is no longer a single hard
config that controls the maximum number of tasks that may run
simultaneously per node (a cap can still be approximated via some
NodeManager/scheduler memory resource config hacks, but that ends up
being brittle).
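
To make the memory arithmetic above concrete, here is a rough sketch
in Python. It is only an illustration, not how the scheduler is
actually implemented: the function name, its parameters and the
sample numbers are made up, and I am assuming the scheduler rounds
each request up to a multiple of its minimum allocation (the
yarn.scheduler.minimum-allocation-mb setting) and that the node's
offered memory is what yarn.nodemanager.resource.memory-mb specifies.

```python
import math


def max_concurrent_tasks(node_memory_mb, task_memory_mb, min_allocation_mb=1):
    """Estimate how many tasks of one size fit on a single node at once.

    Hypothetical helper for illustration only:
      node_memory_mb    -- memory the NodeManager offers
                           (assumed: yarn.nodemanager.resource.memory-mb)
      task_memory_mb    -- per-task request (mapreduce.{map/reduce}.memory.mb)
      min_allocation_mb -- assumed scheduler rounding granularity
    """
    if node_memory_mb < 0 or task_memory_mb <= 0 or min_allocation_mb <= 0:
        raise ValueError("memory sizes must be positive")
    # Assumption: each request is rounded up to a multiple of the minimum.
    granted_mb = math.ceil(task_memory_mb / min_allocation_mb) * min_allocation_mb
    # Whatever does not fit a whole container is simply left unused.
    return node_memory_mb // granted_mb


if __name__ == "__main__":
    # Sample values, not recommendations.
    print(max_concurrent_tasks(8192, 1024))        # plain division
    print(max_concurrent_tasks(8192, 1536, 1024))  # request rounded up to 2048
```

The point is that the per-node task count falls out of two memory
numbers rather than being set directly, so changing either the job's
request or the node's offered memory changes the effective "slots".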

CPU-specific requests are coming soon for YARN:

On Wed, Jun 6, 2012 at 10:38 PM, Keith Wiley <[EMAIL PROTECTED]> wrote:
> On Jun 6, 2012, at 03:42 , Harsh J wrote:
>>> I think mapred.tasktracker.map.tasks.maximum sets the number of map
>> tasks and not slots.
>> This is incorrect. The property does configure slots. Please also see
>> http://wiki.apache.org/hadoop/HowManyMapsAndReduces and
>> http://wiki.apache.org/hadoop/FAQ#I_see_a_maximum_of_2_maps.2BAC8-reduces_spawned_concurrently_on_each_TaskTracker.2C_how_do_I_increase_that.3F
>> for more.
> But Harsh, wouldn't you agree that the first reference you provided above is talking about the number of tasks spawned for a given job at job-runtime and not the number of slots hard-configured into the cluster at cluster-spinup time?
> Incidentally, the second reference above is partially broken.  It attempts to offer links to dig into further detail about mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum, but the links are broken.  For example, one of the two broken links is:
> http://hadoop.apache.org/common/docs/current/hadoop-default.html#mapred.tasktracker.map.tasks.maximum
> It's still broken even if you remove the anchor from the end of the URL, which is to say the hadoop-default.html webpage doesn't even exist.
> In fact, it is difficult to find any official documentation on those properties (Google searches for the terms do not provide links to any proper documentation within apache, but rather just lots of back and forth forum discussions about the properties).  One thing I did find was a claim that those properties are deprecated in 2.0.0:
> http://hadoop.apache.org/common/docs/current/hadoop-project-dist/hadoop-common/DeprecatedProperties.html
> That page indicates that they were replaced with equivalents in which the first component is now 'mapreduce', not 'mapred'.  Even with the new terms however, Google still doesn't link to any formal documentation describing those properties.  In fact, I have yet to find a webpage anywhere which officially states the purpose/effect of mapred(uce).tasktracker.map.tasks.maximum.
> That said, I agree that the consensus of discussion and description seems to imply that these properties have a cluster-level (not job-level) effect on the number of map/reduce slots on the cluster, not the number of tasks spawned for a given job.  Such a concept obviously convolutes the intuition that slots correspond to cores as I suggested in an earlier post and I apologize for that.
> ________________________________________________________________________________
> Keith Wiley     [EMAIL PROTECTED]     keithwiley.com    music.keithwiley.com
> "Yet mark his perfect self-contentment, and hence learn his lesson, that to be
> self-contented is to be vile and ignorant, and that to aspire is better than to

Harsh J