mappers-node relationship
jamal sasha 2013-01-25, 07:46
Hi. A very basic question: does the number of mappers depend on the number of nodes I have? Here is how I imagine map-reduce. In the word count example, say, I have a bunch of slave nodes, and the documents are distributed across them. Depending on how big the data is, it spreads across more or fewer slave nodes, and that is how the number of mappers is decided. I am sure this understanding is wrong, because in pseudo-distributed mode I can see multiple mappers. So the question is: how does a single-node machine run multiple mappers? Do they run in parallel or sequentially? Any resources for learning about this would be appreciated. Thanks
Re: mappers-node relationship
Mahesh Balija 2013-01-25, 09:18
Mappers and reducers run as task instances, in what are also called mapper/reducer slots. Each node can have multiple slots (that is, multiple mapper instances, each running in its own child JVM). The number of slots per node is configurable with properties such as mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum. Tasks in different slots run in parallel.
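To make the slot idea concrete, here is a rough sketch in Python (not Hadoop code). It assumes the usual file-based input behaviour, where there is roughly one map task per input split, so the number of mappers follows the data size and not the node count. The function names, the 64 MiB split size, and the 2 slots per node are illustrative assumptions, and the model ignores details such as empty files and split rounding.

```python
import math

MIB = 1024 * 1024


def estimate_map_tasks(file_sizes_bytes, split_size_bytes):
    """Rough count of map tasks: one per input split, each file split on its own.

    Simplified model (assumption): ignores empty files and Hadoop's split rounding.
    """
    return sum(math.ceil(size / split_size_bytes) for size in file_sizes_bytes)


def map_waves(num_map_tasks, num_nodes, map_slots_per_node):
    """How many rounds of parallel map tasks the cluster needs.

    map_slots_per_node stands in for mapred.tasktracker.map.tasks.maximum.
    """
    total_slots = num_nodes * map_slots_per_node
    return math.ceil(num_map_tasks / total_slots)


# One 1 GiB input file with an assumed 64 MiB split size.
tasks = estimate_map_tasks([1024 * MIB], 64 * MIB)

# Same job, different cluster sizes (2 map slots per node is an assumed setting).
single_node_waves = map_waves(tasks, num_nodes=1, map_slots_per_node=2)
four_node_waves = map_waves(tasks, num_nodes=4, map_slots_per_node=2)

print(tasks, single_node_waves, four_node_waves)
```

Under these assumptions the job has the same number of map tasks on one node or on four. Adding nodes (or slots) only changes how many of those tasks can run at the same time, which is why a pseudo-distributed setup still shows multiple mappers.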
Best, Mahesh Balija, CalsoftLabs.
On Fri, Jan 25, 2013 at 1:16 PM, jamal sasha <[EMAIL PROTECTED]> wrote:
> Hi.
> A very very lame question.
> Does numbers of mapper depends on the number of nodes I have?
> How I imagine map-reduce is this.
> For example in word count example
> I have bunch of slave nodes.
> The documents are distributed across these slave nodes.
> Now depending on how big the data is, it will spread across the slave
> nodes.. and that is how my number of mappers are decided.
> I am sure, this is wrong understanding. As in pseudo-distributed node, i
> can see multiple mappers.
> So question is.. how does a single node machine runs multiple mappers? is
> it run in parallel or sequentially??
> Any resources to learn these
> Thanks