A very very lame question.
Does the number of mappers depend on the number of nodes I have?
This is how I imagine MapReduce works.
Take the word count example:
I have a bunch of slave nodes.
The documents are distributed across these slave nodes.
Depending on how big the data is, it is spread across the slave
nodes, and that is how the number of mappers is decided.
I am sure this understanding is wrong, because on a pseudo-distributed
(single-node) setup I can see multiple mappers.
So the question is: how does a single-node machine run multiple mappers?
Do they run in parallel or sequentially?
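A rough sketch of how this is usually explained: the number of map tasks
follows the number of input splits (by default roughly one per HDFS block of
each input file), not the number of nodes. A single node runs as many of them
at once as it has free task slots or containers, and the rest wait their turn.
The Python below only illustrates that arithmetic. The function names, the
64 MB split size, and the two concurrent slots are assumptions made up for the
example; they are not Hadoop APIs or guaranteed defaults.

```python
import math

MB = 1024 * 1024


def estimate_map_tasks(file_sizes_bytes, split_size_bytes):
    """Estimate map tasks as one per input split (simplified).

    Each file is split on its own, so many small files mean many mappers.
    Real Hadoop applies extra rules (for example, non-splittable formats),
    which this sketch ignores.
    """
    return sum(
        math.ceil(size / split_size_bytes)
        for size in file_sizes_bytes
        if size > 0
    )


def estimate_waves(num_map_tasks, concurrent_slots):
    """Number of rounds needed when only `concurrent_slots` tasks run at once."""
    return math.ceil(num_map_tasks / concurrent_slots)


# Hypothetical input: three files on a single-node setup.
files = [200 * MB, 50 * MB, 10 * MB]
tasks = estimate_map_tasks(files, split_size_bytes=64 * MB)
waves = estimate_waves(tasks, concurrent_slots=2)
print(tasks, waves)
```

Under these assumptions the same input yields the same number of mappers
whether there is one node or fifty; adding nodes only changes how many of
those mappers can run at the same time.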
Are there any resources for learning about this?