Is there any page or document that describes the methods Hive uses to
arrive at the optimum number of map tasks and the optimum number of reduce
tasks?
I'm running a 3-node Amazon EMR cluster, and Hive has determined that 34 map
& 2 reduce tasks are optimum. Of the 34 map tasks, only 4 are actively
running at any given instant. Is there an explanation for why it is exactly
this number?