-Multi-level aggregation with combining the result of maps per node/rack
Tsuyoshi OZAWA 2012-07-31, 01:11
We consider the shuffle cost is a main concern in MapReduce,
in particular, aggregation processing.
The shuffle costs is also expensive in Hadoop in spite of the
existence of combiner, because the scope of combining is limited
within only one MapTask.
To solve this problem, I've implemented the prototype that
combines the result of multiple maps per node.
This is the first step to make hadoop faster with multi-level
aggregation technique like Google Dremel.
I took a benchmark with the prototype.
We used WordCount program with in-mapper combining optimization
as the benchmark. The benchmark is taken under 40 nodes .
The input data set is 300GB, 500GB, 1TB, and 2TB texts which is generated
by default RandomTextWriter. Reducer is configured
as 1 on the assumption that some workload forces 1 reducer
like Google Dremel. The result is as follows:
| 300GB | 500GB | 1TB | 2TB |
Normal (sec) | 4004 | 5551 | 12177 | 27608 |
Combining per node (sec) | 3678 | 3844 | 7440 | 15591 |
Note that a MapTask runs combiner per node every 3 minutes in
the current prototype, so the aggregation rate is very limited.
"Normal" is the result of current hadoop, and "Combining per node"
is the result with my optimization. Regardless of the 3-minutes
restriction, the prototype is 1.7 times faster than normal hadoop
in 2TB case. Another benchmark also shows that the shuffle costs
is cut down by 50%.
I want to know from you guys, do you think is it a useful feature?
If yes, I will work for contributing it.
It is also welcome to tell me the benchmark that you want me to do
with my prototype.
 The idea is also described in Hadoop wiki:
 Dremel paper is available at:
 The specification of each nodes is as follows:
CPU Core(TM)2 Duo CPU E7400 2.80GHz x 2
Memory 8 GB
Network 1 GbE
Bikas Saha 2012-07-31, 17:32
Tsuyoshi OZAWA 2012-08-01, 07:51
Robert Evans 2012-07-31, 13:46
Tsuyoshi OZAWA 2012-08-01, 06:48