Obviously the algorithm matters, but here are some very old numbers (things today are much better), but you do see the 'linear' scaling with both nodes and datasets:
100TB Sort - 97 mins
1000 TB Sort - 975 mins
On Jan 17, 2013, at 7:09 PM, Thiago Vieira wrote:
> Is common to see this sentence: "Hadoop Scales Linearly". But, is there any performance evaluation to confirm this?
> In my evaluations, Hadoop processing capacity scales linearly, but not proportional to number of nodes, the processing capacity achieved with 20 nodes is not the double of the processing capacity achieved with 10 nodes. Is there any evaluation about this?
> Thank you!
> Thiago Vieira
Arun C. Murthy