HDFS, mail # user - Re: Hadoop Scalability


Re: Hadoop Scalability
Arun C Murthy 2013-01-18, 22:53
Obviously the algorithm matters. That said, here are some very old numbers (things today are much better) in which you can see the 'linear' scaling with both nodes and dataset size:

http://developer.yahoo.com/blogs/hadoop/posts/2009/05/hadoop_sorts_a_petabyte_in_162/
100 TB Sort - 97 mins
1000 TB Sort - 975 mins
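
As a rough check on the two figures above, the arithmetic can be sketched as below. It uses only the sizes and times quoted in this message. The variable names are illustrative, and the comparison ignores cluster size, which the message does not state for either run.

```python
# Sort size in TB -> elapsed minutes, as quoted in the message above.
runs = {100: 97, 1000: 975}

# How much more data the larger run sorted, and how much longer it took.
data_ratio = 1000 / 100
time_ratio = runs[1000] / runs[100]

# Cost per terabyte for each run; near-equal values suggest linear scaling.
minutes_per_tb_small = runs[100] / 100
minutes_per_tb_large = runs[1000] / 1000

print(f"data grew {data_ratio:.1f}x, time grew {time_ratio:.2f}x")
print(f"minutes per TB: {minutes_per_tb_small:.3f} vs {minutes_per_tb_large:.3f}")
```

If the time ratio stays close to the data ratio, the cost per terabyte is roughly constant, which is what 'linear' means here.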

Arun

On Jan 17, 2013, at 7:09 PM, Thiago Vieira wrote:

> Hello!
>
> It is common to see the statement "Hadoop scales linearly". But is there any performance evaluation that confirms this?
>
> In my evaluations, Hadoop's processing capacity scales linearly, but not in proportion to the number of nodes: the processing capacity achieved with 20 nodes is not double the capacity achieved with 10 nodes. Is there any evaluation of this?
>
> Thank you!
>
> --
> Thiago Vieira
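
One possible reading of "linear, but not proportional" in the question above is a straight line that does not pass through the origin. The sketch below illustrates that reading only. The slope and intercept are made-up values, not measurements from any cluster, and the function name is hypothetical.

```python
def capacity(nodes, slope, intercept):
    """Capacity under an assumed straight-line model: slope * nodes + intercept."""
    return slope * nodes + intercept

# Hypothetical parameters, chosen only to illustrate the shape of the curve.
SLOPE = 8.0       # capacity added per extra node (arbitrary units)
INTERCEPT = 40.0  # fixed component that does not grow with node count

c10 = capacity(10, SLOPE, INTERCEPT)
c20 = capacity(20, SLOPE, INTERCEPT)
c30 = capacity(30, SLOPE, INTERCEPT)

# Equal steps in node count add equal capacity (linear growth)...
step_10_to_20 = c20 - c10
step_20_to_30 = c30 - c20

# ...yet doubling the nodes does not double the capacity (not proportional).
doubling_ratio = c20 / c10

print(f"steps: {step_10_to_20}, {step_20_to_30}; 20 vs 10 nodes: {doubling_ratio:.2f}x")
```

Under this assumption, each extra node adds the same capacity, but the 20-node figure is less than twice the 10-node figure whenever the intercept is positive.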

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/