MapReduce, mail # user - How does performance scale with the size of the data?


How does performance scale with the size of the data?
Steve Lewis 2010-07-01, 05:15
Assume we have a medium-sized cluster, say 20 nodes, that is dedicated to a
single job and cannot change in size.
Assume we are sorting a large data set. As we increase the size of the data
sorted, say from 100 GB to 1,000 GB to 10,000 GB, does the
time for the sort scale as N or as N log N? I have heard both answers, with
N log N coming largely from folks less familiar with Hadoop and
N from others with more experience, but I am skeptical. Has anyone run
tests, and can they contribute real data?
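
To make the question concrete, here is a rough sketch of what each model would
predict for the three sizes above, relative to the 100 GB run. It is only
arithmetic, not a measurement. The 100-byte record size and the names
RECORD_BYTES, relative_time and scaling_table are assumptions made up for
illustration, and constant factors, disk spills and network effects are ignored.

```python
import math

RECORD_BYTES = 100   # assumption: fixed-size 100-byte records
GB = 10**9           # decimal gigabytes


def relative_time(size_gb, base_gb, model):
    """Predicted run time for size_gb, as a multiple of the base_gb run."""
    n = size_gb * GB / RECORD_BYTES    # records in the larger data set
    n0 = base_gb * GB / RECORD_BYTES   # records in the baseline data set
    if model == "linear":
        return n / n0
    if model == "nlogn":
        return (n * math.log(n)) / (n0 * math.log(n0))
    raise ValueError(f"unknown model: {model}")


def scaling_table(sizes_gb=(100, 1000, 10000)):
    """Return (size_gb, linear_factor, nlogn_factor) for each size."""
    base = sizes_gb[0]
    return [
        (s, relative_time(s, base, "linear"), relative_time(s, base, "nlogn"))
        for s in sizes_gb
    ]


if __name__ == "__main__":
    for size, lin, nlogn in scaling_table():
        print(f"{size:>6} GB   N: x{lin:6.1f}   N log N: x{nlogn:6.1f}")
```

Under these assumptions the N log N model predicts about 11.1x and 122x the
baseline time where the N model predicts 10x and 100x, so the two differ by
only about 22% across the whole range. Measurements would need to be fairly
precise to tell them apart.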

--
Steven M. Lewis PhD
Institute for Systems Biology
Seattle WA