Re: How to estimate hadoop?
Jie Li 2012-02-25, 15:44
Hi Jinyan,
I'd like to introduce you to our system, Starfish, which can be used to analyze and estimate Hadoop performance and memory usage. With Starfish, you can analyze the performance of your Hadoop job at a fine-grained level, e.g. the time spent in map processing, spilling, merging, shuffling, sorting, and reduce processing, so you can see which phase is the performance bottleneck.

You can also ask "what-if" questions, e.g. "What if I double io.sort.mb?", and Starfish will predict the new behaviour of the job, so you can better understand how these parameters work and estimate the running time of the new job. In addition, you can simply let Starfish find the optimal configuration for you to achieve the best performance.

You are welcome to join our Google Group to discuss Starfish, and any feedback will be appreciated. If you run into any problems, please don't hesitate to let us know. The Group address is http://groups.google.com/group/hadoop-starfish.

Thanks,
Jie

------------------------
Starfish Group, Duke University
Starfish Homepage: www.cs.duke.edu/starfish/
Starfish Google Group: http://groups.google.com/group/hadoop-starfish

On Wed, Feb 15, 2012 at 11:26 PM, Srinivas Surasani <[EMAIL PROTECTED]> wrote:

> Hey,
>
> It completely depends on your data sizes and processing. You can have
> anything from a one-node cluster to thousands of nodes (or many more),
> depending on your needs. The following link may help you.
>
> http://wiki.apache.org/hadoop/HardwareBenchmarks
>
> Regards,
>
>
> On Wed, Feb 15, 2012 at 10:17 PM, Jinyan Xu <[EMAIL PROTECTED]> wrote:
> > Hi all,
> >
> > I want to use the Hadoop system, but I need overall system info about
> > Hadoop, for example system performance, memory used, CPU utilization
> > and so on. Does anyone have a system estimate for Hadoop? Which tool
> > can do this?
> >
> > yours
> > rock
> >
>
> --
> -- Srinivas
> [EMAIL PROTECTED]
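
To make the "What if I double io.sort.mb?" question from the reply above a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It is not Starfish's model and does not use any Starfish API. The function name `estimate_spills` and the 400 MB map output are made up for illustration. The defaults (100 MB for io.sort.mb, 0.80 for io.sort.spill.percent) are the stock Hadoop 1.x values, and the formula assumes the whole sort buffer holds record data, ignoring the metadata share (io.sort.record.percent), combiners, and compression.

```python
import math


def estimate_spills(map_output_mb, io_sort_mb=100, spill_percent=0.80):
    """Toy estimate of how many spill files one map task writes.

    Assumption: a spill starts each time the sort buffer reaches
    io_sort_mb * spill_percent, and the whole buffer holds record data.
    Real Hadoop also reserves part of the buffer for record metadata,
    so actual spill counts can be higher.
    """
    usable_mb = io_sort_mb * spill_percent
    return max(1, math.ceil(map_output_mb / usable_mb))


# Hypothetical map task producing 400 MB of intermediate output.
baseline = estimate_spills(400, io_sort_mb=100)
doubled = estimate_spills(400, io_sort_mb=200)
print(baseline, doubled)
```

Fewer spills generally mean less merge work at the end of the map task, which is the kind of effect a what-if prediction tries to quantify. The real outcome depends on the job and the cluster, so treat this only as intuition for why the parameter matters.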