Re: Hadoop 1.0.4 Performance Problem
Todd Lipcon 2012-12-21, 01:06
Hi Jon,
FYI, this issue in the fair scheduler was fixed by
https://issues.apache.org/jira/browse/MAPREDUCE-2905 for 1.1.0, though it is
present again in MR2: https://issues.apache.org/jira/browse/MAPREDUCE-3268

-Todd

On Wed, Nov 28, 2012 at 2:32 PM, Jon Allen <[EMAIL PROTECTED]> wrote:
> Jie,
>
> Simple answer - I got lucky (though obviously there are things you need to
> have in place to allow you to be lucky).
>
> Before running the upgrade I ran a set of tests to baseline the cluster
> performance, e.g. terasort, gridmix and some operational jobs. Terasort by
> itself isn't very realistic as a cluster test but it's nice and simple to
> run and is good for regression testing things after a change.
>
> After the upgrade the intention was to run the same tests and show that the
> performance hadn't degraded (improved would have been nice but not worse was
> the minimum). When we ran the terasort we found that performance was about
> 50% worse - execution time had gone from 40 minutes to 60 minutes. As I've
> said, terasort doesn't provide a realistic view of operational performance
> but this showed that something major had changed and we needed to understand
> it before going further. So how to go about diagnosing this ...
>
> First rule - understand what you're trying to achieve. It's very easy to
> say performance isn't good enough but performance can always be better so
> you need to know what's realistic and at what point you're going to stop
> tuning things. I had a previous baseline that I was trying to match so I
> knew what I was trying to achieve.
>
> Next thing to do is profile your job and identify where the problem is. We
> had the full job history from the before and after jobs and comparing these
> we saw that map performance was fairly consistent as were the reduce sort
> and reduce phases. The problem was with the shuffle, which had gone from 20
> minutes pre-upgrade to 40 minutes afterwards.
> The important thing here is to make sure you've got as much information as
> possible. If we'd just kept the overall job time then there would have been
> a lot more areas to look at but knowing the problem was with shuffle allowed
> me to focus effort in this area.
>
> So what had changed in the shuffle that may have slowed things down? The
> first thing we thought of was that we'd moved from a tarball deployment to
> using the RPM, so what effect might this have had on things? Our operational
> configuration compresses the map output and in the past we've had problems
> with Java compression libraries being used rather than native ones and this
> has affected performance. We knew the RPM deployment had moved the native
> library so spent some time confirming to ourselves that these were being
> used correctly (but this turned out to not be the problem). We then spent
> time doing some process and server profiling - using dstat to look at the
> server bottlenecks and jstack/jmap to check what the task tracker and reduce
> processes were doing. Although not directly relevant to this particular
> problem, doing this was useful just to get my head around what Hadoop is
> doing at various points of the process.
>
> The next bit was one place where I got lucky - I happened to be logged onto
> one of the worker nodes when a test job was running and I noticed that there
> weren't any reduce tasks running on the server. This was odd as we'd
> submitted more reducers than we have servers so I'd expected at least one
> task to be running on each server. Checking the job tracker log file it
> turned out that since the upgrade the job tracker had been submitting reduce
> tasks to only 10% of the available nodes. A different 10% each time the job
> was run so clearly the individual task trackers were working OK but there
> was something odd going on with the task allocation.
> Checking the job tracker log file showed that before the upgrade tasks had
> been fairly evenly distributed so something had changed. After that it was a
> case of digging

Todd Lipcon
Software Engineer, Cloudera
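
The check described in the quoted message (noticing that reduce tasks were landing on only about 10% of the nodes) could be turned into a quick script along these lines. This is only a sketch: it assumes you have already pulled one node name per assigned reduce task out of the job tracker log by whatever means suits your log format, and the function name, node names and numbers below are all hypothetical.

```python
from collections import Counter


def reduce_task_spread(assignments, all_nodes):
    """Summarise how reduce tasks were spread across worker nodes.

    assignments: iterable of node names, one entry per assigned reduce task
                 (assumed to be extracted from the job tracker log beforehand)
    all_nodes:   iterable of every worker node expected to take tasks
    """
    nodes = set(all_nodes)
    # Count tasks per node, ignoring names that are not known workers.
    per_node = Counter(a for a in assignments if a in nodes)
    used = len(per_node)
    return {
        "nodes_total": len(nodes),
        "nodes_used": used,
        "fraction_used": used / len(nodes) if nodes else 0.0,
        "max_tasks_on_one_node": max(per_node.values(), default=0),
        "idle_nodes": sorted(nodes - set(per_node)),
    }


# Hypothetical data: 20 workers, 40 reducers, all assigned to just 2 workers.
nodes = [f"worker{i:02d}" for i in range(20)]
assignments = ["worker03"] * 20 + ["worker11"] * 20
summary = reduce_task_spread(assignments, nodes)
print(summary["nodes_used"], "of", summary["nodes_total"], "nodes ran reducers")
```

Running the same summary over the pre-upgrade and post-upgrade logs would make a change in distribution visible without having to be logged onto the right node at the right moment.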