Performance test practices for Hadoop jobs - capturing metrics
Bejoy Ks 2011-11-15, 07:31
I'm currently working out a performance test plan
for a series of Hadoop jobs. My application consists of MapReduce,
Hive, and Flume jobs chained one after another, and I need to do some
rigorous performance testing to ensure it would never break under
most circumstances. I'm planning to test each of the components
individually, along with end-to-end tests and some overlapping tests
across the components. The tests would be:
· Regression test
· Stress/Load test
· Simultaneous run tests
For all these tests I'm planning to capture metrics from two sources:
· MapReduce job metrics from the JobTracker web UI
· I/O, memory, and CPU usage metrics from Ganglia
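For the Ganglia side, one way to script the capture instead of reading the web UI is to poll the gmond daemon, which serves its current metrics as an XML dump over TCP (port 8649 by default). A minimal sketch in Python; the sample XML below is a hand-written, simplified version of the layout gmond emits (HOST elements containing METRIC elements with NAME/VAL attributes), so treat the exact attributes as assumptions to verify against your cluster:

```python
import socket
import xml.etree.ElementTree as ET

def parse_gmond_xml(xml_text):
    """Extract {host: {metric_name: value}} from a gmond XML dump."""
    metrics = {}
    root = ET.fromstring(xml_text)
    for host in root.iter("HOST"):
        host_metrics = {}
        for m in host.iter("METRIC"):
            host_metrics[m.get("NAME")] = m.get("VAL")
        metrics[host.get("NAME")] = host_metrics
    return metrics

def fetch_gmond_xml(host="localhost", port=8649):
    """Read the XML dump gmond writes on connect; call this once per test run."""
    chunks = []
    with socket.create_connection((host, port)) as s:
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode()

# Hand-written sample in the same general shape as a gmond dump:
sample = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
  <CLUSTER NAME="hadoop" LOCALTIME="1321342800">
    <HOST NAME="node1" IP="10.0.0.1">
      <METRIC NAME="cpu_idle" VAL="82.4" TYPE="float" UNITS="%"/>
      <METRIC NAME="mem_free" VAL="1048576" TYPE="float" UNITS="KB"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""

print(parse_gmond_xml(sample)["node1"]["cpu_idle"])  # -> 82.4
```

Polling this at a fixed interval during each run gives a time series of CPU/memory/I/O figures that can be joined with the per-job numbers later.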
The tests are triggered using simple shell scripts, which is absolutely fine,
and capturing metrics from the JobTracker and Ganglia for individual jobs is
also fine. The challenge comes when capturing metrics for the
regression test. There we'd be running a particular job (say a
Hive job) continuously for 24 hours, looped in a shell script; with my
test data set of a few gigs, that works out to nearly 130
runs. Capturing the metrics manually for all 130 runs doesn't look
like a great solution. I have a few queries around this:
· Is there any automated tool that would help us capture these metrics?
· Is there any best practice to be followed for performance testing?
· Does anyone have a metrics sheet defining which details should be
captured during performance tests?
It'd be great if you could share your experiences with performance
testing and the practices you follow for your Hadoop projects, along
with any dos and don'ts.
Awaiting all your valuable responses.
Thanks a lot