We are seeing our map-reduce jobs crash once in a while, and we have to
go through the logs on all the nodes to figure out what went wrong.
Sometimes the cause is low resources, and sometimes it is a programming
error that is triggered by specific inputs. The same is true for some of
our Hive queries.
Are there any tools (free or paid) that would help us debug this quickly?
I am planning to write a debugging tool for sifting through the
distributed Hadoop logs, but wanted to check whether any useful tools
for this already exist.
Gopi | www.wignite.com