Harsh: thanks for the quick response.
We often see an error response such as "Failed(Query returned non-zero
code: 2, cause: FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.MapRedTask)" and then have to go through all
the logs to figure out what happened. I use the JobTracker UI to get to the
error logs for the failed tasks.
I was thinking that a log parsing tool with a good UI, one that goes through
the distributed logs, helps you find errors, and gives stats on similar errors
in previous runs, would be useful. HADOOP-9861 might help in getting good
info, but it still might not make quick debugging very easy.
Has anybody faced similar issues as part of their development? Are there
any better ways to pinpoint the cause of an error?
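
As a rough illustration of the log-sifting idea above, here is a minimal
Python sketch. It assumes the task logs have already been collected as plain
text lines and that failures show up as fully qualified Java exception names.
The function names (error_signature, summarize_errors) and the sample log
lines are hypothetical and are not part of Hadoop or Hive.

```python
import re
from collections import Counter

# Assumption: errors appear as fully qualified Java class names ending in
# "Exception" or "Error", optionally followed by ": message".
EXCEPTION_RE = re.compile(
    r"\b((?:[a-zA-Z_]\w*\.)+[A-Z]\w*(?:Exception|Error))\b(?::\s*(.*))?"
)


def error_signature(line):
    """Return a normalized error signature for a log line, or None."""
    match = EXCEPTION_RE.search(line)
    if match is None:
        return None
    name = match.group(1)
    message = match.group(2) or ""
    # Replace runs of digits so messages that differ only in ids, sizes,
    # or attempt numbers fall into the same group.
    message = re.sub(r"\d+", "N", message).strip()
    return f"{name}: {message}" if message else name


def summarize_errors(lines):
    """Count how often each error signature occurs in the given lines."""
    counts = Counter()
    for line in lines:
        signature = error_signature(line)
        if signature is not None:
            counts[signature] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines, only to show the grouping.
    sample = [
        "2013-08-27 10:00:01,123 ERROR org.apache.hadoop.mapred.Child: "
        "java.lang.OutOfMemoryError: Java heap space",
        "java.io.IOException: Task attempt_201308271000_0001_m_000003_0 failed",
        "java.io.IOException: Task attempt_201308271000_0002_m_000007_1 failed",
        "INFO org.apache.hadoop.mapred.JobClient: map 100% reduce 0%",
    ]
    for signature, count in summarize_errors(sample).most_common():
        print(count, signature)
```

Keeping such counts per job run would give the "similar errors in previous
runs" statistics; collecting the logs from all the nodes is left out here.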
Gopi | www.wignite.com
On Tue, Aug 27, 2013 at 10:42 AM, Harsh J <[EMAIL PROTECTED]> wrote:
> We set a part of the failure reason as the diagnostic message for a
> failed task that a JobClient API retrieves/can retrieve:
> Often this is
> 'useless' given the stack trace's top part isn't always carrying the
> most relevant information, so perhaps HADOOP-9861 may help here once
> it is checked in.
> On Tue, Aug 27, 2013 at 10:34 AM, Gopi Krishna M <[EMAIL PROTECTED]> wrote:
> > Hi
> > We are seeing our map-reduce jobs crashing once in a while and have to go
> > through the logs on all the nodes to figure out what went wrong.
> > it is low resources and sometimes it is a programming error which is
> > triggered on specific inputs.. Same is true for some of our hive
> > Are there any tools (free/paid) which help us to do this debugging
> > I am planning to write a debugging tool for sifting through the
> > logs of hadoop but wanted to check if there are already any useful tools
> > this.
> > Thx
> > Gopi | www.wignite.com
> Harsh J