Re: Hive / Hadoop Log Retrieval Problem
I was trying to build a patch for this to submit against
https://issues.apache.org/jira/browse/HIVE-1579, and it seems that the
default Hadoop version Hive builds against is 0.20.1. Its
TaskLogServlet requires a different set of parameters from a more
recent version such as 0.20.205. Are there any plans to update the
default Hadoop build?
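
As a rough sketch of the parameter difference (not the actual JobDebugger code), the URL could be built with the ID parameter name chosen by the caller. The class TaskLogUrlSketch, the IdParam enum, the build method and the placeholder host are all hypothetical names for illustration. The only facts taken from this thread are the URL shape in the stack trace and that 0.20.205.0 asks for attemptid; which Hadoop releases accept taskid is not established here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class TaskLogUrlSketch {

    /** Query parameter the target TaskLogServlet expects (hypothetical enum). */
    public enum IdParam {
        TASKID("taskid"),       // what Hive currently sends, per the stack trace
        ATTEMPTID("attemptid"); // what Hadoop 0.20.205.0 asks for in its HTTP 400 message

        final String queryName;

        IdParam(String queryName) {
            this.queryName = queryName;
        }
    }

    /**
     * Builds a task log URL of the shape seen in the error message:
     * base + "/tasklog?" + idParam + "=" + attemptId + "&start=" + start
     */
    public static String build(String trackerHttpBase, IdParam idParam,
                               String attemptId, long start) {
        try {
            return trackerHttpBase + "/tasklog?" + idParam.queryName + "="
                    + URLEncoder.encode(attemptId, "UTF-8")
                    + "&start=" + start;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String base = "http://tasktracker.example:50060"; // placeholder host
        String attempt = "attempt_201202291327_1399_r_000002_2";
        // Same attempt and offset, differing only in the parameter name.
        System.out.println(build(base, IdParam.TASKID, attempt, -8193));
        System.out.println(build(base, IdParam.ATTEMPTID, attempt, -8193));
    }
}
```

How the caller decides between the two names (a build-time shim, a version check, or retrying on HTTP 400) is left open in this sketch.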

Cheers,

Phil.

On 6 March 2012 14:43, Philip Tromans <[EMAIL PROTECTED]> wrote:
> Hi,
>
> It appears that no recent version of Hadoop supports the taskid
> parameter; the TaskLogServlet code appears to have changed in 2010:
>
> http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/src/java/org/apache/hadoop/mapred/TaskLogServlet.java?r1=918036&r2=918037&diff_format=f#l209
>
> If someone could confirm that I'm working on the right lines then I'll
> open up a JIRA (or add to HIVE-1579) and submit a patch.
>
> Cheers,
>
> Phil.
>
> On 6 March 2012 14:13, Philip Tromans <[EMAIL PROTECTED]> wrote:
>> Hi all,
>>
>> I'm running into a problem - I'm using Hive trunk (pretty recent, but
>> I see the bug's in trunk at time of writing as well), with Hadoop
>> 0.20.205.0. I have a job which fails (for a reason which is entirely
>> my own fault), and when it does fail Hive dies with the following
>> exception:
>>
>> Ended Job = job_201202291327_1399 with errors
>> Error during job, obtaining debugging information...
>> Examining task ID: task_201202291327_1399_m_000003 (and more) from job job_201202291327_1399
>> Examining task ID: task_201202291327_1399_r_000002 (and more) from job job_201202291327_1399
>> Exception in thread "Thread-342" java.lang.RuntimeException: Error while reading from task log url
>>       at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
>>       at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:211)
>>       at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:81)
>>       at java.lang.Thread.run(Thread.java:662)
>> Caused by: java.io.IOException: Server returned HTTP response code: 400 for URL: http://...:50060/tasklog?taskid=attempt_201202291327_1399_r_000002_2&start=-8193
>>       at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
>>       at java.net.URL.openStream(URL.java:1010)
>>       at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
>>       ... 3 more
>>
>> When I point a web browser at the given URL, I get the following message:
>>
>> HTTP ERROR 400
>>
>> Problem accessing /tasklog. Reason:
>>
>>   Argument attemptid is required
>>
>> If I change taskid to attemptid, everything works perfectly. The code
>> which generates this URL appears to be in
>> org.apache.hadoop.hive.ql.exec.JobDebugger.java. I presume that this
>> code is correct for a given version of Hadoop. Which version is
>> currently in use in the Jenkins/Hudson build environment? I'd be happy
>> to change it and submit a patch to JIRA, but I guess that would
>> probably break the Hadoop versions that expect taskid, so perhaps
>> some version-dependent handling might be needed.
>>
>> This is the issue that HIVE-1579 is referring to.
>>
>> Cheers,
>>
>> Phil.