Hive >> mail # dev >> Hive / Hadoop Log Retrieval Problem


Re: Hive / Hadoop Log Retrieval Problem
I was just trying to build a patch for this, to submit against
https://issues.apache.org/jira/browse/HIVE-1579, and it seems that the
default Hadoop build is 0.20.1, whose TaskLogServlet requires a
different set of parameters from more recent versions such as
0.20.205. Are there any plans to update the default Hadoop build?

Cheers,

Phil.

On 6 March 2012 14:43, Philip Tromans <[EMAIL PROTECTED]> wrote:
> Hi,
>
> It appears that no recent version of hadoop supports this - the hadoop
> code appears to have changed in 2010:
>
> http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/src/java/org/apache/hadoop/mapred/TaskLogServlet.java?r1=918036&r2=918037&diff_format=f#l209
>
> If someone could confirm that I'm working on the right lines then I'll
> open up a JIRA (or add to HIVE-1579) and submit a patch.
>
> Cheers,
>
> Phil.
>
> On 6 March 2012 14:13, Philip Tromans <[EMAIL PROTECTED]> wrote:
>> Hi all,
>>
>> I'm running into a problem - I'm using Hive trunk (pretty recent, but
>> I see the bug's in trunk at time of writing as well), with Hadoop
>> 0.20.205.0. I have a job which fails (for a reason which is entirely
>> my own fault), and when it does fail Hive dies with the following
>> exception:
>>
>> Ended Job = job_201202291327_1399 with errors
>> Error during job, obtaining debugging information...
>> Examining task ID: task_201202291327_1399_m_000003 (and more) from job
>> job_201202291327_1399
>> Examining task ID: task_201202291327_1399_r_000002 (and more) from job
>> job_201202291327_1399
>> Exception in thread "Thread-342" java.lang.RuntimeException: Error
>> while reading from task log url
>>       at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
>>       at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:211)
>>       at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:81)
>>       at java.lang.Thread.run(Thread.java:662)
>> Caused by: java.io.IOException: Server returned HTTP response code:
>> 400 for URL: http://...:50060/tasklog?taskid=attempt_201202291327_1399_r_000002_2&start=-8193
>>       at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
>>       at java.net.URL.openStream(URL.java:1010)
>>       at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
>>       ... 3 more
>>
>> When I point a web browser at the given URL, I get the following message:
>>
>> HTTP ERROR 400
>>
>> Problem accessing /tasklog. Reason:
>>
>>   Argument attemptid is required
>>
>> If I change taskid to attemptid, everything works perfectly. The code
>> which generates this URL appears to be in
>> org.apache.hadoop.hive.ql.exec.JobDebugger.java. I presume that this
>> code is correct for a given version of Hadoop. Which version is
>> currently in use in the Jenkins/Hudson build environment? I'd be happy
>> to change it and submit a patch to JIRA, but I guess that'd probably
>> break the other version of Hadoop, so perhaps some more profound
>> versioning type thing might be needed.
>>
>> This is the issue that HIVE-1579 is referring to.
>>
>> Cheers,
>>
>> Phil.
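
As a rough illustration of the "versioning type thing" discussed above, here is a minimal sketch of building the task log URL with a selectable id parameter name. It is not the code in JobDebugger and not a proposed patch: the class TaskLogUrlSketch, the IdParam enum, the method buildTaskLogUrl, and the host tracker.example are all hypothetical names chosen for the example. Which Hadoop release expects which parameter is left to the caller, since the thread only establishes that 0.20.205.0 rejects taskid and accepts attemptid.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical sketch: not Hive's JobDebugger, just the idea of making
// the tasklog query parameter name selectable per Hadoop version.
public class TaskLogUrlSketch {

    // The two parameter names mentioned in the thread.
    public enum IdParam {
        TASKID("taskid"),
        ATTEMPTID("attemptid");

        private final String queryName;

        IdParam(String queryName) {
            this.queryName = queryName;
        }

        public String queryName() {
            return queryName;
        }
    }

    // Builds <tracker>/tasklog?<idParam>=<attemptId>&start=<start>.
    // trackerHttpAddress is assumed to look like "http://host:50060".
    public static String buildTaskLogUrl(String trackerHttpAddress,
                                         String attemptId,
                                         IdParam idParam,
                                         long start) {
        String encodedId;
        try {
            encodedId = URLEncoder.encode(attemptId, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
        return trackerHttpAddress + "/tasklog?"
                + idParam.queryName() + "=" + encodedId
                + "&start=" + start;
    }

    public static void main(String[] args) {
        String tracker = "http://tracker.example:50060"; // hypothetical host
        String attempt = "attempt_201202291327_1399_r_000002_2";

        // Form that failed with HTTP 400 against Hadoop 0.20.205.0.
        System.out.println(buildTaskLogUrl(tracker, attempt, IdParam.TASKID, -8193));
        // Form that worked when tried in a browser.
        System.out.println(buildTaskLogUrl(tracker, attempt, IdParam.ATTEMPTID, -8193));
    }
}
```

A real fix would still need some way to decide which parameter the running Hadoop version expects, for example via Hive's version-specific shim layer; that detection is deliberately left out here.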