-Re: Job jar not removed from staging directory on job failure/how to share a job jar using distributed cache
Harsh J 2012-10-06, 16:11
Yes, this is an unfortunate edge case. It is, however, fixed in the
trunk/2.x client rewrite and is now tracked as a test by
On Fri, Oct 5, 2012 at 10:28 PM, Bertrand Dechoux <[EMAIL PROTECTED]> wrote:
> I am launching my job from the command line, and I observed that when the
> provided input path does not match any files, the jar in the staging
> directory is not removed.
> It is removed on job termination (success or failure), but here the job isn't
> even really started, so it may be an edge case.
> Has anyone seen the same behaviour? (I am using 1.0.3)
> Here is an extract of the stack trace with hadoop related classes.
>> org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path
>> does not exist: [removed]
>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:838)
>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:791)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:396)
>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:465)
>> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:494)
> My second question is somewhat related, because one of its consequences
> would nullify the impact of the above 'bug'.
> Is it possible to set the main job jar directly to a jar already inside HDFS?
> From what I know, the configuration points to a local jar archive, which is
> uploaded to the staging directory on each submission.
> The same question was asked in the JIRA but without a clear resolution.
> My question might be related to
> which is resolved for the next version. But it seems to cover only uberjars,
> and I am using a standard jar.
> If it works with an HDFS location, what are the details? Won't it be cleaned
> up during job termination? Why not? Will it also be set up within the
> distributed cache?
> PS: I know there are other solutions to my problem. I will look at Oozie.
> And in the worst case, I can create a FileSystem instance myself to check
> whether the job should really be launched or not. Both could work, but both
> seem overkill in my context.
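
The workaround Bertrand describes at the end — checking the input path with a FileSystem instance before submitting — could be sketched roughly as follows. This is a minimal sketch against the Hadoop 1.x API (the version in the thread), not code from the thread itself; the class name and the idea of exiting before `Job.submit()` are illustrative assumptions:

```java
// Sketch: verify the input path exists before submitting the job, so the
// client never reaches the failing submit and never uploads the job jar
// to the staging directory. Assumes Hadoop 1.x-era APIs.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GuardedSubmit {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path input = new Path(args[0]);

        // Resolve the FileSystem for the path (HDFS, local, etc.)
        // and bail out early if nothing matches it.
        FileSystem fs = input.getFileSystem(conf);
        if (!fs.exists(input)) {
            System.err.println("Input path does not exist: " + input
                    + " -- not submitting the job.");
            System.exit(1);
        }

        // ... otherwise build the Job as usual and call
        // job.waitForCompletion(true) here ...
    }
}
```

This avoids the leaked jar only for the missing-input case; any other pre-submit failure would still leave the jar behind on 1.0.3, which is why the fix in the 2.x client rewrite is the real resolution.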