MapReduce, mail # dev - DistributedCache.addFileToClassPath()


Re: DistributedCache.addFileToClassPath()
Ted Yu 2011-02-08, 00:40
The issue was due to (incomplete) refactoring done by my coworker.

FYI
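
For anyone who hits the same symptom, here is a minimal sketch of a submit-time check, assuming the job configuration has first been copied into a plain map of property names to values. The class name ClasspathConfCheck and both helper methods are hypothetical and not part of Hadoop; only the property name mapred.job.classpath.files comes from this thread. The check is deliberately loose (a substring match), since the exact separator used inside the property value is not something this sketch relies on.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: checks a snapshot of the job
// configuration for the classpath property discussed in this thread.
class ClasspathConfCheck {
    // Property name quoted in the original message.
    static final String CLASSPATH_FILES_KEY = "mapred.job.classpath.files";

    /** Returns true if the property is set and its value mentions the expected path. */
    public static boolean mentionsPath(Map<String, String> conf, String expectedPath) {
        String value = conf.get(CLASSPATH_FILES_KEY);
        return value != null && value.contains(expectedPath);
    }

    /** Throws if the entry is missing, so the problem surfaces before any map task runs. */
    public static void requireOnClasspath(Map<String, String> conf, String expectedPath) {
        if (!mentionsPath(conf, expectedPath)) {
            throw new IllegalStateException(CLASSPATH_FILES_KEY + " does not list "
                    + expectedPath + " (current value: " + conf.get(CLASSPATH_FILES_KEY) + ")");
        }
    }

    public static void main(String[] args) {
        // Illustrative values only; a real check would use the actual job conf.
        Map<String, String> conf = new HashMap<>();
        conf.put(CLASSPATH_FILES_KEY, "/user/hadoop/lib/example.jar");
        requireOnClasspath(conf, "/user/hadoop/lib/example.jar");
        System.out.println("classpath entry present");
    }
}
```

Calling requireOnClasspath right after the code that is supposed to register the jar would turn a silently missing property into an immediate error on the client side, instead of a ClassNotFoundException inside a map task.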

On Sat, Feb 5, 2011 at 4:36 PM, Ted Yu <[EMAIL PROTECTED]> wrote:

> I found that the jobCache directory on the grid nodes wasn't created.
> Normally I should find the following:
> /tmp/hadoop-hadoop/mapred/local/taskTracker/archive/us01-ciqps1-name01.carrieriq.com/jobCache/opt/msip/clients/CIQ-Performance/m2mDeployment/3.1.0.8-282685/mmp-dist/work/sim/working/1296857601150
>
> I found this in TaskRunner.java:
>
>             p[i] = DistributedCache.getLocalCache(archives[i], conf,
>                                                   new Path(baseDir),
>                                                   fileStatus,
>                                                   true,
>                                                   Long.parseLong(archivesTimestamps[i]),
>                                                   new Path(workDir.getAbsolutePath()),
>                                                   false,
>                                                   tracker.getAsyncDiskService());
>
> But I don't see log statements in DistributedCache.getLocalCache().
>
> Any hint on what I should check next would be appreciated.
>
>
> On Sat, Feb 5, 2011 at 6:17 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>> We use cdh3b2.
>>
>> Recently we have been experiencing map task failures because of:
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) - 2011-02-05 02:17:23,855 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) - java.io.IOException: Split class com.carrieriq.m2m.platform.mmp2.input.FileListInputSplit not found
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) -       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:326)
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) -       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) -       at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) - Caused by: java.lang.ClassNotFoundException: com.carrieriq.m2m.platform.mmp2.input.FileListInputSplit
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) -       at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) -       at java.security.AccessController.doPrivileged(Native Method)
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) -       at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) -       at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) -       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) -       at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) -       at java.lang.Class.forName0(Native Method)
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) -       at java.lang.Class.forName(Class.java:247)
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) -       at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:907)
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) -       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:323)
>> INFO [2011-02-04 18:17:36] (ExecUtil.java:261) -       ... 2 more
>>
>> I found that the following config parameter was missing from the
>> underlying Job Conf:
>> mapred.job.classpath.files
>>
>> We use the following code:
>>                 Path dest = copyToDfs(jar, jobConf);
>>                 // add URL into class path for grid based job also to ensure
>>                 // flow validation can work using mmp command run
>>                 ClassUtil.addURL(jar.toURL());
>>                 DistributedCache.addFileToClassPath(dest, jobConf);
>>
>> From the log, I verified that ClassUtil.addURL() was called.
>>
>> The following API has no return code and does no logging: