RE: Moving Files to Distributed Cache in MapReduce

I could have sworn that I gave an example earlier this week on how to push and pull stuff from the distributed cache.
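
For reference, here is a minimal sketch of the "pull" side, i.e. getting the local path of thefile.dat as a string. It assumes you already have the local cache paths as strings: in a new-API Mapper that would typically mean calling DistributedCache.getLocalCacheFiles(context.getConfiguration()) inside setup() and calling toString() on each returned Path. In a driver class that extends Configured, getConf() avoids touching the private conf field directly. The class CacheFileLocator and the method findByName are hypothetical names used only for illustration; they are not part of Hadoop.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not a Hadoop class): picks one cached file out of the
// local paths a task sees. The caller is assumed to have converted the Path[]
// from DistributedCache.getLocalCacheFiles(...) into plain strings.
final class CacheFileLocator {
    private CacheFileLocator() {}

    /**
     * Returns the first local path whose last component equals fileName,
     * or null if nothing matches or no paths were supplied.
     */
    static String findByName(String[] localPaths, String fileName) {
        if (localPaths == null) {
            return null; // assumption: a null array means nothing was cached
        }
        for (String candidate : localPaths) {
            Path name = Paths.get(candidate).getFileName();
            if (name != null && name.toString().equals(fileName)) {
                return candidate;
            }
        }
        return null;
    }
}
```

Matching on the file name rather than taking index 0 is a design choice: it keeps working if more than one file is added to the cache.
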
> Date: Fri, 29 Jul 2011 14:51:26 -0700
> Subject: Re: Moving Files to Distributed Cache in MapReduce
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
>
> JobConf is deprecated in 0.20.2, I believe; you're supposed to be using
> Configuration for that.
>
> On Fri, Jul 29, 2011 at 1:59 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
>
> > Is this what you are looking for?
> >
> > http://hadoop.apache.org/common/docs/current/mapred_tutorial.html
> >
> > Search for JobConf.
> >
> > On Fri, Jul 29, 2011 at 1:51 PM, Roger Chen <[EMAIL PROTECTED]> wrote:
> > > Thanks for the response! However, I'm having an issue with this line
> > >
> > > Path[] cacheFiles = DistributedCache.getLocalCacheFiles(conf);
> > >
> > > because conf has private access in org.apache.hadoop.conf.Configured
> > >
> > > On Fri, Jul 29, 2011 at 11:18 AM, Mapred Learn <[EMAIL PROTECTED]> wrote:
> > >
> > >> I hope my previous reply helps...
> > >>
> > >> On Fri, Jul 29, 2011 at 11:11 AM, Roger Chen <[EMAIL PROTECTED]> wrote:
> > >>
> > >> > After moving it to the distributed cache, how would I call it within my
> > >> > MapReduce program?
> > >> >
> > >> > On Fri, Jul 29, 2011 at 11:09 AM, Mapred Learn <[EMAIL PROTECTED]> wrote:
> > >> >
> > >> > > Did you try using the -files option in your hadoop jar command, as in:
> > >> > >
> > >> > > /usr/bin/hadoop jar <jar name> <main class name> -files <absolute path of file to be added to distributed cache> <input dir> <output dir>
> > >> > >
> > >> > >
> > >> > > On Fri, Jul 29, 2011 at 11:05 AM, Roger Chen <[EMAIL PROTECTED]> wrote:
> > >> > >
> > >> > > > Slight modification: I now know how to add files to the distributed
> > >> > > > cache, which can be done via this call placed in the main or run method:
> > >> > > >
> > >> > > >        DistributedCache.addCacheFile(new URI("/user/hadoop/thefile.dat"), conf);
> > >> > > >
> > >> > > > However, I am still having trouble locating the file in the distributed
> > >> > > > cache. *How do I get the path of thefile.dat in the distributed cache
> > >> > > > as a string?* I am using Hadoop 0.20.2.
> > >> > > >
> > >> > > >
> > >> > > > On Fri, Jul 29, 2011 at 10:26 AM, Roger Chen <[EMAIL PROTECTED]> wrote:
> > >> > > >
> > >> > > > > Hi all,
> > >> > > > >
> > >> > > > > Does anybody have examples of how one moves files from the local
> > >> > > > > filesystem/HDFS to the distributed cache in MapReduce? A Google search
> > >> > > > > turned up examples in Pig but not MR.
> > >> > > > >
> > >> > > > > --
> > >> > > > > Roger Chen
> > >> > > > > UC Davis Genome Center
> > >> > > > >