Re: Moving Files to Distributed Cache in MapReduce

We really need to add a working example to the wiki and link to it from the FAQ page. Any volunteers?

On Jul 29, 2011, at 7:49 PM, Michael Segel wrote:

>
> Here's the meat of my post earlier...
> Sample code on putting a file on the cache:
> DistributedCache.addCacheFile(new URI(path + "MyFileName"), conf);
>
> Sample code in pulling data off the cache:
>        // Local copies of the cached files, as seen by this task
>        Path[] localFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());
>        boolean exitProcess = false;
>        int i = 0;
>        while (!exitProcess) {
>            String fileName = localFiles[i].getName();
>            if (fileName.equalsIgnoreCase("model.txt")) {
>                // Build your input file reader on localFiles[i].toString()
>                exitProcess = true;
>            }
>            i++;
>        }
>
>
> Note that this is SAMPLE code. It doesn't handle the case where the file isn't in the cache, so the loop would run past the end of the array localFiles[].
> Also, I set exitProcess to false because it's easier to read this as "Do this loop until the condition exitProcess is true".
>
> When you build your file reader you need the full path, not just the file name. The path will vary when the job runs.
>
> HTH
>
> -Mike
>
>
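A bounds-checked version of the lookup quoted above might look like the sketch below. It is only an illustration: it uses java.nio.file.Path as a stand-in so that it runs without Hadoop on the classpath. In a real job the array would come from DistributedCache.getLocalCacheFiles(...) and hold org.apache.hadoop.fs.Path objects, whose file name is read with getName(). CacheFileLocator and findByName are made-up names, not part of Hadoop.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper for illustration; not a Hadoop class.
final class CacheFileLocator {
    private CacheFileLocator() {}

    // Returns the first entry whose file name matches wantedName
    // (case-insensitive), or an empty Optional if there is none.
    static Optional<Path> findByName(Path[] localFiles, String wantedName) {
        // Assumption: the array may be null when nothing was cached.
        if (localFiles == null) {
            return Optional.empty();
        }
        // A for-each loop cannot run past the end of the array.
        for (Path candidate : localFiles) {
            Path name = candidate.getFileName();
            if (name != null && name.toString().equalsIgnoreCase(wantedName)) {
                // Hand back the full path, since the reader needs more than the name.
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

The caller then decides what to do when the file is missing, for example failing the task with a clear message instead of an ArrayIndexOutOfBoundsException.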
>> From: [EMAIL PROTECTED]
>> To: [EMAIL PROTECTED]
>> Subject: RE: Moving Files to Distributed Cache in MapReduce
>> Date: Fri, 29 Jul 2011 21:43:37 -0500
>>
>>
>> I could have sworn that I gave an example earlier this week on how to push and pull stuff from distributed cache.
>>
>>
>>> Date: Fri, 29 Jul 2011 14:51:26 -0700
>>> Subject: Re: Moving Files to Distributed Cache in MapReduce
>>> From: [EMAIL PROTECTED]
>>> To: [EMAIL PROTECTED]
>>>
>>> jobConf is deprecated in 0.20.2 I believe; you're supposed to be using
>>> Configuration for that
>>>
>>> On Fri, Jul 29, 2011 at 1:59 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
>>>
>>>> Is this what you are looking for?
>>>>
>>>> http://hadoop.apache.org/common/docs/current/mapred_tutorial.html
>>>>
>>>> search for jobConf
>>>>
>>>> On Fri, Jul 29, 2011 at 1:51 PM, Roger Chen <[EMAIL PROTECTED]> wrote:
>>>>> Thanks for the response! However, I'm having an issue with this line
>>>>>
>>>>> Path[] cacheFiles = DistributedCache.getLocalCacheFiles(conf);
>>>>>
>>>>> because conf has private access in org.apache.hadoop.conf.Configured
>>>>>
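On the private-access error: Configured keeps its Configuration in a private field and exposes it through the public getConf() accessor, so a subclass has to call getConf() instead of touching conf directly. Inside a mapper, context.getConfiguration() plays the same role. The sketch below reproduces that pattern with stand-in classes so it compiles without Hadoop. StandInConfiguration, StandInConfigured, ExampleDriver and the key "example.cache.file" are all invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a configuration object; NOT org.apache.hadoop.conf.Configuration.
class StandInConfiguration {
    private final Map<String, String> values = new HashMap<>();

    void set(String key, String value) { values.put(key, value); }

    String get(String key) { return values.get(key); }
}

// Stand-in for a base class that owns the configuration in a private field.
class StandInConfigured {
    private final StandInConfiguration conf; // private: subclasses cannot read it directly

    StandInConfigured(StandInConfiguration conf) { this.conf = conf; }

    // Public accessor, the supported way to reach the configuration.
    public StandInConfiguration getConf() { return conf; }
}

// Hypothetical driver class extending the base class.
class ExampleDriver extends StandInConfigured {
    ExampleDriver(StandInConfiguration conf) { super(conf); }

    String cachedFilePath() {
        // Writing conf.get(...) here would fail to compile with
        // "conf has private access"; going through getConf() works.
        return getConf().get("example.cache.file");
    }
}
```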
>>>>> On Fri, Jul 29, 2011 at 11:18 AM, Mapred Learn <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> I hope my previous reply helps...
>>>>>>
>>>>>> On Fri, Jul 29, 2011 at 11:11 AM, Roger Chen <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>> After moving it to the distributed cache, how would I call it within my
>>>>>>> MapReduce program?
>>>>>>>
>>>>>>> On Fri, Jul 29, 2011 at 11:09 AM, Mapred Learn <[EMAIL PROTECTED]> wrote:
>>>>>>>
>>>>>>>> Did you try using -files option in your hadoop jar command as:
>>>>>>>>
>>>>>>>> /usr/bin/hadoop jar <jar name> <main class name> -files <absolute path of file to be added to distributed cache> <input dir> <output dir>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Jul 29, 2011 at 11:05 AM, Roger Chen <[EMAIL PROTECTED]> wrote:
>>>>>>>>
>>>>>>>>> Slight modification: I now know how to add files to the distributed file
>>>>>>>>> cache, which can be done via this command placed in the main or run class:
>>>>>>>>>
>>>>>>>>>       DistributedCache.addCacheFile(new URI("/user/hadoop/thefile.dat"), conf);
>>>>>>>>>
>>>>>>>>> However, I am still having trouble locating the file in the distributed
>>>>>>>>> cache. *How do I call the file path of thefile.dat in the distributed cache
>>>>>>>>> as a string?* I am using Hadoop 0.20.2.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Jul 29, 2011 at 10:26 AM, Roger Chen <[EMAIL PROTECTED]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Does anybody have examples of how one moves files from the local