Hadoop >> mail # user >> Transfer archives (or any file) from Mapper to Reducer?

biro lehel 2012-05-21, 08:12
Harsh J 2012-05-21, 09:02
Re: Transfer archives (or any file) from Mapper to Reducer?
Be careful putting them in HDFS.  It does not scale very well, as the number of file opens will be on the order of (number of mappers) * (number of reducers).  With a lot of mappers and reducers, you can quickly cause a denial of service on the NameNode.

--Bobby Evans
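
To make the scaling concern above concrete, here is a minimal sketch of the arithmetic. It assumes every reducer opens every mapper's file; the class name, method name, and job sizes are hypothetical and chosen only for illustration.

```java
class FileOpenEstimate {
    /**
     * Estimated HDFS file opens, assuming each reducer opens the file
     * written by each mapper (an assumption for this sketch).
     */
    static long hdfsOpens(long mappers, long reducers) {
        return Math.multiplyExact(mappers, reducers);
    }

    public static void main(String[] args) {
        // Hypothetical job sizes, used only to show the growth.
        System.out.println(hdfsOpens(10, 1));      // 10
        System.out.println(hdfsOpens(1000, 100));  // 100000
    }
}
```

With a single reducer the count grows only linearly with the number of mappers; the product becomes a problem once both sides are large.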

On 5/21/12 4:02 AM, "Harsh J" <[EMAIL PROTECTED]> wrote:


I guess you could write these archives onto HDFS, and have your
reducers read them from a location there, but this method may be a bit
ugly. See http://wiki.apache.org/hadoop/FAQ#Can_I_write_create.2BAC8-write-to_hdfs_files_directly_from_map.2BAC8-reduce_tasks.3F
for properly writing files from tasks onto a DFS, or look at the
MultipleOutputs API class.

Depending on how large these files are, you can also perhaps ship them
in via the KV pairs themselves. A custom key or sort comparator can
further ensure that they are delivered in the first iteration of the
reducer - if the file is required before regular reduce() ops can
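
The ordering idea above (a custom key or sort comparator that delivers the files first) can be sketched in plain Java, without any Hadoop classes. Everything here is a hypothetical stand-in: `TaggedKey`, its `ARCHIVE`/`REGULAR` tags, and `ARCHIVE_FIRST` are names invented for this illustration. In a real job the key would presumably be a Writable type and the comparator would be registered with the job rather than called directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ArchiveFirstOrdering {
    /** Hypothetical composite key: a record-type tag plus the natural key. */
    static final class TaggedKey {
        static final int ARCHIVE = 0;  // lower tag sorts first
        static final int REGULAR = 1;
        final int tag;
        final String key;

        TaggedKey(int tag, String key) {
            this.tag = tag;
            this.key = key;
        }
    }

    /** Compares by tag first, then by natural key, so archive records come first. */
    static final Comparator<TaggedKey> ARCHIVE_FIRST = (a, b) -> {
        int byTag = Integer.compare(a.tag, b.tag);
        return byTag != 0 ? byTag : a.key.compareTo(b.key);
    };

    /** Returns the keys as "tag:key" strings in the order a reducer would see them. */
    static List<String> sortedKeys(List<TaggedKey> keys) {
        List<TaggedKey> copy = new ArrayList<>(keys);
        copy.sort(ARCHIVE_FIRST);
        List<String> out = new ArrayList<>();
        for (TaggedKey k : copy) {
            out.add(k.tag + ":" + k.key);
        }
        return out;
    }

    public static void main(String[] args) {
        List<TaggedKey> keys = Arrays.asList(
            new TaggedKey(TaggedKey.REGULAR, "b"),
            new TaggedKey(TaggedKey.ARCHIVE, "map-2.tar"),
            new TaggedKey(TaggedKey.REGULAR, "a"),
            new TaggedKey(TaggedKey.ARCHIVE, "map-1.tar"));
        System.out.println(sortedKeys(keys));
    }
}
```

Because the tag is compared before the natural key, every archive record sorts ahead of every regular record regardless of the key text.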

On Mon, May 21, 2012 at 1:42 PM, biro lehel <[EMAIL PROTECTED]> wrote:
> Dear all,
> In my Mapper, I run a script that processes my set of input text files and creates some other text files from them (this is done locally on the FS of my nodes); as a result, each MapTask produces an archive. My issue is that I'm looking for a way for the Reducer to "take" these archives as some kind of input. I understand that communication between Mapper and Reducer is done by means of the key-value pairs in the Context, but what I need is to transfer these archive files to the respective Reducer (I would probably have a single Reducer, so all the files should be transferred/copied there somehow).
> Is this possible? Is there a way to transfer files from Mapper to Reducer? If not, what is the best approach in scenarios like mine? Any suggestions would be greatly appreciated.
> Thank you in advance,
> Lehel.

Harsh J