MapReduce, mail # user - Map output files and partitions.


Re: Map output files and partitions.
Harsh J 2012-12-14, 07:29
Map output files, by which you perhaps mean the intermediate data files
used for temporary K/V persistence, are stored as IFiles. They use
neither text nor sequence files (although historically they did use
sequence files at some point).

You can read the IFile sources at
http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/IFile.java
for more technical details. The format is similar to SequenceFiles
in some ways.
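
On the separator question, here is a toy sketch of the general idea
behind a partitioned K/V file that needs no separator bits: records are
length-prefixed, and a small index records where each partition starts
and how long it is. This is only an illustration in Python, not Hadoop's
actual on-disk format or API; the function names, the fixed 4-byte
length fields, and the in-memory index are all assumptions made for the
example. The IFile sources remain the reference for the real encoding.

```python
import io
import struct


def write_partitioned(records_by_partition):
    """Write records partition by partition.

    Returns (data, index), where index[p] is the
    (start_offset, length_in_bytes) of partition p inside data.
    Hypothetical helper, not part of Hadoop.
    """
    buf = io.BytesIO()
    index = []
    for records in records_by_partition:
        start = buf.tell()
        for key, value in records:
            # Each record: 4-byte key length, 4-byte value length,
            # then the raw key bytes and the raw value bytes.
            buf.write(struct.pack(">ii", len(key), len(value)))
            buf.write(key)
            buf.write(value)
        index.append((start, buf.tell() - start))
    return buf.getvalue(), index


def read_partition(data, index, partition):
    """Read back only the records of one partition, using the index."""
    start, length = index[partition]
    segment = io.BytesIO(data[start:start + length])
    records = []
    while segment.tell() < length:
        key_len, value_len = struct.unpack(">ii", segment.read(8))
        key = segment.read(key_len)
        value = segment.read(value_len)
        records.append((key, value))
    return records


if __name__ == "__main__":
    data, index = write_partitioned([
        [(b"apple", b"1"), (b"avocado", b"2")],  # partition 0
        [(b"banana", b"3")],                     # partition 1
    ])
    # Only partition 1 is decoded; partition 0 is skipped via the index.
    print(read_partition(data, index, 1))
```

Because the reader jumps straight to the recorded offset and stops after
the recorded length, nothing inside the data has to mark where one
partition ends and the next begins.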

On Fri, Dec 14, 2012 at 12:45 PM, Pedro Sá da Costa <[EMAIL PROTECTED]> wrote:
> Hi,
>
> There are only 2 types of map output files: Sequence and Text files. If
> those files are going to be used as input to several reduce tasks,
> they need to be partitioned into blocks. Are there any SEPARATOR bits
> that delimit each partition? Can I read a specific partition of a map
> output file? Is there an API for that?
>
> --
> Best regards,

--
Harsh J