Jay Vyas 2012-12-03, 05:52
Jeff LI 2012-12-02, 22:03
Re: Input splits for sequence file input
Harsh J 2012-12-03, 04:27
Hi Jeff,
This has been asked several times before (check out http://search-hadoop.com please). The answer is (3) for SequenceFiles (due to no notion of records) and (2) as a general thought (i.e. text files, etc.).

On Mon, Dec 3, 2012 at 3:33 AM, Jeff LI <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I was reading on the relationship between input splits and HDFS blocks and a
> question came up to me:
>
> If a logical record crosses HDFS block boundary, let's say block#1 and
> block#2, does the mapper assigned with this input split asks for (1) both
> blocks, or (2) block#1 and just the part of block#2 that this logical record
> extends to, or (3) block#1 and part of block#2 up to some sync point that
> covers this particular logical record? Note the input is sequence file.
>
> I guess my question really is: does Hadoop operate on a block basis or does
> it respect some sort of logical structure within a block when it's trying to
> feed the mappers with input data.
>
> Cheers
>
> Jeff

--
Harsh J
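
To make case (2) concrete, here is a minimal sketch in plain Python of how a line-oriented reader can treat split boundaries. It is a toy model over an in-memory byte string, not Hadoop code: `read_split`, `data`, and `split_size` are hypothetical names chosen for illustration. The idea is that a reader whose split does not start at offset 0 discards everything up to the first newline, and every reader keeps going past the end of its split until it finishes the record it is in. A SequenceFile reader follows the same pattern, with sync markers playing the role of the newline.

```python
def read_split(data: bytes, start: int, length: int) -> list[bytes]:
    """Return the newline-delimited records owned by the split [start, start+length).

    Toy model only: the "file" is an in-memory byte string and the record
    delimiter is b"\n". Names and layout are assumptions for illustration.
    """
    end = start + length
    pos = start

    if start != 0:
        # The previous split's reader finishes whatever line we landed in
        # (or the line that begins exactly at our start), so skip past it.
        newline = data.find(b"\n", pos)
        if newline == -1:
            return []
        pos = newline + 1

    records = []
    # Note "<= end": a line beginning exactly at the boundary belongs to us,
    # because the next split's reader will skip it.
    while pos <= end and pos < len(data):
        newline = data.find(b"\n", pos)
        if newline == -1:
            records.append(data[pos:])  # last line without trailing newline
            break
        records.append(data[pos:newline])  # may extend past `end`
        pos = newline + 1
    return records


# Four records, cut into fixed 8-byte "blocks" that ignore record boundaries.
data = b"alpha\nbravo\ncharlie\ndelta\n"
split_size = 8
per_split = [
    read_split(data, start, min(split_size, len(data) - start))
    for start in range(0, len(data), split_size)
]
```

Each record is read by exactly one split even though "bravo" and "charlie" straddle the 8-byte boundaries, which is why the reader has to fetch part of the next block rather than stop at the block edge.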