Ramasubramanian Narayanan... 2012-11-07, 14:52
Files are split into blocks purely at block-size boundaries, without
regard to their content. Readers can read the whole file by reading
all of its blocks in sequence. This is handled transparently by the
underlying DFS reader classes, so a developer does not have to care
about it.
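To illustrate the point, here is a minimal sketch in plain Python (not
the HDFS API; the function names and the tiny block size are made up
for the example). It cuts a byte string at fixed offsets and shows
that reading the blocks back in order gives the original bytes:

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut data at fixed byte offsets, ignoring what the bytes mean."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def read_whole_file(blocks: list[bytes]) -> bytes:
    """Reassemble the file by concatenating its blocks in order."""
    return b"".join(blocks)


# 1024 bytes standing in for an image file (hypothetical data).
image = bytes(range(256)) * 4

# A toy block size; real HDFS blocks are far larger.
blocks = split_into_blocks(image, block_size=300)

# The reader sees the same bytes that were written.
restored = read_whole_file(blocks)
```

The last block is simply shorter than the others; nothing about the
image format is consulted at any point.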
HDFS does not care about what _type_ of file you store; it's agnostic
and just splits it based on the block size. It's up to the apps not to
split the reading of a file across blocks if it can't be parallelized.
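As a rough sketch of that application-side decision (plain Python, not
Hadoop code; `plan_splits` and its parameters are hypothetical, and
real split sizing has more knobs than this), an app can either hand
each block-sized range to its own reader or give the whole file to a
single reader:

```python
def plan_splits(file_length: int, block_size: int,
                splittable: bool) -> list[tuple[int, int]]:
    """Return (offset, length) byte ranges, one per reader."""
    if not splittable or file_length == 0:
        # One reader gets the whole file, however many blocks it spans.
        return [(0, file_length)]
    # One reader per block-sized range; the last range may be shorter.
    return [(offset, min(block_size, file_length - offset))
            for offset in range(0, file_length, block_size)]


# A line-oriented text file can be read in parallel, range by range.
text_splits = plan_splits(1024, 300, splittable=True)

# An image must be decoded as a whole, so it gets a single range.
image_splits = plan_splits(1024, 300, splittable=False)
```

In Hadoop MapReduce, as far as I know, the usual way to get the second
behaviour is to override FileInputFormat's isSplitable method so that
it returns false for such files.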
On Wed, Nov 7, 2012 at 8:22 PM, Ramasubramanian Narayanan
<[EMAIL PROTECTED]> wrote:
> I have basic doubt... How Hadoop splits an Image file into blocks and puts
> in HDFS? Usually Image file cannot be splitted right how it is happening in