Blocks are split at arbitrary block-size boundaries. Readers can read the whole file by reading all blocks together; this is handled transparently by the underlying DFS reader classes themselves, so a developer does not have to care about it.
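For example, here's a minimal sketch of reading a file back in one piece (the class name and /user/example/photo.jpg path are hypothetical, just for illustration); the FileSystem client fetches each block from its DataNode under the hood:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class ReadWholeFile {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Hypothetical path, purely for illustration.
    try (FSDataInputStream in = fs.open(new Path("/user/example/photo.jpg"))) {
      // Copies the entire file to stdout; the client transparently
      // streams block after block from the relevant DataNodes.
      IOUtils.copyBytes(in, System.out, 4096, false);
    }
  }
}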
HDFS does not care about what _type_ of file you store; it's agnostic and just splits it based on the block size. It's up to the applications not to split a reader across blocks if the file format can't be parallelized.
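On the MapReduce side, the usual way to do that for a format that genuinely can't be split (images being the classic case) is to mark the input format non-splittable. A sketch, with WholeImageInputFormat as a made-up name:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

// Hypothetical format name; returning false from isSplitable() means each
// file becomes exactly one input split, so no reader ever starts mid-image.
public abstract class WholeImageInputFormat
    extends FileInputFormat<NullWritable, BytesWritable> {
  @Override
  protected boolean isSplitable(JobContext context, Path filename) {
    return false;
  }
}

With that, each map task receives a whole file as its split, never a fragment of one, even though the file's bytes are still stored as ordinary HDFS blocks.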
On Wed, Nov 7, 2012 at 8:22 PM, Ramasubramanian Narayanan <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have a basic doubt... How does Hadoop split an image file into blocks
> and put them in HDFS? Usually an image file cannot be split, right? How
> does this happen in Hadoop?
>
> regards,
> Rams
-- Harsh J