Why big block size for HDFS?
Hi,

In many places it is written that HDFS stores big blocks to avoid a huge
number of disk seeks: once we seek to the start of a block, the data
transfer rate dominates and no further seeks are needed. I am not sure I
have understood this correctly.
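
For reference, here is the usual back-of-the-envelope reasoning as a minimal
sketch. The figures are assumptions, not from this thread: roughly 10 ms per
seek and roughly 100 MB/s sustained transfer rate; real disks vary.

    // Seek-vs-transfer amortization, with assumed disk figures.
    public class SeekOverhead {
        static final double SEEK_TIME_S = 0.010;        // assumed average seek time
        static final double TRANSFER_MB_PER_S = 100.0;  // assumed sustained transfer rate

        public static void main(String[] args) {
            long[] blockSizesMb = {4, 64, 128, 256};
            for (long mb : blockSizesMb) {
                // Time to stream one block off the platter once positioned.
                double transferTimeS = mb / TRANSFER_MB_PER_S;
                // Fraction of total read time spent on the single seek per block.
                double seekFraction = SEEK_TIME_S / (SEEK_TIME_S + transferTimeS);
                System.out.printf("block = %4d MB: seek overhead = %.2f%%%n",
                        mb, 100.0 * seekFraction);
            }
        }
    }

Under those assumed numbers, a 128 MB block takes about 1.28 s to transfer, so
the one seek per block is well under 1% of the read time, whereas with 4 MB
blocks it would be around 20%.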

My question is: no matter what block size we decide on, the data finally
gets written to the computer's HDD, which is formatted with a block size in
the KB range, and when writing to the local FS (not HDFS) there is no
guarantee that the blocks we write are contiguous, so there would be disk
seeks anyway. HDFS's assumption would only hold if the underlying FS
guaranteed that the data is written in contiguous blocks.
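
On the "block size we decide" part: the HDFS block size is a per-file
attribute chosen by the client, separate from the local FS allocation unit on
the DataNodes. A minimal sketch of setting and reading it back (the path,
replication, and 128 MB size are just illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockSizeExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();     // picks up core-site.xml / hdfs-site.xml
            FileSystem fs = FileSystem.get(conf);

            Path p = new Path("/tmp/blocksize-demo.bin"); // illustrative path
            long blockSize = 128L * 1024 * 1024;          // request 128 MB blocks for this file

            // create(path, overwrite, bufferSize, replication, blockSize)
            FSDataOutputStream out = fs.create(p, true, 4096, (short) 3, blockSize);
            out.write(new byte[]{1, 2, 3});
            out.close();

            // Read the block size back from the file's metadata.
            System.out.println("block size = " + fs.getFileStatus(p).getBlockSize());
            fs.close();
        }
    }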

Can someone explain this a bit?
Thanks,
Rahul