MapReduce >> mail # user >> Why big block size for HDFS.


Rahul Bhattacharjee 2013-03-31, 16:55
John Lilley 2013-03-31, 18:58
Re: Why big block size for HDFS.
Thanks a lot John, Azurya.

I guessed it was an optimization for the HDD. In that case it might be good to
defragment the underlying disks during general maintenance downtime.

Thanks,
Rahul
On Mon, Apr 1, 2013 at 12:28 AM, John Lilley <[EMAIL PROTECTED]> wrote:

> From: Rahul Bhattacharjee [mailto:[EMAIL PROTECTED]]
> Subject: Why big block size for HDFS.
>
> > Many places it has been written that to avoid a huge number of disk
> > seeks, we store big blocks in HDFS, so that once we seek to the
> > location, only the data transfer rate matters and there are no more
> > seeks. I am not sure I have understood this correctly.
>
> > My question is, no matter what block size we decide on, the data
> > finally gets written to the computer's HDD, which is formatted with a
> > block size in KBs, and when writing to the FS (not HDFS) it is not
> > guaranteed that the blocks we write are contiguous, so there would be
> > disk seeks anyway. The assumption made by HDFS would only hold if the
> > underlying FS guaranteed to write the data in contiguous blocks.
>
>
> > Can someone explain a bit?
> >
> > Thanks,
> > Rahul
>
>
> While there are no guarantees that disk storage will be contiguous, the
> OS will attempt to keep large files contiguous (and may even defragment
> them over time), and if all files are written using large blocks, this is
> more likely to be the case.  If storage is contiguous, you can write a
> complete track without seeking.  Track sizes vary, but a 1TB disk might
> have about 500KB per track.  Stepping to an adjacent track is also much
> cheaper than an average seek and, as you might expect, has been optimized
> in hardware to assist sequential I/O.  However, when you switch storage
> units you will probably incur at least one full seek at the start of the
> block (since the head was probably somewhere else at the time).  The
> result is that, on average, writing sequential files is very fast
> (>100MB/sec on a typical SATA drive).  But I think the per-block overhead
> has more to do with finding where to read the next block from, assuming
> the data has been distributed evenly.
>
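A quick back-of-the-envelope calculation makes the amortization argument
concrete. This is a minimal sketch with assumed numbers (roughly 10 ms per
seek, 100 MB/s sequential transfer, one seek per block in the worst case),
not figures from the thread:

    // SeekOverhead.java -- illustrative only; the seek time and transfer
    // rate below are assumptions, not measurements.
    public class SeekOverhead {
        public static void main(String[] args) {
            double seekMs = 10.0;             // assumed average seek time
            double transferMBperSec = 100.0;  // assumed sequential transfer rate
            long fileMB = 1024;               // read a 1 GB file

            for (long blockMB : new long[] {1, 64, 128, 512}) {
                long blocks = fileMB / blockMB;
                double seekSec = blocks * seekMs / 1000.0;     // one seek per block
                double transferSec = fileMB / transferMBperSec;
                double overheadPct = 100.0 * seekSec / (seekSec + transferSec);
                System.out.printf("block=%4d MB  seeks=%4d  seek overhead=%4.1f%%%n",
                        blockMB, blocks, overheadPct);
            }
        }
    }

With 1 MB blocks the seeks cost as much time as the transfer itself (about
50% overhead); at 128 MB they are under 1%.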
>
> So consider connection overhead when the data is distributed.  I am no
> expert on the Hadoop internals, but I suspect that somewhere a TCP
> connection is opened to transfer data.  Whether that connection overhead
> is reduced by maintaining pools of open connections, I don’t know.  But
> let’s assume there is *some* overhead in switching the data transfer from
> machine “A”, which owns block “1000”, to machine “B”, which owns block
> “1001”.  The larger the block size, the less significant this overhead is
> relative to the sequential transfer rate.
>
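The per-block switching John describes is visible from the standard client
API: HDFS reports a separate set of host locations for every block, and a
sequential reader has to change datanodes (and connections) at block
boundaries. A small sketch using the public FileSystem API; the path is
hypothetical and the configuration is whatever core-site.xml/hdfs-site.xml
provide:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListBlockLocations {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();   // reads the cluster config on the classpath
            FileSystem fs = FileSystem.get(conf);
            Path path = new Path("/data/example.dat");  // hypothetical file

            FileStatus status = fs.getFileStatus(path);
            // One BlockLocation per HDFS block; each block may live on different datanodes.
            BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
            for (BlockLocation b : blocks) {
                System.out.printf("offset=%d length=%d hosts=%s%n",
                        b.getOffset(), b.getLength(), String.join(",", b.getHosts()));
            }
            fs.close();
        }
    }

The fewer the blocks, the fewer host switches show up in that listing for a
file of a given size.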
>
> In addition, MapR/YARN has an easier time of scheduling if there are
> fewer blocks.
>
> --john
>
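For reference, the trade-off discussed above is controlled by the HDFS block
size, which defaults cluster-wide via dfs.blocksize in hdfs-site.xml
(dfs.block.size in older releases) and can also be set per file at creation
time. A minimal sketch; the 256 MB value and the path are only illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CreateWithBlockSize {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            long blockSize = 256L * 1024 * 1024;  // 256 MB, illustrative only
            short replication = 3;
            int bufferSize = 4096;

            // FileSystem.create allows overriding the default block size per file.
            Path path = new Path("/data/big-blocks.dat");   // hypothetical path
            try (FSDataOutputStream out =
                     fs.create(path, true, bufferSize, replication, blockSize)) {
                out.writeBytes("hello\n");
            }
            fs.close();
        }
    }

Larger blocks mean fewer seeks, fewer connection switches, and fewer splits
for the scheduler to place.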