MapReduce, mail # user - HDFS dfs.client.read.shortcircuit.skip.checksum


jlei liu 2012-09-15, 10:12
Re: HDFS dfs.client.read.shortcircuit.skip.checksum
Todd Lipcon 2012-09-17, 21:36
Hi LiuLei,

Since you're using CDH3 (a 1.x-derived distribution), you are using the old
checksum implementations, which are written in pure Java.

In Hadoop 2.0 (or CDH4), we have JNI-based checksumming that uses the
hardware CRC32 instruction (SSE4.2) introduced with Intel's Nehalem
processors. This is several times faster than the pure-Java code.

My guess is that this accounts for the substantial difference. You could
try re-running your test on a newer version to confirm.
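
For a rough sense of how much time checksumming adds to each 64k read, a
small stand-alone timing loop can help. The sketch below is not code from
this thread and does not use Hadoop's own checksum classes: the class name
ChecksumSketch and the method chunkChecksums are hypothetical, and the
512-byte chunk size is an assumption modeled on the usual HDFS
bytes-per-checksum setting. It uses the JDK's java.util.zip.CRC32, which may
itself be hardware-accelerated on recent JVMs, so treat the numbers only as
an indication.

```java
import java.util.Random;
import java.util.zip.CRC32;

public class ChecksumSketch {

    // Computes one CRC32 value per fixed-size chunk of the buffer.
    // The final chunk may be shorter than bytesPerChecksum.
    static long[] chunkChecksums(byte[] data, int bytesPerChecksum) {
        int chunks = (data.length + bytesPerChecksum - 1) / bytesPerChecksum;
        long[] sums = new long[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            int off = i * bytesPerChecksum;
            int len = Math.min(bytesPerChecksum, data.length - off);
            crc.reset();
            crc.update(data, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    public static void main(String[] args) {
        // One 64k buffer, matching the read size mentioned in the thread.
        byte[] buf = new byte[64 * 1024];
        new Random(42).nextBytes(buf);

        int iterations = 10_000;   // arbitrary; assumption for illustration
        long sink = 0;             // keeps the JIT from discarding the work

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += chunkChecksums(buf, 512)[0];
        }
        long elapsedNanos = System.nanoTime() - start;

        double microsPerRead = elapsedNanos / 1000.0 / iterations;
        System.out.printf("checksum cost per 64k buffer: %.2f us (sink=%d)%n",
                microsPerRead, sink);
    }
}
```

Comparing this per-buffer cost against the measured time of a 64k read with
and without dfs.client.read.shortcircuit.skip.checksum gives an idea of
whether checksumming plausibly explains the gap.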

-Todd

On Sat, Sep 15, 2012 at 7:13 AM, jlei liu <[EMAIL PROTECTED]> wrote:

> I read 64k of data from the file each time.
--
Todd Lipcon
Software Engineer, Cloudera