HDFS >> mail # dev >> Fwd: problem with HDFS caching in Hadoop 2.3


Fwd: problem with HDFS caching in Hadoop 2.3
Adding the dev list.
From: "hwpstorage" <[EMAIL PROTECTED]>
Date: Mar 7, 2014 11:38 PM
Subject: problem with HDFS caching in Hadoop 2.3
To: <[EMAIL PROTECTED]>
Cc:

Hello,

It looks like HDFS caching is not working.
The log file I am trying to cache is around 200 MB. The Hadoop cluster has
3 nodes, each with 4 GB of memory.

-bash-4.1$ hdfs cacheadmin -addPool wptest1
Successfully added cache pool wptest1.

-bash-4.1$ /hadoop/hadoop-2.3.0/bin/hdfs cacheadmin -listPools
Found 1 result.
NAME     OWNER  GROUP  MODE            LIMIT  MAXTTL
wptest1  hdfs   hdfs   rwxr-xr-x   unlimited   never

-bash-4.1$ hdfs cacheadmin -addDirective -path hadoop003.log -pool wptest1
Added cache directive 1

-bash-4.1$  time /hadoop/hadoop-2.3.0/bin/hadoop fs -tail hadoop003.log
real    0m2.796s
user    0m4.263s
sys     0m0.203s

-bash-4.1$  time /hadoop/hadoop-2.3.0/bin/hadoop fs -tail hadoop003.log
real    0m3.050s
user    0m4.176s
sys     0m0.192s

It is weird that the cache status shows 0 bytes cached:

-bash-4.1$ /hadoop/hadoop-2.3.0/bin/hdfs cacheadmin -listDirectives -stats -path hadoop003.log -pool wptest1
Found 1 entry
ID POOL     REPL EXPIRY  PATH                      BYTES_NEEDED  BYTES_CACHED  FILES_NEEDED  FILES_CACHED
 1 wptest1     1 never   /user/hdfs/hadoop003.log     209715206             0             1             0
-bash-4.1$ file /hadoop/hadoop-2.3.0/lib/native/libhadoop.so.1.0.0
/hadoop/hadoop-2.3.0/lib/native/libhadoop.so.1.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped

I also ran the word count example on the same file. With the cache
directive in place the job always takes 40 seconds; the same map/reduce
job without the cache takes 42 seconds.
Is there anything wrong?
Thanks a lot

 