Have you looked at DataNode metrics?
$ sudo -u hdfs bash -c 'echo -e "open `cat $HADOOP_SECURE_DN_PID_DIR/hadoop_secure_dn.pid`\n get -b Hadoop:name=DataNode,service=DataNode *" | /usr/bin/java -jar /home/rajive/w/jmx/jmxterm.jar -n '
Welcome to JMX terminal. Type "help" for available commands.
#Connection to 31177 is opened
#mbean = Hadoop:name=DataNode,service=DataNode:
tag.context = dfs;
tag.sessionId = null;
tag.hostName = phanpy-dn001.hadoop.apache.org;
bytes_written = 3048581049273;
bytes_read = 497034196749;
blocks_written = 326391;
blocks_read = 249262;
blocks_replicated = 401423;
blocks_removed = 29844;
blocks_verified = 858155;
block_verification_failures = 0;
reads_from_local_client = 6671;
reads_from_remote_client = 242591;
writes_from_local_client = 7331;
writes_from_remote_client = 319056;
readBlockOp_num_ops = 249262;
readBlockOp_avg_time = 17.0;
writeBlockOp_num_ops = 326387;
writeBlockOp_avg_time = 61.5;
blockChecksumOp_num_ops = 22431;
blockChecksumOp_avg_time = 59.0;
copyBlockOp_num_ops = 0;
copyBlockOp_avg_time = 0.0;
replaceBlockOp_num_ops = 0;
replaceBlockOp_avg_time = 0.0;
heartBeats_num_ops = 590662;
heartBeats_avg_time = 19.0;
blockReports_num_ops = 571;
blockReports_avg_time = 6205.0;
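The per-client counters above already give a rough answer: reads_from_remote_client and writes_from_remote_client count block operations that crossed the network, while the *_from_local_client counters stayed on the same machine. A quick sketch of the arithmetic, with the values from the dump above hard-coded for illustration:

```python
# Estimate how much block traffic crossed the network, using the
# DataNode counters shown above (values copied from the jmxterm dump).
metrics = {
    "reads_from_local_client": 6671,
    "reads_from_remote_client": 242591,
    "writes_from_local_client": 7331,
    "writes_from_remote_client": 319056,
}

total_reads = metrics["reads_from_local_client"] + metrics["reads_from_remote_client"]
total_writes = metrics["writes_from_local_client"] + metrics["writes_from_remote_client"]

remote_read_pct = 100.0 * metrics["reads_from_remote_client"] / total_reads
remote_write_pct = 100.0 * metrics["writes_from_remote_client"] / total_writes

print(f"remote reads:  {remote_read_pct:.1f}% of {total_reads} block reads")
print(f"remote writes: {remote_write_pct:.1f}% of {total_writes} block writes")
```

On this node almost all block I/O was remote (~97%), which is what you'd expect when most clients run off-cluster. Note these are per-DataNode counters since the last restart, so you'd sum them across all DataNodes for a cluster-wide picture.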
Mark question wrote on 10/21/11 at 17:27:23 -0700:
> I wonder if there is a way to measure how many of the data blocks have
> been transferred over the network? Or, more generally, how many times was
> there a connection/contact between different machines?
> I thought of checking the NameNode log file, which usually shows blk_....
> from src= to dst ..., but I'm not sure if it's correct to count those lines.
> Any ideas are helpful.
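On counting the src=/dst= lines: that can work as a rough tally, but it double-counts re-replications and misses short-circuit local reads, so the JMX counters are usually cleaner. If you do want to grep, something like the following sketch would do it — note the log path and exact line format vary by Hadoop version, and the sample lines below are illustrative, not real NameNode output:

```shell
# Hedged sketch: count log lines that record a block moving between two
# machines. Adjust the path and pattern to match your Hadoop version's
# actual log format before trusting the number.
cat > /tmp/sample-namenode.log <<'EOF'
2011-10-21 17:00:01 INFO ... blk_123 src: /10.0.0.1:50010 dst: /10.0.0.2:50010
2011-10-21 17:00:02 INFO ... unrelated line
2011-10-21 17:00:03 INFO ... blk_456 src: /10.0.0.3:50010 dst: /10.0.0.2:50010
EOF
grep -c 'blk_.*src:.*dst:' /tmp/sample-namenode.log
```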