I am trying to collect statistics about my HDFS cluster in the lab. One stat
I am really interested in is the total throughput (gigabytes served) of the
cluster over a 24-hour period. I suppose I can look for 'cmd=open' in the
NameNode's log file, but how accurate is that? There seems to be no 'cmd=close'
entry to tell a full file read from a partial one. Is there a better way to get this number?
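
For reference, here is a rough sketch of the 'cmd=open' approach I have in mind. It assumes the audit entries are whitespace-separated key=value pairs (ugi=, ip=, cmd=, src=, ...) and that paths contain no spaces. The sample lines, the function names, and the file-size table are made up for illustration. Since an open does not tell me how much of the file was actually read, the best this can give is an upper bound.

```python
import re
from collections import Counter

# Assumption: each audit entry carries whitespace-separated key=value fields.
FIELD = re.compile(r"(\w+)=(\S+)")

def count_opens(lines):
    """Count 'cmd=open' audit entries per source path."""
    opens = Counter()
    for line in lines:
        fields = dict(FIELD.findall(line))
        if fields.get("cmd") == "open" and "src" in fields:
            opens[fields["src"]] += 1
    return opens

def upper_bound_bytes(opens, sizes):
    """Upper bound on bytes served: every open is treated as a full read.

    `sizes` maps path -> file length in bytes (hypothetical input, e.g.
    collected separately from a directory listing).
    """
    return sum(n * sizes.get(path, 0) for path, n in opens.items())

# Made-up sample entries in the assumed layout.
sample = [
    "2012-01-01 10:00:00,000 INFO FSNamesystem.audit: ugi=alice ip=/10.0.0.1 cmd=open src=/data/a.bin dst=null perm=null",
    "2012-01-01 10:00:05,000 INFO FSNamesystem.audit: ugi=bob ip=/10.0.0.2 cmd=listStatus src=/data dst=null perm=null",
    "2012-01-01 10:01:00,000 INFO FSNamesystem.audit: ugi=bob ip=/10.0.0.2 cmd=open src=/data/a.bin dst=null perm=null",
    "2012-01-01 10:02:00,000 INFO FSNamesystem.audit: ugi=alice ip=/10.0.0.1 cmd=open src=/data/b.bin dst=null perm=null",
]

opens = count_opens(sample)
total = upper_bound_bytes(opens, {"/data/a.bin": 100, "/data/b.bin": 50})
```

A client that opens a file and reads only one block is counted as a full read here, which is exactly the inaccuracy I am worried about.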