MapReduce >> mail # user >> Discrepancy in the values of consumed disk space by hadoop


Re: Discrepancy in the values of consumed disk space by hadoop
Hi,

I think you are referring to the DFS Used value (from the NameNode report)
and the Total size value (from fsck), right?

*DFS Used:* This is the total HDFS space used across all the connected
data nodes, replicas included; in your case 230296610816 B (214.48 GB).
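
As a quick check on the report quoted below, the three per-datanode DFS Used values add up exactly to the cluster-wide figure. A minimal Python sketch; the variable names are mine, and the values are copied from the dfsadmin output:

```python
# Per-datanode "DFS Used" values in bytes, copied from the dfsadmin report.
dfs_used_per_node = {
    "slave1": 77728510976,
    "slave2": 76361811968,
    "slave3": 76206287872,
}

# The cluster-wide "DFS Used" line is the sum over all live datanodes.
cluster_dfs_used = sum(dfs_used_per_node.values())

print(cluster_dfs_used)                    # 230296610816
print(round(cluster_dfs_used / 2**30, 2))  # 214.48 (the report's "GB" is 2**30 bytes)
```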
*Total size:* The fsck utility walks the files in the namespace and checks
their blocks one by one, including the replication state of each block. It
retrieves all of this information from the NameNode only. The fsck total
size is therefore the size of all blocks on HDFS counted once, excluding
replicas.
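
To put rough numbers on this, here is a small Python sketch that scales the fsck total by the reported average block replication and compares the result with DFS Used. It is only an estimate under one assumption of mine: that the per-block average replication can be applied uniformly to bytes, which is not exact when blocks differ in size. The helper name `expected_raw_bytes` is made up for illustration.

```python
def expected_raw_bytes(logical_bytes: int, avg_replication: float) -> float:
    """Estimate raw bytes on disk as logical size times average replication."""
    return logical_bytes * avg_replication


fsck_total_size = 75245213337  # "Total size" from fsck, replicas excluded
avg_replication = 2.0024862    # "Average block replication" from fsck
dfs_used = 230296610816        # "DFS Used" from dfsadmin -report

expected = expected_raw_bytes(fsck_total_size, avg_replication)
gap = dfs_used - expected

print(f"expected from replication: {expected / 2**30:.2f} GB")
print(f"reported DFS Used:         {dfs_used / 2**30:.2f} GB")
print(f"remaining difference:      {gap / 2**30:.2f} GB")
```

With these figures, replication accounts for roughly 150 of the 230 billion bytes reported as DFS Used. The numbers posted in this thread do not say where the rest comes from.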

Hope this will help you.

Thanks
On Sun, Aug 11, 2013 at 10:44 PM, Yogini Gulkotwar <
[EMAIL PROTECTED]> wrote:

> Hi All,
>
> I have a CDH4 hadoop cluster setup with 3 datanodes and a data replication
> factor of 2.
>
> When I try to check the consumed dfs space, I get different values using
> the "hdfs dfsadmin -report" and "hdfs fsck" command.
> Could anyone please help me understand the reason behind the discrepancy
> in the values?
>
>  I get the following output:
>
> *# sudo -u hdfs hdfs dfsadmin -report*
>
>
> Configured Capacity: 321252989337600 (292.18 TB)
> Present Capacity: 264896108259328 (240.92 TB)
> DFS Remaining: 264665811648512 (240.71 TB)
> DFS Used: 230296610816 (214.48 GB)
> DFS Used%: 0.09%
> Under replicated blocks: 19
> Blocks with corrupt replicas: 0
> Missing blocks: 0
>
> -------------------------------------------------
> Datanodes available: 3 (3 total, 0 dead)
>
> Live datanodes:
> Name: (slave1)
> Hostname: localhost
> Decommission Status : Normal
> Configured Capacity: 107084329779200 (97.39 TB)
> DFS Used: 77728510976 (72.39 GB)
> Non DFS Used: 18784664751104 (17.08 TB)
> DFS Remaining: 88221936517120 (80.24 TB)
> DFS Used%: 0.07%
> DFS Remaining%: 82.39%
> Last contact: Fri Aug 09 13:26:38 IST 2013
>
>
> Name: (slave3)
> Hostname: localhost
> Decommission Status : Normal
> Configured Capacity: 107084329779200 (97.39 TB)
> DFS Used: 76206287872 (70.97 GB)
> Non DFS Used: 18786185925632 (17.09 TB)
> DFS Remaining: 88221937565696 (80.24 TB)
> DFS Used%: 0.07%
> DFS Remaining%: 82.39%
> Last contact: Fri Aug 09 13:26:37 IST 2013
>
>
> Name:(slave2)
> Hostname: localhost
> Decommission Status : Normal
> Configured Capacity: 107084329779200 (97.39 TB)
> DFS Used: 76361811968 (71.12 GB)
> Non DFS Used: 18786030401536 (17.09 TB)
> DFS Remaining: 88221937565696 (80.24 TB)
> DFS Used%: 0.07%
> DFS Remaining%: 82.39%
>
>
> --------------------------------------------------------------------------------------------------------------------------
> *# sudo -u hdfs hadoop fsck /*
>
>
> Connecting to namenode via http://master1:50070
>
>
> Status: HEALTHY
>  Total size: 75245213337 B
>  Total dirs: 3203
>  Total files: 7893
>  Total blocks (validated): 7642 (avg. block size 9846272 B)
>  Minimally replicated blocks: 7642 (100.0 %)
>  Over-replicated blocks: 0 (0.0 %)
>  Under-replicated blocks: 19 (0.24862601 %)
>  Mis-replicated blocks: 0 (0.0 %)
>  Default replication factor: 2
>  Average block replication: 2.0024862
>  Corrupt blocks: 0
>  Missing replicas: 133 (0.86162215 %)
>  Number of data-nodes: 3
>  Number of racks: 1
> FSCK ended at Fri Aug 09 14:01:47 IST 2013 in 266 milliseconds
>
>
> The filesystem under path '/' is HEALTHY
>
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
>
> *# sudo -u hdfs hadoop fs -count -q /*
>   2147483647      2147472547            none             inf         3203         7897        75245470999 /
>
>
>
> Thanks & Regards,
> *Yogini Gulkotwar*
> *Flutura Decision Sciences & Analytics, Bangalore*
> *Email*: [EMAIL PROTECTED]<[EMAIL PROTECTED]>
> *Website*: www.fluturasolutions.com
>
>