HBase >> mail # user >> Bulk Loading DFS Space issue in Hbase


Bulk Loading DFS Space Issue in HBase
Hi,
I am trying to bulk load a 700 MB CSV file with 31 columns into HBase.

I have written a MapReduce program for this, but when I run it,
it consumes all the available disk space and fails.

Here is the status before running:

Configured Capacity : 116.16 GB
DFS Used : 13.28 GB
Non DFS Used : 61.41 GB
DFS Remaining : 41.47 GB
DFS Used% : 11.43 %
DFS Remaining% : 35.7 %
Live Nodes <http://rdcesx12078.race.sas.com:50070/dfsnodelist.jsp?whatNodes=LIVE> : 1
Dead Nodes <http://rdcesx12078.race.sas.com:50070/dfsnodelist.jsp?whatNodes=DEAD> : 0
Decommissioning Nodes <http://rdcesx12078.race.sas.com:50070/dfsnodelist.jsp?whatNodes=DECOMMISSIONING> : 0
Number of Under-Replicated Blocks : 68

After running:

Configured Capacity : 116.16 GB
DFS Used : 52.07 GB
Non DFS Used : 61.47 GB
DFS Remaining : 2.62 GB
DFS Used% : 44.83 %
DFS Remaining% : 2.26 %
Live Nodes <http://rdcesx12078.race.sas.com:50070/dfsnodelist.jsp?whatNodes=LIVE> : 1
Dead Nodes <http://rdcesx12078.race.sas.com:50070/dfsnodelist.jsp?whatNodes=DEAD> : 0
Decommissioning Nodes <http://rdcesx12078.race.sas.com:50070/dfsnodelist.jsp?whatNodes=DECOMMISSIONING> : 0
Number of Under-Replicated Blocks : 455

So what is taking up so much DFS space?

Has anybody come across this issue?

Even though map and reduce both complete 100%, the incremental
loading of the HFiles keeps demanding more space until the whole
drive is full.

That is 52 GB of DFS used for a 700 MB CSV file.
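
To put a number on the growth, here is a rough sketch that uses only the "DFS Used" figures reported above. The function names are made up for illustration, and the calculation assumes the 13.28 GB already in use before the job is unrelated to this load, so the job itself accounts for the difference rather than the full 52 GB.

```python
# Figures copied from the NameNode status pages quoted above.
DFS_USED_BEFORE_GB = 13.28
DFS_USED_AFTER_GB = 52.07
CSV_SIZE_MB = 700  # size of the input CSV file


def dfs_growth_gb(before_gb, after_gb):
    """DFS space consumed between the two status snapshots, in GB."""
    return after_gb - before_gb


def expansion_factor(growth_gb, input_mb):
    """How many times larger the DFS growth is than the input file."""
    return growth_gb * 1024 / input_mb


growth = dfs_growth_gb(DFS_USED_BEFORE_GB, DFS_USED_AFTER_GB)
factor = expansion_factor(growth, CSV_SIZE_MB)
print(f"DFS growth: {growth:.2f} GB, about {factor:.0f}x the input size")
```

So the job appears to have added roughly 38.8 GB of DFS data, which is still well over 50 times the size of the input file.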


--
Thanks and regards,
Vikas Jadhav