Estimating disk space requirements

I am trying to estimate how much disk space I need for my cluster.

I have approximately 24 million JSON documents of about 5 KB each.
The JSON is to be stored in HBase, with some identifying data in columns,
and I also want to store the JSON itself for later retrieval, using the ID data
as keys in HBase.
My HDFS replication factor is set to 3.
Each node has Hadoop, HBase, and Ubuntu installed on it, so approximately 11 GB
is available for use on each 20 GB node.

I have not enabled HBase replication. Is HDFS replication alone enough
to keep the data safe and redundant?
And how much total disk space will I need to store the data?
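
As a rough starting point, the raw arithmetic for the figures above can be sketched as follows. This is only a back-of-the-envelope calculation: it assumes "5 KB" means 5,000 bytes and "11 GB" means 11 * 10^9 bytes, and the function name and the overhead_factor parameter are hypothetical placeholders, not anything provided by Hadoop or HBase. It ignores HBase per-cell key overhead, compaction scratch space, and compression, so the real requirement may differ noticeably.

```python
import math

GB = 10**9  # decimal gigabyte (assumption; the post does not say GB vs GiB)


def estimate_disk(num_docs, doc_size_bytes, replication,
                  usable_per_node_bytes, overhead_factor=1.0):
    """Return (raw_bytes, stored_bytes, min_nodes) for a simple estimate.

    overhead_factor is a hypothetical multiplier for storage overhead
    (for example HBase key/metadata overhead); 1.0 means "none assumed".
    """
    raw = num_docs * doc_size_bytes
    # HDFS keeps `replication` copies of every block.
    stored = raw * overhead_factor * replication
    # Minimum node count if every usable byte could be filled.
    min_nodes = math.ceil(stored / usable_per_node_bytes)
    return raw, stored, min_nodes


# Figures from the post: 24 million docs, ~5 KB each, replication 3,
# ~11 GB usable per node.
raw, stored, min_nodes = estimate_disk(24_000_000, 5_000, 3, 11 * GB)
print(f"raw data:        {raw / GB:.0f} GB")
print(f"with replication: {stored / GB:.0f} GB")
print(f"minimum nodes:    {min_nodes}")
```

Raising overhead_factor above 1.0 is one way to leave headroom for whatever overhead is measured on a small test load.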

Please help me estimate this.

Thank you so much.

Ouch Whisper