HBase >> mail # user >> Re: HDFS disk space requirements


Re: HDFS disk space requirements
What is the reason to have the replication factor set to 5?
Change it to 3 and you will save 40% of the space.
Also, you can load your JSON data into a separate directory with replication set to 1, since it is only the source data and will be removed after processing.
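Taking the numbers from the question below (115 GB of raw JSON, 130 GB of HDFS space available), a quick back-of-the-envelope check shows how replication affects the fit; this is only a rough sketch, since it ignores HBase storage overhead and intermediate files:

```python
# Rough HDFS space estimate: on-disk usage ~= logical size * replication factor.
raw_gb = 115        # JSON source data size from the question
available_gb = 130  # HDFS space available from the question

for repl in (5, 3, 1):
    needed = raw_gb * repl
    verdict = "fits" if needed <= available_gb else "does not fit"
    print(f"replication={repl}: need ~{needed} GB, {verdict} in {available_gb} GB")
```

With replication 5 the raw load alone needs ~575 GB, so 130 GB is nowhere near enough; even at replication 3 it needs ~345 GB. Only a replication-1 staging area for the source files fits in the available space.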

Thank you!

Sincerely,
Leonid Fedotov
On Jan 10, 2013, at 7:07 PM, Panshul Whisper wrote:

> Hello,
>
> I have a 5-node Hadoop cluster and a fully distributed HBase setup on the
> cluster with 130 GB of HDFS space available. HDFS replication is set to 5.
>
> I have a total of 115 GB of JSON files that need to be loaded into the
> HBase database and then processed.
>
> So is the available HDFS space sufficient for these operations, considering
> the replication and all other factors?
> Or should I increase the space, and by how much?
>
> Thanking You,
>
> --
> Regards,
> Ouch Whisper
> 010101010101
