Accumulo >> mail # user >> Hardware failure and data protection


Aji Janis 2012-08-13, 16:31
Re: Hardware failure and data protection
On Mon, Aug 13, 2012 at 12:31 PM, Aji Janis <[EMAIL PROTECTED]> wrote:

> I am very new to Hadoop and Accumulo. I need some information on how data
> is backed up or guaranteed against system failures (if it is). I am
> considering setting up a Hadoop cluster consisting of 5 nodes where each
> node has 3 internal hard drives. I understand HDFS has a configurable
> redundancy feature but what happens if an entire drive crashes (physically)
> for whatever reason? How does Hadoop recover, if it can, from this
> situation? More specifically, I am assuming Accumulo uses HDFS redundancy
> to make backups of the data.
>
> One, is this assumption true?
>

Yes, Accumulo uses HDFS replication to preserve data in the presence of
failures.  HDFS stores N exact copies of each data block, with each copy
being stored on a different server.  If a drive crashes, HDFS notices that
blocks are under-replicated, and copies those blocks to an available
drive.  Thus the data can survive N-1 simultaneous failures.
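
To make the recovery behavior concrete, here is a small toy simulation in
Python. It is only a sketch under stated assumptions, not how HDFS is actually
implemented: real HDFS placement is rack-aware and is driven by the NameNode,
and every function name below is hypothetical. It models the 5-node, 3-drive
cluster from the question with N = 3, fails one drive, and then copies the
under-replicated blocks to healthy drives on servers that do not already hold
a copy.

```python
import random


def place_blocks(num_blocks, nodes, drives_per_node, replication, rng):
    """Place each block on `replication` distinct nodes, one drive per node."""
    placement = {}
    for block in range(num_blocks):
        chosen_nodes = rng.sample(range(nodes), replication)
        placement[block] = {(n, rng.randrange(drives_per_node)) for n in chosen_nodes}
    return placement


def fail_drive(placement, drive):
    """Drop every replica stored on the failed (node, drive) pair."""
    for replicas in placement.values():
        replicas.discard(drive)


def under_replicated(placement, replication):
    """Return the blocks that currently have fewer than `replication` copies."""
    return [b for b, replicas in placement.items() if len(replicas) < replication]


def re_replicate(placement, nodes, drives_per_node, replication, failed, rng):
    """Copy under-replicated blocks to healthy drives on nodes lacking a copy."""
    for block in under_replicated(placement, replication):
        replicas = placement[block]
        if not replicas:
            continue  # every copy was lost, so there is nothing left to copy from
        while len(replicas) < replication:
            used_nodes = {n for n, _ in replicas}
            candidates = [
                (n, d)
                for n in range(nodes)
                for d in range(drives_per_node)
                if n not in used_nodes and (n, d) not in failed
            ]
            if not candidates:
                break  # no eligible drive is left; the block stays under-replicated
            replicas.add(rng.choice(candidates))


# Toy cluster from the question: 5 nodes, 3 drives each, 3 copies per block.
rng = random.Random(0)
placement = place_blocks(100, nodes=5, drives_per_node=3, replication=3, rng=rng)

failed = {(2, 1)}  # drive 1 on node 2 dies
fail_drive(placement, (2, 1))
print("under-replicated after failure:", len(under_replicated(placement, 3)))

re_replicate(placement, 5, 3, 3, failed, rng)
print("under-replicated after recovery:", len(under_replicated(placement, 3)))
```

In this model a single drive failure removes at most one copy of any block, so
two copies always survive to re-replicate from. A block is lost only if all N
of its copies fail before the copying completes.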

> Two, if I had a copy of the hard drive and I duplicate that to a new drive
> and pop it in where the old/crashed drive used to be would this work?
>

Since that drive's data would have been replicated to other drives, you
should not need a copy of it.  You should just be able to put in a fresh
hard drive, and HDFS will start using it.

Billie
>
> I apologize if this is a really stupid question. But I highly appreciate
> any help, pointers and suggestions! Thanks in advance.
>