

Pankaj Misra 2012-12-13, 09:50
RE: Erasure Coding in HDFS
Requesting the community's help with the questions below, as the answers will help us better understand erasure coding at the HDFS level in the context of replication. Thanks.

Thanks and Regards
Pankaj Misra
________________________________
From: Pankaj Misra
Sent: Thursday, December 13, 2012 3:20 PM
To: [EMAIL PROTECTED]
Subject: Erasure Coding in HDFS

Dear All,

I have been looking at options for reducing the overall storage cost incurred by replicating data across the datanodes for higher availability and for data locality during processing.

I came across a few articles suggesting erasure coding (software RAID) as one such mechanism, which can provide up to five to eight 9s of availability while keeping the replication factor low.
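To make the storage-versus-availability trade-off concrete, here is a minimal sketch comparing plain replication with an erasure-coded layout. The parameters are assumptions chosen for illustration only and do not come from the articles or the JIRA: 3-way replication, a Reed-Solomon style layout with 10 data blocks and 4 parity blocks, and independent block failures with probability 0.01. The function names are hypothetical.

```python
import math


def replication_overhead(replicas: int) -> float:
    """Raw bytes stored per byte of user data under n-way replication."""
    return float(replicas)


def erasure_overhead(data_blocks: int, parity_blocks: int) -> float:
    """Raw bytes stored per byte of user data for a (data, parity) code."""
    return (data_blocks + parity_blocks) / data_blocks


def loss_probability(total: int, tolerated: int, p_fail: float) -> float:
    """Probability that more than `tolerated` of `total` blocks are lost.

    Assumes each block fails independently with probability `p_fail`,
    which is a simplification of real cluster behavior.
    """
    return sum(
        math.comb(total, k) * p_fail**k * (1.0 - p_fail) ** (total - k)
        for k in range(tolerated + 1, total + 1)
    )


def nines(loss: float) -> float:
    """Express a loss probability as a count of 9s (0.001 -> 3.0)."""
    return -math.log10(loss)


if __name__ == "__main__":
    p = 0.01  # assumed per-block failure probability, illustration only

    # 3-way replication: data survives unless all 3 copies are lost.
    rep_loss = loss_probability(total=3, tolerated=2, p_fail=p)
    # Assumed (10, 4) code: any 4 of the 14 blocks may be lost.
    ec_loss = loss_probability(total=14, tolerated=4, p_fail=p)

    print(f"replication x3: overhead {replication_overhead(3):.1f}x, "
          f"about {nines(rep_loss):.1f} nines")
    print(f"erasure (10,4): overhead {erasure_overhead(10, 4):.1f}x, "
          f"about {nines(ec_loss):.1f} nines")
```

Under these assumptions the erasure-coded layout stores 1.4 bytes per byte of user data instead of 3, while tolerating the loss of more blocks. The model ignores correlated failures and repair time, so the resulting figures are only indicative.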

I also came across a JIRA for erasure coding in HDFS
https://issues.apache.org/jira/browse/HDFS-503

I would appreciate some help in understanding the following:
1. How can I use erasure coding with the Hadoop 1.1.1 release?
2. How will erasure coding work with the replication mechanism, and how will it affect data locality for data processing, since erasure coding fragments the data?
3. How mature is the current implementation of erasure coding in HDFS?

Any help will be greatly appreciated.

Thanks and Regards
Pankaj Misra
________________________________
NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
Pankaj Misra 2012-12-17, 09:44