HDFS >> mail # dev >> [VOTE -- Round 2] Commit hdfs-630 to 0.21?


Re: [VOTE -- Round 2] Commit hdfs-630 to 0.21?
Stack wrote:

I'm being 0 on this

-I would worry if the exclusion list were used by the NN to do its
blacklisting; I'm glad to see this isn't happening. Yes, you could pick
up datanode failure faster, but you would also be vulnerable to a user
doing a DoS against the cluster by reporting every DN as failing.

-Russ Perry's work on high-speed Hadoop rendering [1] tweaked Hadoop to
allow the datanodes to get the entire list of nodes holding the data,
and allowed them to make their own decision about where to get the data
from. This
  1. pushed the policy of handling failure down to the clients, so there
is less need to talk to the NN about it.
  2. lets you do something very fancy where you deliberately choose data
from different DNs, so that you can then pull data off the cluster at
the full bandwidth of every disk
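
To make the two points above concrete, here is a minimal sketch of
client-side replica choice. It is not taken from Russ's work or from the
HDFS-630 patch; the class ReplicaSelector and its methods are
hypothetical names, and datanodes are plain strings. It assumes the
client already holds the full list of nodes for each block, keeps its
own exclusion set without reporting it to the NN, and spreads reads over
the least-used holders.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical client-side replica chooser; not part of the HDFS API.
class ReplicaSelector {
    // Datanodes this client has seen fail. Kept local, never sent to the NN.
    private final Set<String> excluded = new HashSet<>();
    // Number of block reads this client has assigned to each datanode so far.
    private final Map<String, Integer> assigned = new HashMap<>();

    // Record a failure observed by this client only.
    void markFailed(String datanode) {
        excluded.add(datanode);
    }

    // Pick the non-excluded holder with the fewest reads assigned so far;
    // ties go to the earlier entry in the list. Returns null if every
    // holder of the block has been excluded.
    String choose(List<String> holders) {
        String best = null;
        int bestLoad = Integer.MAX_VALUE;
        for (String dn : holders) {
            if (excluded.contains(dn)) {
                continue;
            }
            int load = assigned.getOrDefault(dn, 0);
            if (load < bestLoad) {
                best = dn;
                bestLoad = load;
            }
        }
        if (best != null) {
            assigned.merge(best, 1, Integer::sum);
        }
        return best;
    }
}
```

Calling choose() once per block with that block's holder list sends
consecutive blocks to different DNs where the replica placement allows
it, and a DN passed to markFailed() is skipped by this client alone.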

Long term, I would like to see Russ's addition go in, so I worry whether
the HDFS-630 patch would be useful long term. Maybe it's a more
fundamental issue: where does the decision making go, into the clients
or into the NN?

-steve

[1] http://www.hpl.hp.com/techreports/2009/HPL-2009-345.html