Hadoop, mail # user - why not zookeeper for the namenode


why not zookeeper for the namenode
Thomas Koch 2010-02-19, 08:41
Hi,

Yesterday I read the documentation of ZooKeeper and its contrib project BookKeeper.
From what I read, I thought that BookKeeper would be the ideal enhancement
for the namenode, to make it distributed and therefore finally highly available.
I then searched to see whether work in that direction had already started, and found
that apparently a totally different approach has been chosen:
http://issues.apache.org/jira/browse/HADOOP-4539

Since I'm new to Hadoop, I trust your decision. However, I'd be glad if
somebody could satisfy my curiosity:

- Why wasn't ZooKeeper (with BookKeeper) chosen, especially since it
  seems to do a similar job already in HBase?

- Isn't it the case that with HADOOP-4539, clients can only connect to one
  namenode at a time, leaving the burden of all reads and writes on that one node?

- Wouldn't ZooKeeper be more network efficient? It requires only a
  majority of nodes to receive a change, while HADOOP-4539 seems to require
  all backup nodes to receive a change before it is persisted.
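
To make the last point concrete, here is a minimal sketch of the arithmetic behind it. It assumes that "majority" means a strict majority of replicas and that the HADOOP-4539 design really does wait for every backup node, which is only my reading of the issue. The function names are hypothetical and are not ZooKeeper or Hadoop APIs.

```python
# Hypothetical helpers for illustration only; not ZooKeeper or Hadoop APIs.

def acks_needed_majority(replicas: int) -> int:
    """Acknowledgements needed when a strict majority must receive a change."""
    return replicas // 2 + 1


def acks_needed_all(replicas: int) -> int:
    """Acknowledgements needed when every replica must receive a change."""
    return replicas


def failures_tolerated(replicas: int, acks_needed: int) -> int:
    """Replicas that may be unreachable while a change can still be persisted."""
    return replicas - acks_needed


if __name__ == "__main__":
    for n in (3, 5, 7):
        majority = acks_needed_majority(n)
        everyone = acks_needed_all(n)
        print(
            f"{n} replicas: majority waits for {majority} acks "
            f"(tolerates {failures_tolerated(n, majority)} down), "
            f"all-replica scheme waits for {everyone} acks "
            f"(tolerates {failures_tolerated(n, everyone)} down)"
        )
```

Under these assumptions, a majority scheme can persist a change while a minority of nodes is slow or down, whereas an all-replica scheme has to wait for the slowest node.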

Thanks for any explanation,

Thomas Koch, http://www.koch.ro