The namenode is already a serious bottleneck for meta-data updates. If you
allow some of the block map or meta-data to page out to disk, then the
bottleneck is going to get much worse.

The only way to avoid this is to make the data much more cacheable and to
have a viable cache coherency strategy. Cache coherency at the meta-data
level is difficult. Cache coherency at the block level is also difficult
(though less so) because many blocks get moved for balancing purposes.

The MapR approach is a useful counterexample here, since the architecture
was specifically designed so that the only centralized data can be cached
indefinitely, because coherency can be checked on access. This dramatically
increases how widely the location information is distributed, which in turn
makes the centralized copy much smaller and more pageable. The virtuous
cycle continues by making the distributed resources read/write, so that
meta-data needn't be centralized.
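
To make "coherency can be checked on access" concrete, here is a minimal
Python sketch. It is not MapR's actual code or API. Every name in it
(Directory, StorageNode, Client, the epoch counter, move) is hypothetical and
chosen only for illustration. The assumption is that each stored unit carries
an epoch that changes when it moves, so a client can keep a cached location
forever and only goes back to the central map when a storage node rejects it.

```python
class StaleLocation(Exception):
    """Raised by a storage node when the caller's cached epoch is out of date."""


class StorageNode:
    """Hypothetical storage node holding container_id -> (epoch, data)."""

    def __init__(self, name):
        self.name = name
        self.containers = {}

    def read(self, container_id, epoch):
        held = self.containers.get(container_id)
        # Coherency is checked here, on access, not by central invalidation.
        if held is None or held[0] != epoch:
            raise StaleLocation(container_id)
        return held[1]


class Directory:
    """Small central map: container_id -> (node, epoch)."""

    def __init__(self):
        self.locations = {}
        self.lookups = 0  # counts how often clients hit the central map

    def lookup(self, container_id):
        self.lookups += 1
        return self.locations[container_id]


class Client:
    """Caches locations indefinitely and refreshes only after a rejection."""

    def __init__(self, directory):
        self.directory = directory
        self.cache = {}

    def read(self, container_id):
        for _ in range(2):  # one try with the cache, one after a refresh
            if container_id not in self.cache:
                self.cache[container_id] = self.directory.lookup(container_id)
            node, epoch = self.cache[container_id]
            try:
                return node.read(container_id, epoch)
            except StaleLocation:
                del self.cache[container_id]
        raise StaleLocation(container_id)


def move(directory, container_id, src, dst):
    """Relocate a container (e.g. for balancing) and bump its epoch."""
    epoch, data = src.containers.pop(container_id)
    dst.containers[container_id] = (epoch + 1, data)
    directory.locations[container_id] = (dst, epoch + 1)
```

In this sketch, repeated reads never touch the central map, and a move costs
the client exactly one extra lookup the next time it reads. That is the
property that lets the central copy stay small.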

It is very hard for me to see how to migrate the current HDFS architecture
incrementally to something that admits paging of data to disk.
The problem is that there are logical circularities with the current
approach that force either the current design or a major rebuild from the

On Mon, Sep 5, 2011 at 9:29 AM, Sesha Kumar <[EMAIL PROTECTED]> wrote:
> 1. The namenode stores block maps for all blocks in its main memory. This
> can be used to keep an up-to-date snapshot of the whole filesystem. But I
> feel that the block map is not constant data, so storing it in main
> memory all the time could be avoided in order to save memory. On a
> request for a file from the client, the block map details can be fetched.