On 19.11.2012 at 15:27, "Kartashov, Andy" <[EMAIL PROTECTED]> wrote:
> I am learning that NN doesn’t persistently store block locations, only file names and their permissions as well as file blocks. It is said that locations come from DataNodes when NN starts.
> So, how does it work?
> Say we only have one file A.txt in our HDFS that is split into 4 blocks 1,2,3,4 (no replication), with blocks 1,2 residing on DN1 and blocks 3,4 on DN2.
> When we start NN it reads its metastore and tries to locate and map the locations of the 4 blocks of file A.txt?
When a NameNode starts, it does so in safe mode. As you said, it doesn't know where the blocks are. Each DataNode sends it a list of all of its local block IDs (a so-called block report). Once the NameNode knows the locations of most blocks (99.9%, a configurable threshold), it leaves safe mode and HDFS is available again.
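
To make that concrete for your A.txt example, here is a small toy model in Python. It is only a sketch of the idea, not Hadoop code: the class ToyNameNode, its methods, and the way the threshold is checked are all made up for illustration, and a real NameNode does more (for example, it accounts for replication and waits a little before actually leaving safe mode).

```python
# Toy model of NameNode startup. NOT Hadoop code; every name here is hypothetical.

class ToyNameNode:
    def __init__(self, namespace, threshold=0.999):
        # namespace: file name -> list of block IDs (the part that is persisted)
        self.namespace = namespace
        self.threshold = threshold
        # block ID -> set of DataNode names; empty after every restart
        self.locations = {}
        self.safe_mode = True

    def total_blocks(self):
        return sum(len(blocks) for blocks in self.namespace.values())

    def receive_block_report(self, datanode, block_ids):
        # Record where each known block lives, based on what the DataNode reports.
        known = {b for blocks in self.namespace.values() for b in blocks}
        for block_id in block_ids:
            if block_id in known:
                self.locations.setdefault(block_id, set()).add(datanode)
        # Leave safe mode once enough blocks have at least one known location.
        total = self.total_blocks()
        if total == 0 or len(self.locations) / total >= self.threshold:
            self.safe_mode = False


# A.txt has blocks 1-4; DN1 holds blocks 1,2 and DN2 holds blocks 3,4.
nn = ToyNameNode({"A.txt": [1, 2, 3, 4]})

nn.receive_block_report("DN1", [1, 2])
after_dn1 = nn.safe_mode   # only 2 of 4 blocks located so far

nn.receive_block_report("DN2", [3, 4])
after_dn2 = nn.safe_mode   # all 4 blocks located

print(after_dn1, after_dn2)
```

In this toy model the NameNode is still in safe mode after DN1 reports (only half of the blocks are located) and leaves it after DN2 reports. The point is that the block-to-DataNode map is rebuilt purely from the reports; the NameNode never goes looking for the blocks itself.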