-
Question about HDFS Architecture
Harold Lim 2009-08-20, 22:44
To read/get a file, I understand that a client first contacts the namenode to determine which datanode has the file/block. Then, it contacts the datanode for the actual file.
Does the client cache this information, or does it always talk to the namenode first?
Also, if a file has multiple replicas stored on multiple datanodes on the same "rack", how does the namenode pick which datanode the client has to talk to? In this case, all datanodes are homogeneous, which makes the "rack-awareness" unimportant to the decision making.
Thanks, Harold
-
Re: Question about HDFS Architecture
Aaron Kimball 2009-08-21, 07:36
On Thu, Aug 20, 2009 at 3:44 PM, Harold Lim <[EMAIL PROTECTED]> wrote:
> To read/get a file, I understand that a client first contacts the namenode to determine which datanode has the file/block. Then, it contacts the datanode for the actual file.
>
> Does the client cache this information, or does it always talk to the namenode first?
The latter.
> Also, if a file has multiple replicas stored on multiple datanodes on the same "rack", how does the namenode pick which datanode the client has to talk to? In this case, all datanodes are homogeneous, which makes the "rack-awareness" unimportant to the decision making.
I believe the datanode itself picks. In the absence of rack information, its choice is random (unless one is localhost, in which case that one gets used).
- Aaron
-
Re: Question about HDFS Architecture
Harold Lim 2009-08-21, 13:42
> > Also, if a file has multiple replicas stored on multiple datanodes on the same "rack", how does the namenode pick which datanode the client has to talk to? In this case, all datanodes are homogeneous, which makes the "rack-awareness" unimportant to the decision making.
> I believe the datanode itself picks. In the absence of rack information, its choice is random (unless one is localhost, in which case that one gets used).
Hi Aaron,
Did you mean the client itself picks? So the namenode simply gives the list of datanodes that have the file, and the client picks from this list?
Thanks, Harold
-
Re: Question about HDFS Architecture
Aaron Kimball 2009-08-25, 00:55
yes
-
Re: Question about HDFS Architecture
Konstantin Shvachko 2009-08-25, 01:40
Harold,
Both answers by Aaron were incorrect.
> Does the client cache this information, or does it always talk to the namenode first?
Yes, the client caches replica locations received from the name-node. On open() it receives the locations of the first 10 blocks of the file. In most cases that covers every block of the file. If not, the client fetches the next batch of block locations when it needs them and caches those as well.
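A minimal sketch of the batched caching described above, written as a toy Python model rather than Hadoop's actual client code. The class names, the `get_block_locations(first_block, count)` signature, and the host names are all hypothetical; only the batch size of 10 comes from the message.

```python
PREFETCH_BLOCKS = 10  # batch size mentioned in the message above


class ToyNameNode:
    """Hypothetical stand-in for the name-node: block index -> replica hosts."""

    def __init__(self, block_locations):
        self.block_locations = block_locations  # list of lists of host names
        self.calls = 0  # counts location requests, to show the effect of caching

    def get_block_locations(self, first_block, count):
        self.calls += 1
        last = min(first_block + count, len(self.block_locations))
        return {i: self.block_locations[i] for i in range(first_block, last)}


class ToyInputStream:
    """Caches locations for one open stream; fetches another batch on a miss."""

    def __init__(self, namenode):
        self.namenode = namenode
        # "open()": fetch the locations of the first batch of blocks
        self.cache = namenode.get_block_locations(0, PREFETCH_BLOCKS)

    def locations_for(self, block_index):
        if block_index not in self.cache:
            # miss: ask the name-node for the next batch and cache it too
            self.cache.update(
                self.namenode.get_block_locations(block_index, PREFETCH_BLOCKS)
            )
        return self.cache[block_index]


namenode = ToyNameNode([[f"dn{i % 3}"] for i in range(25)])
stream = ToyInputStream(namenode)
stream.locations_for(3)   # served from the batch fetched at open
stream.locations_for(12)  # miss: fetches blocks 10..19
stream.locations_for(15)  # served from the cache
print(namenode.calls)     # 2: one call at open, one for the second batch
```

For a file of 10 blocks or fewer, the model contacts the name-node only once, at open.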
> Also, if a file has multiple replicas stored on multiple datanodes on the same "rack", how does the namenode pick which datanode the client has to talk to?
The name-node returns block locations ordered by proximity to the client, and the client always contacts data-nodes in this order. The client cannot make proximity decisions itself because it has no knowledge of the cluster topology. If all replicas are on the same rack but none is local to the client, the ordering returned by the name-node is arbitrary. This happens mostly when network topology is not configured; otherwise replicas should be distributed across racks, with 3 replicas on at least 2 racks.
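To make the ordering concrete, here is a rough Python sketch of a name-node sorting replicas by proximity. It is a toy model, not Hadoop's implementation: the distance values (0 for the same host, 2 for the same rack, 4 for a different rack), the function names, and the host and rack names are assumptions chosen for illustration.

```python
def distance(client, node, rack_of):
    """Toy proximity metric: 0 = same host, 2 = same rack, 4 = different rack."""
    if client == node:
        return 0
    if client in rack_of and rack_of.get(client) == rack_of.get(node):
        return 2
    return 4


def order_replicas(client, replicas, rack_of):
    """Return replicas nearest-first, the order the client would try them in."""
    # sorted() is stable, so replicas at equal distance keep their incoming
    # (effectively arbitrary) order
    return sorted(replicas, key=lambda node: distance(client, node, rack_of))


def spans_two_racks(replicas, rack_of):
    """Check the placement rule of thumb: replicas cover at least 2 racks."""
    return len({rack_of[r] for r in replicas}) >= 2


rack_of = {"dn1": "/rack1", "dn2": "/rack1", "dn3": "/rack2", "dn4": "/rack2"}
print(order_replicas("dn2", ["dn3", "dn1", "dn2"], rack_of))  # ['dn2', 'dn1', 'dn3']
print(spans_two_racks(["dn1", "dn2", "dn3"], rack_of))        # True
```

When every replica is equally distant from the client, as in the same-rack case asked about, the sort has nothing to distinguish them and the result is whatever order the replicas arrived in.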
Thanks
--Konstantin
-
Re: Question about HDFS Architecture
Harold Lim 2009-08-25, 02:00
Hi Konstantin,
How long does the client keep the info in its cache? Or does it keep using the info until it becomes invalid (e.g., it contacts a data node that no longer has that particular file)?
Thanks, Harold
-
Re: Question about HDFS Architecture
Konstantin Shvachko 2009-08-25, 02:07
Yes, the client continues to use the info until it becomes invalid. After that, it contacts the name-node and updates the cache.
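The use-until-invalid behaviour can be sketched roughly as follows. This is a toy Python model under stated assumptions, not the HDFS client: `StaleLocation`, `read_block`, and the two callback parameters are hypothetical names standing in for the real name-node and data-node calls.

```python
class StaleLocation(Exception):
    """Raised by the toy data-node when it no longer holds the requested block."""


def read_block(block_id, cache, fetch_locations, read_from):
    """Read via cached locations; refresh from the name-node once if they fail.

    fetch_locations(block_id) -> list of hosts   (hypothetical name-node call)
    read_from(host, block_id) -> bytes           (hypothetical data-node call)
    """
    for _ in range(2):  # first pass uses the cache, second pass a fresh lookup
        if block_id not in cache:
            cache[block_id] = fetch_locations(block_id)
        for host in cache[block_id]:
            try:
                return read_from(host, block_id)
            except StaleLocation:
                continue  # try the next replica in the cached order
        # every cached replica failed: drop the entry and ask the name-node again
        del cache[block_id]
    raise IOError(f"no live replica for block {block_id}")
```

In this model the name-node is contacted only when the cache has no entry for the block or every cached replica has turned out to be stale.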
--Konstantin
-
Re: Question about HDFS Architecture
Todd Lipcon 2009-08-25, 04:43
On Mon, Aug 24, 2009 at 6:40 PM, Konstantin Shvachko <[EMAIL PROTECTED]> wrote:
> Harold,
>
> Both answers by Aaron were incorrect.
>
> > Does the client cache this information, or does it always talk to the namenode first?
>
> Yes, the client caches replica locations received from the name-node. On open() it receives locations of the first 10 blocks of the file. In most cases these are all file blocks. If not then the client will get another portion of blocks when needed, and will also cache them.
This is only within a single DFSInputStream. The block location cache does not persist across re-opens of the same file. As I read the original question, it was about longer-term caching, not just keeping state during a single DFSInputStream.
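As a rough illustration of that scoping, the toy Python model below ties the cached locations to the stream object, so every re-open costs a fresh name-node lookup. The class names, the path, and the replica hosts are made up for the example; this is not the real DFSInputStream.

```python
class CountingNameNode:
    """Hypothetical name-node stub that counts location lookups."""

    def __init__(self):
        self.lookups = 0

    def get_block_locations(self, path):
        self.lookups += 1
        return ["dn1", "dn2", "dn3"]  # made-up replica hosts


class ToyStream:
    """The cached locations live on the stream object and die with it."""

    def __init__(self, namenode, path):
        self.locations = namenode.get_block_locations(path)  # one lookup per open

    def read(self):
        return self.locations[0]  # reads reuse the stream's own cached locations


name_node = CountingNameNode()
first = ToyStream(name_node, "/data/example")
first.read()
first.read()
print(name_node.lookups)  # 1: repeated reads on one stream reuse its cache

second = ToyStream(name_node, "/data/example")  # re-open of the same file
second.read()
print(name_node.lookups)  # 2: the new stream had to ask the name-node again
```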
-Todd
-
Re: Question about HDFS Architecture
Harold Lim 2009-08-25, 04:57
Hi Todd,
Yes. My question is about multiple re-opens. For example, I have an application that reads/fetches a file depending on what a user chooses. So, in this case, there is no location caching?
Thanks, Harold
-
Re: Question about HDFS Architecture
Todd Lipcon 2009-08-25, 05:21
On Mon, Aug 24, 2009 at 9:57 PM, Harold Lim <[EMAIL PROTECTED]> wrote:
> Hi Todd,
>
> Yes. My question is about multiple re-opens. For example, I have an application that reads/fetches a file depending on what a user chooses. So, in this case, there is no location caching?
Correct. But the getBlockLocations call is very fast - it only hits the namenode, and the namenode has the data in RAM.
-Todd
-
Re: Question about HDFS Architecture
Aaron Kimball 2009-08-28, 22:24
oops :)
On Mon, Aug 24, 2009 at 6:40 PM, Konstantin Shvachko <[EMAIL PROTECTED]> wrote:
> Both answers by Aaron were incorrect.