HDFS, mail # user - Question about HDFS Architecture


Re: Question about HDFS Architecture
Harold Lim 2009-08-25, 04:57
Hi Todd,

Yes. My question is about multiple re-opens. For example, I have an application that reads/fetches a file depending on what a user chooses. So, in this case, there is no location caching?

Thanks,
Harold
--- On Tue, 8/25/09, Todd Lipcon <[EMAIL PROTECTED]> wrote:

> From: Todd Lipcon <[EMAIL PROTECTED]>
> Subject: Re: Question about HDFS Architecture
> To: [EMAIL PROTECTED]
> Date: Tuesday, August 25, 2009, 12:43 AM
> On Mon, Aug 24, 2009 at 6:40 PM, Konstantin Shvachko <[EMAIL PROTECTED]> wrote:
>
> > Harold,
> >
> > Both answers by Aaron were incorrect.
> >
> > > Does the client cache this information, or does it always talk to the namenode first?
> >
> > Yes, the client caches replica locations received from the name-node.
> > On open() it receives locations of the first 10 blocks of the file.
> > In most cases these are all file blocks. If not then the client will
> > get another portion of blocks when needed, and will also cache them.
>
> This is only within a single DFSInputStream. The block location cache
> does not persist across re-opens of the same file. As I read the
> original question, it was about longer-term caching, not just keeping
> state during a single DFSInputStream.
>
> -Todd
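
To make the behaviour described in the quoted replies concrete, here is a rough toy model in Java. It is not the Hadoop client: ToyNameNode, ToyInputStream and the "datanode-N" names are hypothetical stand-ins made up for illustration. Only two details are taken from the thread: locations arrive in portions of 10 blocks, and the cache belongs to a single stream object, so it is gone after a re-open.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the name-node; it only counts location requests. */
class ToyNameNode {
    static final int BATCH = 10; // blocks per request, as stated in the thread
    private final int totalBlocks;
    int requests = 0;

    ToyNameNode(int totalBlocks) {
        this.totalBlocks = totalBlocks;
    }

    /** Returns locations for up to BATCH blocks starting at firstBlock. */
    Map<Integer, String> getBlockLocations(int firstBlock) {
        requests++;
        Map<Integer, String> batch = new HashMap<>();
        int end = Math.min(firstBlock + BATCH, totalBlocks);
        for (int b = firstBlock; b < end; b++) {
            batch.put(b, "datanode-" + (b % 3)); // made-up replica location
        }
        return batch;
    }
}

/** Hypothetical stand-in for one open stream; the cache is a field of this object. */
class ToyInputStream {
    private final ToyNameNode nameNode;
    private final Map<Integer, String> cache = new HashMap<>();

    ToyInputStream(ToyNameNode nameNode) {
        this.nameNode = nameNode;
        // "open": fetch the first portion of block locations
        cache.putAll(nameNode.getBlockLocations(0));
    }

    /** Looks up a block location, asking the name-node only on a cache miss. */
    String locate(int block) {
        if (!cache.containsKey(block)) {
            cache.putAll(nameNode.getBlockLocations(block));
        }
        return cache.get(block);
    }
}

class LocationCacheDemo {
    public static void main(String[] args) {
        ToyNameNode nn = new ToyNameNode(25);

        ToyInputStream first = new ToyInputStream(nn); // request 1 (blocks 0-9)
        first.locate(3);   // served from this stream's cache
        first.locate(12);  // miss: request 2 (blocks 12-21)
        first.locate(15);  // served from this stream's cache

        ToyInputStream second = new ToyInputStream(nn); // re-open: request 3
        second.locate(3);  // served from the new stream's own cache

        System.out.println("name-node requests: " + nn.requests); // prints 3
    }
}
```

In this model the second stream starts with an empty cache, which is the situation in Harold's use case. An application that re-opens the same file often would have to keep the stream open, or hold on to the locations itself, to avoid the repeated request.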