Re: Is it necessary to cache metadata in client side?
Per input stream means the cache can only be used within the scope of one
file. I think it would be better if there were a cache in DFSClient.
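
To make the idea concrete, here is a rough sketch of what a cache shared by
all streams of one client could look like. It is not Hadoop code: the class
name SharedBlockLocationCache, the nameNodeLookup function (a stand-in for
the real NameNode RPC), the String-based block locations, and the TTL-based
expiry are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/**
 * Hypothetical client-wide cache of block locations, keyed by file path.
 * Illustration only; this class does not exist in Hadoop.
 */
final class SharedBlockLocationCache {

    /** Cached locations plus the time they were loaded. */
    private static final class Entry {
        final List<String> locations;
        final long loadedAtMillis;

        Entry(List<String> locations, long loadedAtMillis) {
            this.locations = locations;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    // Assumed stand-in for the NameNode call that returns block locations.
    private final Function<String, List<String>> nameNodeLookup;
    private final long ttlMillis;
    private final AtomicInteger lookups = new AtomicInteger();

    SharedBlockLocationCache(Function<String, List<String>> nameNodeLookup,
                             long ttlMillis) {
        this.nameNodeLookup = nameNodeLookup;
        this.ttlMillis = ttlMillis;
    }

    /**
     * Returns cached locations if they are younger than the TTL; otherwise
     * asks the NameNode stand-in and caches the answer. Two threads that
     * miss at the same time may both perform the lookup, which is harmless
     * here because the last write simply wins.
     */
    List<String> getBlockLocations(String path) {
        long now = System.currentTimeMillis();
        Entry cached = entries.get(path);
        if (cached != null && now - cached.loadedAtMillis < ttlMillis) {
            return cached.locations;
        }
        lookups.incrementAndGet();
        List<String> fresh = nameNodeLookup.apply(path);
        entries.put(path, new Entry(fresh, now));
        return fresh;
    }

    /** Drops a path, e.g. after a read fails because the data is stale. */
    void invalidate(String path) {
        entries.remove(path);
    }

    /** Number of lookups that actually reached the NameNode stand-in. */
    int nameNodeLookups() {
        return lookups.get();
    }
}
```

With something like this, several tasks on one node that open the same
job.xml through one client would trigger a single lookup until the entry
expires. The open question is staleness: a shared cache needs some way to
notice that a file has changed or that blocks have moved, which is why the
sketch has both a TTL and an explicit invalidate.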

On Fri, Jun 11, 2010 at 5:02 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
> It is cached per input stream - see DFSInputStream.locatedBlocks,
> prefetchSize, etc.
>
> -Todd
> On Thu, Jun 10, 2010 at 11:43 PM, Jeff Zhang <[EMAIL PROTECTED]> wrote:
>>
>> Hi all,
>>
>> According to the GFS paper, GFS caches metadata in the client.
>> But when I checked the Hadoop source code, it seems that Hadoop does not
>> cache it on the client side. I just want to make sure I am right.
>> Is anyone working on this? One advantage of caching metadata on the
>> client side that I can think of: the TaskTracker fetches job.xml from
>> HDFS, and most of the time we run multiple tasks on one node, so if the
>> TaskTracker cached the metadata, it could reduce communication with the
>> NameNode.
>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

--
Best Regards

Jeff Zhang