Re: Is it necessary to cache metadata in client side?
Jeff Zhang 2010-06-11, 09:17
Caching per input stream means the cache can only be used within the
scope of one file. I think it would be better if there were a cache in
DFSClient.
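To make the idea concrete, here is a rough sketch of a cache that lives
at the client level, so every stream opened through the same client could
share block locations. This is only an illustration: ClientBlockCache,
getBlockLocations, invalidate and the lookup function are hypothetical
names, not existing Hadoop APIs, and the datanode addresses are made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch; no class with this name exists in Hadoop.
final class ClientBlockCache {
    // path -> block locations, shared by all streams of one client
    private final Map<String, List<String>> locationsByPath =
            new ConcurrentHashMap<>();
    // Stand-in for the RPC to the namenode (an assumption for this sketch).
    private final Function<String, List<String>> nameNodeLookup;

    ClientBlockCache(Function<String, List<String>> nameNodeLookup) {
        this.nameNodeLookup = nameNodeLookup;
    }

    // Asks the namenode only on the first request for a given path.
    List<String> getBlockLocations(String path) {
        return locationsByPath.computeIfAbsent(path, nameNodeLookup);
    }

    // Drops a cached entry, e.g. after a read from a stale location fails.
    void invalidate(String path) {
        locationsByPath.remove(path);
    }

    public static void main(String[] args) {
        AtomicInteger rpcCalls = new AtomicInteger();
        ClientBlockCache cache = new ClientBlockCache(path -> {
            rpcCalls.incrementAndGet();
            return Arrays.asList("datanode1:50010", "datanode2:50010");
        });
        // Two tasks on the same node asking for the same file.
        cache.getBlockLocations("/jobs/job.xml");
        cache.getBlockLocations("/jobs/job.xml");
        System.out.println("namenode lookups: " + rpcCalls.get());
    }
}
```

A real version would also need some expiry or invalidation policy, since
block locations can change after they have been cached.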
On Fri, Jun 11, 2010 at 5:02 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
> It is cached per input stream - see DFSInputStream.locatedBlocks,
> prefetchSize, etc.
> On Thu, Jun 10, 2010 at 11:43 PM, Jeff Zhang <[EMAIL PROTECTED]> wrote:
>> Hi all,
>> According to the GFS paper, GFS caches metadata in the client.
>> But when I checked the Hadoop source code, it seems that Hadoop
>> doesn't cache it on the client side. I just want to make sure I am
>> right, and I am wondering whether anyone is working on it. One
>> advantage of caching metadata on the client side that I can think of:
>> the TaskTracker fetches job.xml from HDFS, and most of the time we
>> run multiple tasks on one node, so if the TaskTracker cached the
>> metadata, it could reduce communication with the NameNode.
>> Best Regards
>> Jeff Zhang
> Todd Lipcon
> Software Engineer, Cloudera