HDFS >> mail # user >> Is it necessary to cache metadata in client side?


Re: Is it necessary to cache metadata in client side?
Per input stream means the cache can only be used within the scope of one
file. I think it would be better if there were a cache in DFSClient.
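
To make the idea concrete, here is a rough sketch of what I mean by a
client-wide cache shared by all streams. It is not the actual DFSClient code:
ClientLocationCache, CachedLocations, and the fetchFromNameNode callback are
hypothetical names, and block locations are simplified to a list of datanode
names. Since locations can change over time, the sketch assumes a TTL plus an
LRU bound rather than caching forever.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical client-wide cache of block locations, keyed by file path. */
class ClientLocationCache {
    /** Cached locations plus the time they were fetched. */
    private static final class CachedLocations {
        final List<String> locations;
        final long fetchedAtMillis;

        CachedLocations(List<String> locations, long fetchedAtMillis) {
            this.locations = locations;
            this.fetchedAtMillis = fetchedAtMillis;
        }
    }

    private final long ttlMillis;
    private final Map<String, CachedLocations> entries;

    ClientLocationCache(final int maxEntries, long ttlMillis) {
        this.ttlMillis = ttlMillis;
        // Access-ordered map: the least recently used path is evicted first.
        this.entries = new LinkedHashMap<String, CachedLocations>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, CachedLocations> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /**
     * Returns cached locations for the path if they are younger than the TTL;
     * otherwise calls fetchFromNameNode (a stand-in for the namenode RPC).
     */
    synchronized List<String> get(String path, long nowMillis,
                                  Function<String, List<String>> fetchFromNameNode) {
        CachedLocations cached = entries.get(path);
        if (cached != null && nowMillis - cached.fetchedAtMillis < ttlMillis) {
            return cached.locations;
        }
        List<String> fresh = fetchFromNameNode.apply(path);
        entries.put(path, new CachedLocations(fresh, nowMillis));
        return fresh;
    }

    public static void main(String[] args) {
        AtomicInteger nameNodeCalls = new AtomicInteger();
        // Fake lookup that counts how often the "namenode" is contacted.
        Function<String, List<String>> fetch = path -> {
            nameNodeCalls.incrementAndGet();
            return Arrays.asList("datanode-1", "datanode-2");
        };

        ClientLocationCache cache = new ClientLocationCache(100, 1000L);
        cache.get("/job.xml", 0L, fetch);    // miss: goes to the namenode
        cache.get("/job.xml", 500L, fetch);  // hit: served from the cache
        cache.get("/job.xml", 2000L, fetch); // entry expired: fetched again
        System.out.println("namenode calls: " + nameNodeCalls.get());
    }
}
```

With something like this, several streams opened on the same file by one
client would share a single lookup instead of each asking the namenode.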

On Fri, Jun 11, 2010 at 5:02 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
> It is cached per input stream - see DFSInputStream.locatedBlocks,
> prefetchSize, etc.
>
> -Todd
> On Thu, Jun 10, 2010 at 11:43 PM, Jeff Zhang <[EMAIL PROTECTED]> wrote:
>>
>> Hi all,
>>
>> According to the GFS paper, GFS caches metadata in the client.
>> But when I check the Hadoop source code, it seems that Hadoop does not
>> cache it on the client side. I just want to make sure whether I am right.
>> Also, is anyone working on this? One advantage of caching metadata on
>> the client side that I can think of is that the tasktracker fetches
>> job.xml from HDFS. Most of the time we run multiple tasks on one node,
>> so if the tasktracker caches the metadata, it can reduce
>> communication with the namenode.
>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

--
Best Regards

Jeff Zhang