MapReduce, mail # user - Re: distributed cache


Re: distributed cache
Lin Ma 2012-12-22, 12:46
Thanks Kai. What is the purpose of using a higher replication count?

regards,
Lin

On Sat, Dec 22, 2012 at 8:44 PM, Kai Voigt <[EMAIL PROTECTED]> wrote:

> Hi,
>
> On 22.12.2012 at 13:03, Lin Ma <[EMAIL PROTECTED]> wrote:
>
> > I want to confirm that when a mapper or reducer on a task node accesses a
> distributed cache file, the file resides on disk, not in memory. I just
> want to make sure the distributed cache file is not fully loaded into memory,
> where it would compete for memory with the mapper/reducer tasks. Is that correct?
>
>
> Yes, you are correct. The JobTracker will put files for the distributed
> cache into HDFS with a higher replication count (10 by default). Whenever a
> TaskTracker needs those files for a task it is launching locally, it will
> fetch a copy to its local disk. So it won't need to do this again for
> future tasks on this node. After a job is done, all local copies and the
> HDFS copies of files in the distributed cache are cleaned up.
>
> Kai
>
> --
> Kai Voigt
> [EMAIL PROTECTED]
>
>
>
>
>
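
To make the behaviour described above concrete, here is a toy sketch in plain Python. It is not Hadoop code and does not use any Hadoop API: NodeLocalCache, localize, count_lines and demo are hypothetical names made up for illustration, one temporary directory stands in for the shared store (HDFS) and another for a node's local disk. It only shows the two points from the thread: the file is copied to local disk once per node and reused by later tasks, and a task can stream it from disk instead of loading it fully into memory.

```python
import os
import shutil
import tempfile


class NodeLocalCache:
    """Toy stand-in for one worker node's local cache directory (hypothetical)."""

    def __init__(self, local_dir):
        self.local_dir = local_dir
        self.fetch_count = 0  # how many times a file was copied to this node

    def localize(self, shared_path):
        """Copy shared_path to local disk on first use; later calls reuse the copy."""
        local_path = os.path.join(self.local_dir, os.path.basename(shared_path))
        if not os.path.exists(local_path):
            shutil.copyfile(shared_path, local_path)
            self.fetch_count += 1
        return local_path


def count_lines(path):
    """Stream the file line by line, so only one line is held in memory at a time."""
    with open(path, "r") as handle:
        return sum(1 for _ in handle)


def demo():
    shared_dir = tempfile.mkdtemp()  # plays the role of the shared store (HDFS)
    node_dir = tempfile.mkdtemp()    # plays the role of one node's local disk
    try:
        shared_file = os.path.join(shared_dir, "lookup.txt")
        with open(shared_file, "w") as handle:
            handle.write("a\nb\nc\n")
        cache = NodeLocalCache(node_dir)
        # Two "tasks" on the same node ask for the same cache file.
        first = count_lines(cache.localize(shared_file))
        second = count_lines(cache.localize(shared_file))
        return first, second, cache.fetch_count
    finally:
        shutil.rmtree(shared_dir)
        shutil.rmtree(node_dir)
```

Calling demo() runs two simulated tasks against the same node: both read the same three lines, and the copy from the shared directory happens only on the first request. Whether a real task keeps memory use low still depends on how the task itself reads the localized file.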