HDFS, mail # user - Re: Map Tasks do not obey data locality principle........


Harsh J 2013-05-15, 21:01
Also, does your custom FS report block locations in exactly the same
format as HDFS does?
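
To make the question above concrete: as far as I know, in Hadoop 1.x the
input format builds its splits from FileSystem.getFileBlockLocations(), and
the JobTracker compares the hosts in each returned BlockLocation with the
hostnames the TaskTrackers registered under. HDFS fills in "host:port"
names plus bare hostnames, while the base FileSystem implementation (again,
as far as I recall) just reports "localhost", which never matches a remote
TaskTracker. The plain-Java sketch below only models the shape of that
answer. LocalitySketch, Location, locate, the file-to-host map and the port
50010 are all hypothetical stand-ins, not Hadoop classes; a real fix would
override getFileBlockLocations() in the custom FileSystem and return
org.apache.hadoop.fs.BlockLocation objects built the same way.

```java
import java.util.Map;

// Illustrative only: models what a custom FileSystem could report per file.
final class LocalitySketch {

    // Stand-in for org.apache.hadoop.fs.BlockLocation (not the real class).
    static final class Location {
        final String[] names;  // "host:port" entries, the form HDFS reports
        final String[] hosts;  // bare hostnames, compared with TaskTracker hostnames
        final long offset;     // start of the block within the file
        final long length;     // length of the block

        Location(String[] names, String[] hosts, long offset, long length) {
            this.names = names;
            this.hosts = hosts;
            this.offset = offset;
            this.length = length;
        }
    }

    // Assumption: each file lives entirely on one slave and is treated as a
    // single block. fileToHost is a hypothetical lookup the custom FS keeps.
    static Location[] locate(Map<String, String> fileToHost, String path,
                             long start, long len, long fileLen, int port) {
        String host = fileToHost.get(path);
        // Unknown file or a range outside the file: no locations to report.
        if (host == null || start < 0 || len <= 0 || start >= fileLen) {
            return new Location[0];
        }
        // Use the same hostname the TaskTracker on that machine registers
        // with. Reporting an IP address while the TaskTracker registers a
        // hostname (or the reverse) would defeat the match.
        return new Location[] {
            new Location(new String[] { host + ":" + port },
                         new String[] { host },
                         0L, fileLen)
        };
    }
}
```

With a map that sends /dir1/A to slave1, locate(...) reports slave1 as the
only host for A, which is the hint the scheduler needs to prefer the
TaskTracker on slave1 for that map task.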

On Tue, May 14, 2013 at 4:25 PM, Agarwal, Nikhil
<[EMAIL PROTECTED]> wrote:
> Hi,
>
>
>
> I have a 3-node cluster, with the JobTracker running on one machine and
> TaskTrackers on the other two (say, slave1 and slave2). Instead of using
> HDFS, I have written my own FileSystem implementation. Unlike HDFS, it
> cannot provide a shared filesystem view to the JobTracker and the
> TaskTrackers, so I mounted the root container of slave2 on a directory in
> slave1 (NFS mount). With this setup I can submit an MR job to the
> JobTracker with an input path such as my_scheme://slave1_IP:Port/dir1.
> The job runs successfully, but data locality is not honored. For example,
> if files A, B, C are kept on slave1 and D, E, F on slave2, then the map
> tasks for A, B, C should be scheduled on the TaskTracker running on slave1
> and those for D, E, F on the one running on slave2. Instead, map tasks are
> scheduled seemingly at random on either TaskTracker. If the map task for
> file A runs on the TaskTracker on slave2, then file A has to be fetched
> over the network by slave2.
>
>
>
> How do I prevent this from happening?
>
>
>
> Thanks,
>
> Nikhil
>
>
>
>

--
Harsh J