MapReduce >> mail # user >> RE: Running Hadoop v2 clustered mode MR on an NFS mounted filesystem


java8964 2013-12-20, 14:28
Re: Running Hadoop v2 clustered mode MR on an NFS mounted filesystem
Yong raises an important issue:  You have thrown out the I/O advantages
of HDFS and also thrown out the advantages of data locality. It would be
interesting to know why you are taking this approach.
Chris

On 12/20/2013 9:28 AM, java8964 wrote:
> I believe the "-fs local" should be removed too. The reason is that
> even if you have a dedicated JobTracker after removing "-jt local",
> with "-fs local" I believe all the mappers will still be run
> sequentially.
>
> "-fs local" will force MapReduce to run in "local" mode, which is
> really a test mode.
>
> What you can do is remove both "-fs local -jt local", but give the
> FULL URI of the input and output paths, to tell Hadoop that they are
> on the local filesystem instead of HDFS.
>
> "hadoop jar
> /hduser/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar
> wordcount file:///hduser/mount_point file:///results"
>
> Keep the following in mind:
>
> 1) The NFS mount needs to be available on all your task nodes, and
> mounted in the same way.
> 2) Even if you can do that, your shared storage will be your
> bottleneck. NFS won't scale well.
>
> Yong
>
> ------------------------------------------------------------------------
> Date: Fri, 20 Dec 2013 09:01:32 -0500
> Subject: Re: Running Hadoop v2 clustered mode MR on an NFS mounted
> filesystem
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
>
> I think most of your problem is coming from the options you are setting:
>
> "hadoop jar
> /hduser/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar
> wordcount *-fs local -jt local* /hduser/mount_point/  /results"
>
> You appear to be directing Hadoop to run jobs in the *LOCAL* job
> runner and to read from the *LOCAL* filesystem. Drop the *-jt*
> argument and it should run in distributed mode if your cluster is set
> up right. You don't need to do anything special to point Hadoop
> towards an NFS location, other than set up the NFS location properly
> and, if you refer to it by name, make sure the name resolves to the
> right address. Hadoop doesn't care where the data is, as long as it
> can read from and write to it. The fact that the NFS location happens
> to be mounted as a local filesystem object doesn't matter: you could
> direct it to the local /hduser/ path with the -fs local option, and
> the data would end up on the NFS mount, because that's where the
> mount actually lives, or you could direct it to the absolute network
> location of the folder you want. It shouldn't make a difference.
>
> *Devin Suiter*
> Jr. Data Solutions Software Engineer
> 100 Sandusky Street | 2nd Floor | Pittsburgh, PA 15212
> Google Voice: 412-256-8556 | www.rdx.com <http://www.rdx.com/>
>
>
> On Fri, Dec 20, 2013 at 5:27 AM, Atish Kathpal
> <[EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>> wrote:
>
>     Hello
>
>     The picture below describes the deployment architecture I am
>     trying to achieve.
>     However, when I run the wordcount example code with the below
>     configuration, by issuing the command from the master node, I
>     notice only the master node spawning map tasks and completing the
>     submitted job. Below is the command I used:
>
>     *hadoop jar
>     /hduser/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar
>     wordcount -fs local -jt local /hduser/mount_point/  /results*
>
>     _Question: How can I use both Hadoop nodes to run MR, while
>     serving my data from the common NFS mount point that exposes my
>     backend filesystem? Has anyone tried such a setup before?_
>     Inline image 1
>
>     Thanks!
>
>
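Yong's suggestion above (drop "-fs local -jt local" and pass full file:// URIs instead) can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function names to_file_uri and wordcount_cmd are hypothetical, the jar path and directories are the ones quoted in the thread, and the script prints the command rather than running it, so it can be checked before submitting a job.

```shell
#!/bin/sh
# Sketch only: build the wordcount command with explicit file:// URIs,
# as suggested in the thread. Helper names are hypothetical.

# Jar path as quoted in the thread; adjust for your installation.
EXAMPLES_JAR=/hduser/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar

# Turn an absolute local path into a file:// URI.
# A single trailing slash is dropped; relative paths are rejected.
to_file_uri() {
    case "$1" in
        /)  printf 'file:///\n' ;;
        /*) printf 'file://%s\n' "${1%/}" ;;
        *)  echo "to_file_uri: need an absolute path: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the job command, without -fs local or -jt local.
wordcount_cmd() {
    in_uri=$(to_file_uri "$1") || return 1
    out_uri=$(to_file_uri "$2") || return 1
    echo "hadoop jar $EXAMPLES_JAR wordcount $in_uri $out_uri"
}

# Dry run with the paths from the original question.
wordcount_cmd /hduser/mount_point/ /results
```

The printed command only makes sense if the NFS mount exists at the same path on every task node, as point 1 in Yong's list notes; the sketch does not check that.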
