HDFS, mail # user - Re: built hadoop! please help with next steps?


Re: built hadoop! please help with next steps?
Sandy Ryza 2013-05-31, 22:22
I've had success importing all of the leaf-level Maven projects as
"Existing Maven Projects" using the Eclipse Maven plugin.  I've also gotten
things to work without the plugin, using some combination of
mvn eclipse:eclipse, pointing Eclipse at the m2repo, and using the directory
that contains the top-level pom.xml as my Eclipse workspace directory.
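
A rough sketch of the non-plugin route, under some assumptions: the initial
mvn install step and the eclipse:configure-workspace goal (the
maven-eclipse-plugin goal that defines the M2_REPO classpath variable in a
workspace) are guesses at what "some combination" needs in practice, and
HADOOP_SRC, ECLIPSE_WORKSPACE and DRY_RUN are placeholder names.  By default
the script only prints the commands it would run.

```shell
#!/bin/bash
# Sketch only. HADOOP_SRC, ECLIPSE_WORKSPACE and DRY_RUN are hypothetical names.
HADOOP_SRC="${HADOOP_SRC:-$PWD}"                       # directory with the top pom.xml
ECLIPSE_WORKSPACE="${ECLIPSE_WORKSPACE:-$HADOOP_SRC}"  # same directory doubles as workspace

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

eclipse_setup() {
  # Assumption: install the modules first so inter-module dependencies resolve.
  run mvn -f "$HADOOP_SRC/pom.xml" install -DskipTests
  # Generate .project/.classpath files for every module.
  run mvn -f "$HADOOP_SRC/pom.xml" eclipse:eclipse -DskipTests
  # Assumption: define the M2_REPO classpath variable in the workspace.
  run mvn -Declipse.workspace="$ECLIPSE_WORKSPACE" eclipse:configure-workspace
}

eclipse_setup
```

Running it with DRY_RUN=0 executes the commands instead of only printing
them; afterwards the modules can be imported with "Existing Projects into
Workspace".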

On Fri, May 31, 2013 at 3:18 PM, John Lilley <[EMAIL PROTECTED]> wrote:

> Sandy,
>
> Thanks for all of the tips, I will try this over the weekend.  Regarding
> the last question, I am still trying to get the source loaded into Eclipse
> in a manner that facilitates easier browsing, symbol search, editing, etc.
> Perhaps I am just missing some obvious FAQ?  This is leading up to
> modifying and debugging the “shell” ApplicationMaster sample.  This page:
>
> http://stackoverflow.com/questions/11007423/developing-testing-and-debugging-hadoop-map-reduce-jobs-with-eclipse
>
> looks promising as a Hadoop-in-Eclipse strategy, but it is over a year old
> and I’m not sure if it applies to Hadoop 2.0 and YARN.
>
> John
>
> *From:* Sandy Ryza [mailto:[EMAIL PROTECTED]]
> *Sent:* Friday, May 31, 2013 12:13 PM
> *To:* [EMAIL PROTECTED]
> *Subject:* Re: built hadoop! please help with next steps?
>
> Hi John,
>
> Here's how I deploy/debug Hadoop locally:
>
> To build and tar Hadoop:
>
>   mvn clean package -Pdist -Dtar -DskipTests=true
>
> The tar will be located in the project directory under
> hadoop-dist/target/.  I untar it into my deploy directory.
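
The untar step could be scripted along these lines.  This is a sketch, not
part of Hadoop: untar_dist and its two arguments are made-up names, the
hadoop-*.tar.gz pattern is an assumption about the tarball's file name, and
stripping the top-level directory is one possible layout choice.

```shell
#!/bin/bash
# Sketch only: unpack the newest dist tarball into a deploy directory.
# Usage: untar_dist <hadoop source dir> <deploy dir>   (hypothetical helper)
untar_dist() {
  local src="$1" deploy="$2" tarball
  # Pick the most recently modified tarball under hadoop-dist/target/.
  tarball=$(ls -t "$src"/hadoop-dist/target/hadoop-*.tar.gz 2>/dev/null | head -n 1)
  if [ -z "$tarball" ]; then
    echo "no tarball found under $src/hadoop-dist/target/" >&2
    return 1
  fi
  mkdir -p "$deploy"
  # Assumption: drop the top-level hadoop-<version>/ directory so that
  # bin/, sbin/ and etc/ land directly in the deploy directory.
  tar -xzf "$tarball" -C "$deploy" --strip-components=1
  echo "unpacked $tarball into $deploy"
}

# Example: untar_dist "$HOME/src/hadoop" "$HOME/hadoop-deploy"
```

Without --strip-components=1 the build ends up in a versioned subdirectory
of the deploy directory, and the scripts that follow would be run from
there instead.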
>
> I then copy these scripts into the same directory:
>
> hadoop-dev-env.sh:
> ---
> #!/bin/bash
> # Run from the directory that contains bin/ and sbin/; every Hadoop
> # component is pointed at that one directory.
> export HADOOP_DEV_HOME="$(pwd)"
> export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}
> export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}
> export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}
> export YARN_HOME=${HADOOP_DEV_HOME}
> export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
>
> hadoop-dev-setup.sh:
> ---
> #!/bin/bash
> # One-time setup: format the NameNode's HDFS storage.
> source ./hadoop-dev-env.sh
> bin/hadoop namenode -format
>
> hadoop-dev.sh:
> ---
> #!/bin/bash
> # Usage: ./hadoop-dev.sh start|stop
> source ./hadoop-dev-env.sh
> sbin/hadoop-daemon.sh "$1" namenode
> sbin/hadoop-daemon.sh "$1" datanode
> sbin/yarn-daemon.sh "$1" resourcemanager
> sbin/yarn-daemon.sh "$1" nodemanager
> sbin/mr-jobhistory-daemon.sh "$1" historyserver
> sbin/httpfs.sh "$1"
>
> I copy all the files in <deploy directory>/conf into my conf directory,
> <deploy directory>/etc/hadoop.  The advantage of using a directory that's
> not the /conf directory is that it won't be overwritten the next time you
> untar a new build.  Lastly, I copy the minimal site configuration into the
> conf files.  For the sake of brevity, I won't include the properties in
> full XML format, but here are the ones I set:
>
> yarn-site.xml:
>   yarn.nodemanager.aux-services = mapreduce.shuffle
>   yarn.nodemanager.aux-services.mapreduce.shuffle.class
>     = org.apache.hadoop.mapred.ShuffleHandler
>   yarn.resourcemanager.scheduler.class
>     = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
>
> mapred-site.xml:
>   mapreduce.framework.name = yarn
>
> core-site.xml:
>   fs.default.name = hdfs://localhost:9000
>
> hdfs-site.xml:
>   dfs.replication = 1
>   dfs.permissions = false
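
For anyone who wants the full XML, the properties above can be expanded
into the usual <configuration>/<property> layout with a small script.  This
is a sketch: write_site_xml and write_minimal_conf are made-up helper names,
and the values are written without XML escaping, which is fine for the
settings listed here but not in general.

```shell
#!/bin/bash
# Sketch only: write_site_xml and write_minimal_conf are hypothetical helpers.
# Usage: write_site_xml <file> name=value [name=value ...]
write_site_xml() {
  local file="$1" pair
  shift
  {
    echo '<?xml version="1.0"?>'
    echo '<configuration>'
    for pair in "$@"; do
      echo '  <property>'
      echo "    <name>${pair%%=*}</name>"    # text before the first '='
      echo "    <value>${pair#*=}</value>"   # text after the first '='
      echo '  </property>'
    done
    echo '</configuration>'
  } > "$file"
}

# Write the four minimal site files into the given conf directory.
write_minimal_conf() {
  local dir="$1"
  mkdir -p "$dir"
  write_site_xml "$dir/yarn-site.xml" \
    "yarn.nodemanager.aux-services=mapreduce.shuffle" \
    "yarn.nodemanager.aux-services.mapreduce.shuffle.class=org.apache.hadoop.mapred.ShuffleHandler" \
    "yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler"
  write_site_xml "$dir/mapred-site.xml" "mapreduce.framework.name=yarn"
  write_site_xml "$dir/core-site.xml" "fs.default.name=hdfs://localhost:9000"
  write_site_xml "$dir/hdfs-site.xml" "dfs.replication=1" "dfs.permissions=false"
}

# Example: write_minimal_conf "$HADOOP_CONF_DIR"
```

Note that this overwrites any existing site files in the target directory,
so it suits a fresh deploy directory rather than a hand-tuned one.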
>
> Then, to format HDFS and start our cluster, we can simply do:
>
>   ./hadoop-dev-setup.sh
>   ./hadoop-dev.sh start
>
> To stop it:
>
>   ./hadoop-dev.sh stop
>
> Once I have this set up, for quicker iteration, I have some scripts that
> build submodules (sometimes all of mapreduce, sometimes just the
> resourcemanager) and copy the updated jars into my setup.
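
Those scripts aren't included here, but the copy step might look something
like the sketch below.  copy_module_jars is a made-up helper, and the module
path and share/hadoop/yarn destination in the example are assumptions about
the source tree and dist layout, so adjust them to match your build.

```shell
#!/bin/bash
# Sketch only: copy_module_jars is a hypothetical helper.
# Usage: copy_module_jars <module dir> <destination dir>
copy_module_jars() {
  local module="$1" dest="$2" jar copied=0
  for jar in "$module"/target/*.jar; do
    [ -e "$jar" ] || continue                 # glob matched nothing
    case "$jar" in
      *-sources.jar|*-tests.jar) continue ;;  # skip source and test jars
    esac
    cp "$jar" "$dest"/ && copied=$((copied + 1))
  done
  echo "copied $copied jar(s) from $module to $dest"
}

# Example, run from the source root after editing the ResourceManager:
#   (cd hadoop-yarn-project && mvn package -DskipTests)
#   copy_module_jars \
#     hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager \
#     "$HADOOP_DEV_HOME/share/hadoop/yarn"
```

After copying, stopping and starting the cluster with hadoop-dev.sh picks
up the new jars.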