When running Hadoop in pseudo-distributed mode, what directory should I use for hadoop.tmp.dir?
jeremy p 2012-10-05, 17:21
Harsh J 2012-10-05, 17:58
Re: When running Hadoop in pseudo-distributed mode, what directory should I use for hadoop.tmp.dir?
Thank you, that worked!

On Fri, Oct 5, 2012 at 10:58 AM, Harsh J <[EMAIL PROTECTED]> wrote:

> On 0.20.x or 1.x based releases, do not use a file:/// prefix for
> hadoop.tmp.dir. That won't work. Remove it and things should work, I
> guess.
>
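A minimal core-site.xml sketch of the fix Harsh describes, i.e. the asker's directory with the file:/// scheme dropped (/hadoop_temp is the path from the question quoted below, not a recommendation):

<property>
  <name>hadoop.tmp.dir</name>
  <!-- plain local path, no file:/// prefix on 0.20.x / 1.x -->
  <value>/hadoop_temp</value>
</property>
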
> And yes, for production, tweak the specific configs (dfs.name.dir,
> dfs.data.dir, mapred.local.dir, mapred.system.dir (DFS), and
> mapreduce.jobtracker.staging.root.dir (DFS)) to point at specific
> paths rather than paths relative to hadoop.tmp.dir, and keep
> hadoop.tmp.dir at /tmp or another temporary local store (non-DFS).
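A hypothetical production-style layout along those lines; the local paths under /data are illustrative only, and the two properties Harsh marks as DFS take paths on HDFS rather than on the local disk:

<!-- hdfs-site.xml: explicit local paths for NameNode and DataNode storage -->
<property>
  <name>dfs.name.dir</name>
  <value>/data/dfs/name</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/data/dfs/data</value>
</property>

<!-- mapred-site.xml: local scratch space, plus the DFS-resident directories -->
<property>
  <name>mapred.local.dir</name>
  <value>/data/mapred/local</value>
</property>
<property>
  <name>mapred.system.dir</name>
  <value>/mapred/system</value>
</property>
<property>
  <name>mapreduce.jobtracker.staging.root.dir</name>
  <value>/user</value>
</property>

<!-- core-site.xml: hadoop.tmp.dir can then stay on temporary local storage -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/tmp/hadoop-${user.name}</value>
</property>
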
>
> On Fri, Oct 5, 2012 at 10:51 PM, jeremy p
> <[EMAIL PROTECTED]> wrote:
> > By default, Hadoop sets hadoop.tmp.dir to your /tmp folder. This is a
> > problem, because /tmp gets wiped out by Linux when you reboot, leading
> > to this lovely error from the JobTracker:
> >
> > 2012-10-05 07:41:13,618 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 0 time(s).
> > ...
> > 2012-10-05 07:41:22,636 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 9 time(s).
> > 2012-10-05 07:41:22,643 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: null
> > java.net.ConnectException: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused
> >     at org.apache.hadoop.ipc.Client.wrapException(Client.java:767)
> > The only way I've found to fix this is to reformat your name node,
> > which rebuilds the /tmp/hadoop-root folder, which of course gets
> > wiped out again when you reboot.
> >
> > So I went ahead and created a folder called /hadoop_temp and gave
> > all users read/write access to it. I then set this property in my
> > core-site.xml:
> >
> > <property>
> > <name>hadoop.tmp.dir</name>
> > <value>file:///hadoop_temp</value>
> > </property>
> >
> > When I re-formatted my namenode, Hadoop seemed happy, giving me this
> > message:
> >
> > 12/10/05 07:58:54 INFO common.Storage: Storage directory file:/hadoop_temp/dfs/name has been successfully formatted.
> >
> > However, when I looked at /hadoop_temp, I noticed that the folder was
> > empty. And then when I restarted Hadoop and checked my JobTracker
> > log, I saw this:
> >
> > 2012-10-05 08:02:41,988 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 0 time(s).
> > ...
> > 2012-10-05 08:02:51,010 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 9 time(s).
> > 2012-10-05 08:02:51,011 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: null
> > java.net.ConnectException: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused
> >
> > And when I checked my namenode log, I saw this:
> >
> > 2012-10-05 08:00:31,206 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /opt/hadoop/hadoop-0.20.2/file:/hadoop_temp/dfs/name does not exist.
> > 2012-10-05 08:00:31,212 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
> > org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /opt/hadoop/hadoop-0.20.2/file:/hadoop_temp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
> > So, clearly I didn't configure something right. Hadoop still expects
> > to see its files in the /tmp folder even though I set hadoop.tmp.dir
> > to /hadoop_temp in core-site.xml. What did I do wrong? What's the
> > accepted "right" value for hadoop.tmp.dir?
> >
> > Bonus question : what should I use for hbase.tmp.dir?
> >
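For the bonus question, the same principle would presumably apply to HBase: point hbase.tmp.dir at a persistent local directory rather than its default under /tmp. A minimal hbase-site.xml sketch; the /hbase_temp path is illustrative, not from this thread:

<!-- hbase-site.xml: /hbase_temp is an illustrative persistent local path -->
<property>
  <name>hbase.tmp.dir</name>
  <value>/hbase_temp</value>
</property>
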
> > System info:
> >
> > Ubuntu 12.04, Apache Hadoop 0.20.2, Apache HBase 0.92.1
> >
> > Thanks for taking a look!
> >
> > --Jeremy
>
>
>
> --
> Harsh J
>