Hadoop >> mail # general >> JVM is crashing for systems with NFS


Mihail Ionescu 2012-12-08, 17:01
Ted Dunning 2012-12-08, 18:00
Mihail Ionescu 2012-12-10, 11:09
Re: JVM is crashing for systems with NFS
Hi,

I have experienced this with non-Hadoop applications, although we
typically saw the SIGBUS with Java 5. When it received the signal, the
JVM was trying to load a class from a jar on the NFS volume. In our
experience with whatever version of Java 6 we were using, the JVM
would throw ClassNotFoundExceptions instead of exiting due to a
SIGBUS. We were using SLES 10 or 11. In the end, we decided
not to keep jar files on NFS.
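
If it helps, here is a rough sketch of how one might check whether any
classpath entry of a running JVM sits on an NFS mount. It is only an
illustration, not something from our setup: the class name
NfsClasspathCheck and its methods are made up for this example, it needs
Java 7 or later for java.nio.file (newer than the Java 5/6 versions
discussed here), and it assumes that FileStore.type() reports a type
starting with "nfs" (for example "nfs" or "nfs4") for NFS mounts on Linux.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: lists classpath entries that live on an NFS mount.
public class NfsClasspathCheck {

    // Assumption: NFS mounts report a file store type starting with "nfs".
    public static boolean isNetworkFsType(String type) {
        return type != null && type.toLowerCase(Locale.ROOT).startsWith("nfs");
    }

    // Returns the entries of the given classpath string that exist on disk
    // and whose backing file store looks like NFS.
    public static List<String> entriesOnNfs(String classpath) {
        List<String> hits = new ArrayList<>();
        for (String entry : classpath.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue;
            }
            try {
                Path path = Paths.get(entry);
                if (!Files.exists(path)) {
                    continue; // skip wildcards and missing entries
                }
                if (isNetworkFsType(Files.getFileStore(path).type())) {
                    hits.add(entry);
                }
            } catch (IOException | InvalidPathException e) {
                System.err.println("Could not inspect " + entry + ": " + e.getMessage());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Inspect the classpath of the JVM running this check.
        for (String entry : entriesOnNfs(System.getProperty("java.class.path"))) {
            System.out.println("On NFS: " + entry);
        }
    }
}
```

Running it with the same classpath as the datanode or tasktracker would
show which jars, if any, are being loaded from the shared volume.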

Brock

On Sat, Dec 8, 2012 at 11:01 AM, Mihail Ionescu <[EMAIL PROTECTED]> wrote:
> I have a small cluster of 15 machines running hadoop-1-0-2. Each machine
> runs kernel 2.6.35 and has a root disk mounted over NFS (all machines share
> the same root file system) and a local disk (mounted under
> /mnt/localdisk). I installed Hadoop under /mnt/localdisk/hadoop, with the
> conf directory shared by all machines (so that I can change the
> configuration for all machines easily). I am using jdk 1.6.23,
> installed locally on /mnt/localdisk/jdk. Each machine runs a datanode and a
> tasktracker, and each tasktracker has 2 mapper slots and 2
> reducer slots.
>
> The problem is that, after running various map-reduce tasks, the JVM crashes
> pretty frequently on many machines. I could not find a pattern:
> sometimes the datanode crashes, other times the tasktracker, and sometimes
> both. Each crash generates an hs_err file with SIGBUS 0x7. I can post
> the contents of that file if needed, but I could not find anything
> interesting there.
>
> Has anyone had this problem? Could it be because the root file system is
> shared across all machines and Hadoop tries to write some files to /tmp or
> somewhere similar? Any help would be greatly
> appreciated.
>
> Thanks,
>
> Mihail

--
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/