Re: problem running multiple native mode map reduce processes concurrently
Please post your Hadoop version (command: hadoop version).
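
If the concurrent runs are colliding on Hadoop's default local scratch
space, isolating it per process may help: hadoop.tmp.dir defaults to
/tmp/hadoop-${user.name}, mapred.local.dir defaults underneath it, and
"file.out" is an intermediate map-output file kept there, so two
local-mode jobs started by the same user can plausibly clobber each
other's files. A rough, untested sketch (the driver class name
"MyJobDriver" is a placeholder, and the -D options only reach the job
Configuration if the driver goes through ToolRunner /
GenericOptionsParser):

```shell
# Give each local-mode run its own private scratch directory.
RUN_DIR=$(mktemp -d "${TMPDIR:-/tmp}/mrlocal.XXXXXX")

# Build the launch command; printed here rather than executed, since the
# jar path and "MyJobDriver" class are placeholders, not from the thread.
CMD="java -cp \$MY_HADOOP_JARS:mybuild/app_under_test.jar MyJobDriver \
 -Dhadoop.tmp.dir=$RUN_DIR \
 -Dmapred.local.dir=$RUN_DIR/mapred/local"
echo "$CMD"
```

With a unique scratch directory per invocation, sleeping between
launches should not be necessary.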

On Thu, Mar 21, 2013 at 10:59 PM, Derrick H. Karimi
<[EMAIL PROTECTED]> wrote:
> Anybody have any ideas?  How can I safely run two native mode map reduce jobs
> on one machine at the same time?
>
>
>
> --Derrick H. Karimi
>
> --Software Developer, SEI Innovation Center
>
> --Carnegie Mellon University
>
>
>
> From: Derrick H. Karimi
> Sent: Tuesday, March 19, 2013 10:55 PM
> To: '[EMAIL PROTECTED]'
> Subject: problem running multiple native mode map reduce processes
> concurrently
>
>
>
> Hi,
>
>
>
> I have a MapReduce program I have written and have used it on top of a
> Hadoop cluster with success.  During development, for quick tests, and when
> the cluster is not available I run it on machines that have no access to a
> Hadoop cluster.  I do this with regular command line invocation
>
>
>
> java -cp $MY_HADOOP_JARS:mybuild/app_under_test.jar
>
>
>
> This works fine until I attempt to run more than one at a time.  When I
> launch several at once, I intermittently get failures.  (Each invocation
> uses a separate copy of the jars and has its own working directory and
> input/output area; the runs are fully self-contained and do not share
> anything.  The machines have plenty of disk space too.)  Most commonly I
> see two exceptions in my job’s stderr output:
>
>
>
> org.apache.hadoop.util.DiskChecker$DiskErrorException: "Could not find
> output/file.out in any of the configured local directories"
>
>
>
> When I see this error the job appears to continue, but from the output I
> can tell that several of my input files were not processed.  I have nothing
> called “output/file.out” in my job.
>
>
>
> I do not have the other error text handy at the moment, but it appears to
> be an XML parser error at job startup, on some file in the /tmp directory
> that is not referenced anywhere in my job.  Here I assume that multiple
> instances of the native mode implementation of map reduce are trying to
> write to the same file at startup and it gets corrupted.  In these cases
> the job fails and I get no output.  I theorize I could work around this
> error by sleeping a few seconds between launching my processes.
>
>
>
> I expected to be able to run more than one of these processes at the same
> time.  It appears I cannot.  Does anyone have any suggestions that would
> help me do this?
>
>
>
> --Derrick H. Karimi
>
> --Software Developer, SEI Innovation Center
>
> --Carnegie Mellon University
>
>

--
Harsh J