

Re: problem running multiple native mode map reduce processes concurrently
Please post your Hadoop version (command: hadoop version).

On Thu, Mar 21, 2013 at 10:59 PM, Derrick H. Karimi
<[EMAIL PROTECTED]> wrote:
> Anybody have any ideas?  How can I safely run two native mode map reduce
> jobs on one machine at the same time?
>
>
>
> --Derrick H. Karimi
>
> --Software Developer, SEI Innovation Center
>
> --Carnegie Mellon University
>
>
>
> From: Derrick H. Karimi
> Sent: Tuesday, March 19, 2013 10:55 PM
> To: '[EMAIL PROTECTED]'
> Subject: problem running multiple native mode map reduce processes
> concurrently
>
>
>
> Hi,
>
>
>
> I have written a MapReduce program and have run it successfully on a
> Hadoop cluster.  During development, for quick tests, and when the cluster
> is not available, I run it on machines that have no access to a Hadoop
> cluster.  I do this with a regular command-line invocation:
>
>
>
> java -cp $MY_HADOOP_JARS:mybuild/app_under_test.jar
>
>
>
> This works fine until I attempt to run more than one at a time.  When I
> launch many at once, I intermittently get failures.  (Each invocation
> uses a separate copy of the jars and has its own working directory and
> input/output area; they are fully distributable and do not share anything.
> The machines have plenty of disk space too.)  Most commonly I get two
> exceptions in my job’s stderr output:
>
>
>
> org.apache.hadoop.util.DiskChecker$DiskErrorException: "Could not find
> output/file.out in any of the configured local directories"
>
>
>
> When I see this error, the job appears to continue, but in the output I
> can tell that several of my input files were not processed.  I have nothing
> called “output/file.out” in my job.
>
>
>
> The other error text I do not have handy at the moment, but it appears to be
> an XML parser error at job startup on some file in the /tmp directory that
> is not part of any file mentioned in my job.  Here I assume that multiple
> instances of the native mode implementation of map reduce are trying to
> write to the same file at startup and it gets corrupted.  In these cases the
> job fails and I do not get any output.  I theorize I can work around this
> error by sleeping a few seconds between launching my processes.
>
>
>
> I expected to be able to run more than one of these processes at the same
> time.  It appears I cannot.  Does anyone have any suggestions that would
> help me do this?
>
>
>
> --Derrick H. Karimi
>
> --Software Developer, SEI Innovation Center
>
> --Carnegie Mellon University
>
>

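While waiting on the version, here is a rough sketch of one way to keep concurrent local-mode runs apart. It assumes the failures come from the processes sharing a default scratch location under /tmp, which the thread has not confirmed. The property names hadoop.tmp.dir and mapred.local.dir are the Hadoop 1.x names and may differ on other versions. The helper build_command and the main class com.example.MyJob are hypothetical placeholders, and the "-D key=value" arguments only take effect if the job's driver parses generic options (for example through ToolRunner).

```python
import os
import tempfile


def build_command(classpath, main_class, job_args, scratch_root=None):
    """Build the argv for one local-mode run with its own scratch directory.

    Hypothetical helper: it only assembles the command line and creates the
    directory; it does not launch anything.
    """
    # A fresh directory per invocation, so no two runs share scratch files.
    scratch_dir = tempfile.mkdtemp(prefix="hadoop-local-", dir=scratch_root)
    local_dir = os.path.join(scratch_dir, "mapred", "local")

    argv = [
        "java", "-cp", classpath, main_class,
        # Assumption: the driver honors generic options (ToolRunner or
        # GenericOptionsParser). Otherwise set these on the Configuration
        # inside the job itself.
        "-D", "hadoop.tmp.dir=" + scratch_dir,
        "-D", "mapred.local.dir=" + local_dir,
    ]
    argv.extend(job_args)
    return argv, scratch_dir


if __name__ == "__main__":
    # Example only: print two command lines that could be started together.
    classpath = os.environ.get("MY_HADOOP_JARS", "") + ":mybuild/app_under_test.jar"
    for n in (1, 2):
        argv, _ = build_command(classpath, "com.example.MyJob",
                                ["input%d" % n, "output%d" % n])
        print(" ".join(argv))
```

Each printed command could then be started in its own process (for example with subprocess.Popen). If the errors persist with separate scratch directories, the shared-directory assumption is probably wrong.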
--
Harsh J