Re: Accumulo Map Reduce is not distributed
What about the main method that calls ToolRunner.run? If you have 4 jobs
being created, then you're calling run(String[]) or runOneTable() 4 times.
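
For illustration, a minimal single-submission driver along these lines (a
sketch only; it assumes Accumulo_FE_MR_Job exposes a public run(String[])
that delegates to runOneTable(), which the quoted code below does not show):

import org.apache.hadoop.util.ToolRunner;

public class Driver {
    public static void main(String[] args) throws Exception {
        // ToolRunner.run parses the generic Hadoop options and then calls
        // the Tool's run(String[]) exactly once; looping here (or calling
        // runOneTable() four times) would produce four back-to-back jobs.
        int exitCode = ToolRunner.run(new Accumulo_FE_MR_Job(), args);
        System.exit(exitCode);
    }
}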

On Fri, Nov 2, 2012 at 5:21 PM, Cornish, Duane C.
<[EMAIL PROTECTED]> wrote:

> Thanks for the prompt response, John!
>
> When I say that I’m pre-splitting my table, I mean I am using the
> tableOperations().addSplits(table, splits) command.  I have verified that
> this correctly splits my table into 4 tablets and that the tablets are
> distributed across my cloud before I start my map reduce job.
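
(For reference, a minimal sketch of such a pre-splitting call, assuming a
Connector named conn; the table name and split points below are placeholders,
not taken from the original post:)

import java.util.TreeSet;

import org.apache.accumulo.core.client.Connector;
import org.apache.hadoop.io.Text;

public class PreSplitSketch {
    static void addFourSplits(Connector conn) throws Exception {
        // Three split points yield four tablets, one per tablet server.
        // Real split points should match the table's row-key distribution.
        TreeSet<Text> splits = new TreeSet<Text>();
        splits.add(new Text("row25"));
        splits.add(new Text("row50"));
        splits.add(new Text("row75"));
        conn.tableOperations().addSplits("featless_img_table", splits);
    }
}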
>
> Now, I only kick off the job once, but it appears that 4 separate jobs run
> (one after the other).  The first one reaches 100% in its map phase (and,
> based on my output, only handled ¼ of the data), then the next job starts at
> 0% and reaches 100%, and so on.  So I think I’m “only running one mapper
> at a time in an MR job that has 4 mappers total.”  I have 2 mapper slots
> per node.  My Hadoop is set up so that one machine is the namenode and the
> other 3 are datanodes.  This gives me 6 slots total.  (This is not
> congruent to my Accumulo setup, where the master is also a slave, giving 4
> total slaves.)
>
> My map reduce job is not a chain job, so all 4 tablets should be able to
> run at the same time.
>
> Here is my job class code below:
>
> import org.apache.accumulo.core.security.Authorizations;
> import org.apache.accumulo.core.client.mapreduce.AccumuloOutputFormat;
> import org.apache.accumulo.core.client.mapreduce.AccumuloRowInputFormat;
> import org.apache.hadoop.conf.Configured;
> import org.apache.hadoop.io.DoubleWritable;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.mapreduce.Job;
> import org.apache.hadoop.util.Tool;
> import org.apache.log4j.Level;
>
> public class Accumulo_FE_MR_Job extends Configured implements Tool {
>
>     private void runOneTable() throws Exception {
>         System.out.println("Running Map Reduce Feature Extraction Job");
>
>         Job job = new Job(getConf(), getClass().getName());
>
>         job.setJarByClass(getClass());
>         job.setJobName("MRFE");
>
>         job.setInputFormatClass(AccumuloRowInputFormat.class);
>         AccumuloRowInputFormat.setZooKeeperInstance(job.getConfiguration(),
>                 HMaxConstants.INSTANCE,
>                 HMaxConstants.ZOO_SERVERS);
>
>         AccumuloRowInputFormat.setInputInfo(job.getConfiguration(),
>                 HMaxConstants.USER,
>                 HMaxConstants.PASSWORD.getBytes(),
>                 HMaxConstants.FEATLESS_IMG_TABLE,
>                 new Authorizations());
>
>         AccumuloRowInputFormat.setLogLevel(job.getConfiguration(), Level.FATAL);
>
>         job.setMapperClass(AccumuloFEMapper.class);
>         job.setMapOutputKeyClass(Text.class);
>         job.setMapOutputValueClass(DoubleWritable.class);
>
>         job.setNumReduceTasks(4);
>         job.setReducerClass(AccumuloFEReducer.class);
>         job.setOutputKeyClass(Text.class);
>         job.setOutputValueClass(Text.class);
>
>         job.setOutputFormatClass(AccumuloOutputFormat.class);
>         AccumuloOutputFormat.setZooKeeperInstance(job.getConfiguration(),
>                 HMaxConstants.INSTANCE,
>                 HMaxConstants.ZOO_SERVERS);
>         AccumuloOutputFormat.setOutputInfo(job.getConfiguration(),
>                 HMaxConstants.USER,
>                 HMaxConstants.PASSWORD.getBytes(),
>                 true,
>                 HMaxConstants.ALL_IMG_TABLE);