Accumulo user mailing list: Accumulo Map Reduce is not distributed


Cornish, Duane C. 2012-11-02, 20:53
John Vines 2012-11-02, 21:04
Cornish, Duane C. 2012-11-02, 21:21
Re: Accumulo Map Reduce is not distributed
What about the main method that calls ToolRunner.run? If you have 4 jobs
being created, then you're calling run(String[]) or runOneTable() 4 times.
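The driver pattern John is referring to looks something like the sketch below: `ToolRunner.run` (and hence `runOneTable()`) should be invoked exactly once per submission, so a loop around either call would produce one sequential job per iteration. This is an illustrative sketch, not code from the thread; the class name `SingleRunDriver` is hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver: submits the job exactly once. If main (or any caller)
// invoked ToolRunner.run or runOneTable() four times, four sequential jobs
// would appear in the jobtracker, matching the behavior described above.
public class SingleRunDriver extends Configured implements Tool {

    public int run(String[] args) throws Exception {
        // In the real job this is where runOneTable() would build and
        // submit the MapReduce job; stubbed out here.
        return 0;
    }

    public static void main(String[] args) throws Exception {
        // Called exactly once -> exactly one job submission.
        System.exit(ToolRunner.run(new Configuration(), new SingleRunDriver(), args));
    }
}
```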

On Fri, Nov 2, 2012 at 5:21 PM, Cornish, Duane C.
<[EMAIL PROTECTED]> wrote:

> Thanks for the prompt response John!
>
> When I say that I'm pre-splitting my table, I mean I am using the
> tableOperations().addSplits(table,splits) command.  I have verified that
> this is correctly splitting my table into 4 tablets and it is being
> distributed across my cloud before I start my map reduce job.
>
> Now, I only kick off the job once, but it appears that 4 separate jobs run
> (one after the other).  The first one reaches 100% in its map phase (and
> based on my output only handled ¼ of the data), then the next job starts at
> 0% and reaches 100%, and so on.  So I think I'm "only running one mapper
> at a time in an MR job that has 4 mappers total."  I have 2 mapper slots
> per node.  My Hadoop is set up so that one machine is the namenode and the
> other 3 are datanodes.  This gives me 6 slots total.  (This is not
> congruent to my Accumulo setup, where the master is also a slave, giving 4
> total slaves.)
>
> My map reduce job is not a chain job, so all 4 tablets should be able to
> run at the same time.
>
> Here is my job class code below:
>
> import org.apache.accumulo.core.security.Authorizations;
> import org.apache.accumulo.core.client.mapreduce.AccumuloOutputFormat;
> import org.apache.accumulo.core.client.mapreduce.AccumuloRowInputFormat;
> import org.apache.hadoop.conf.Configured;
> import org.apache.hadoop.io.DoubleWritable;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.mapreduce.Job;
> import org.apache.hadoop.util.Tool;
> import org.apache.log4j.Level;
>
> public class Accumulo_FE_MR_Job extends Configured implements Tool {
>
>     private void runOneTable() throws Exception {
>         System.out.println("Running Map Reduce Feature Extraction Job");
>
>         Job job = new Job(getConf(), getClass().getName());
>
>         job.setJarByClass(getClass());
>         job.setJobName("MRFE");
>
>         job.setInputFormatClass(AccumuloRowInputFormat.class);
>         AccumuloRowInputFormat.setZooKeeperInstance(job.getConfiguration(),
>                 HMaxConstants.INSTANCE,
>                 HMaxConstants.ZOO_SERVERS);
>
>         AccumuloRowInputFormat.setInputInfo(job.getConfiguration(),
>                 HMaxConstants.USER,
>                 HMaxConstants.PASSWORD.getBytes(),
>                 HMaxConstants.FEATLESS_IMG_TABLE,
>                 new Authorizations());
>
>         AccumuloRowInputFormat.setLogLevel(job.getConfiguration(), Level.FATAL);
>
>         job.setMapperClass(AccumuloFEMapper.class);
>         job.setMapOutputKeyClass(Text.class);
>         job.setMapOutputValueClass(DoubleWritable.class);
>
>         job.setNumReduceTasks(4);
>         job.setReducerClass(AccumuloFEReducer.class);
>         job.setOutputKeyClass(Text.class);
>         job.setOutputValueClass(Text.class);
>
>         job.setOutputFormatClass(AccumuloOutputFormat.class);
>         AccumuloOutputFormat.setZooKeeperInstance(job.getConfiguration(),
>                 HMaxConstants.INSTANCE,
>                 HMaxConstants.ZOO_SERVERS);
>         AccumuloOutputFormat.setOutputInfo(job.getConfiguration(),
>                 HMaxConstants.USER,
>                 HMaxConstants.PASSWORD.getBytes(),
>                 true,
>                 HMaxConstants.ALL_IMG_TABLE);
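For reference, the pre-splitting step described earlier in the thread uses `TableOperations.addSplits`. A minimal sketch is below; the instance name, ZooKeeper host, credentials, and split points are hypothetical placeholders, not values from the thread.

```java
import java.util.TreeSet;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.hadoop.io.Text;

// Hypothetical pre-splitting sketch: three split points yield four tablets,
// which is the tablet count the poster reports before starting the MR job.
public class PreSplitExample {
    public static void main(String[] args) throws Exception {
        Connector conn = new ZooKeeperInstance("instance", "zkhost:2181")
                .getConnector("user", "password".getBytes());

        TreeSet<Text> splits = new TreeSet<Text>();
        splits.add(new Text("row_1"));
        splits.add(new Text("row_2"));
        splits.add(new Text("row_3"));

        // Splits the table into 4 tablets, to be balanced across tablet servers.
        conn.tableOperations().addSplits("featless_img_table", splits);
    }
}
```

With `AccumuloRowInputFormat`, each tablet typically maps to one input split, so a 4-tablet table should produce 4 map tasks in a single job rather than 4 separate jobs.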
David Medinets 2012-11-03, 03:49
Cornish, Duane C. 2012-11-05, 13:56
John Vines 2012-11-05, 14:13
Billie Rinaldi 2012-11-05, 14:40
Cornish, Duane C. 2012-11-05, 14:46
Billie Rinaldi 2012-11-05, 15:03
Cornish, Duane C. 2012-11-05, 16:54
Krishmin Rai 2012-11-05, 17:14
Billie Rinaldi 2012-11-05, 17:18
Cornish, Duane C. 2012-11-06, 13:45
David Medinets 2012-11-06, 14:34
Cornish, Duane C. 2012-11-06, 14:53
Billie Rinaldi 2012-11-06, 15:19
David Medinets 2012-11-05, 15:16