Accumulo >> mail # user >> Programmatically invoking a Map/Reduce job


Re: Programmatically invoking a Map/Reduce job
Spot on - thanks Billie, that did the trick!
On Thu, Jan 17, 2013 at 1:57 PM, Billie Rinaldi <[EMAIL PROTECTED]> wrote:

> On Thu, Jan 17, 2013 at 11:16 AM, Mike Hugo <[EMAIL PROTECTED]> wrote:
>
>> Thanks Billie!
>>
>> Setting "mapred.job.tracker" and "fs.default.name" in the conf has
>> gotten me further.
>>
>>          job.getConfiguration().set("mapred.job.tracker",
>> "server_name_here:8021");
>>         job.getConfiguration().set("fs.default.name",
>> "hdfs://server_name_here:8020");
>>
>> What's interesting now is that the job can't find Accumulo classes - when
>> I run the job now, I get
>>
>> 2013-01-17 12:59:25,278 [main] INFO  mapred.JobClient  - Task Id :
>> attempt_201301171102_0012_m_000000_1, Status : FAILED
>> java.lang.RuntimeException: java.lang.ClassNotFoundException:
>> org.apache.accumulo.core.client.mapreduce.AccumuloRowInputFormat
>>
>> Is there a way to inform the job (via the Job API, on a separate machine
>> not running hadoop) about extra libs to include on the classpath of the job?
>>
>
> You normally inform a job about jars it needs by specifying "-libjars
> comma,separated,jar,list" on the command line.  In this case, you need to
> put those two strings "-libjars" and "jar,list" in the String[] args passed
> to ToolRunner.run:
> ToolRunner.run(CachedConfiguration.getInstance(), new ...(), args)
>
> The accumulo-core jar probably isn't the only one you'll need.
>
> Billie
>
>
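A minimal sketch of the invocation described above; the Tool subclass, jar paths, and jar list are placeholders rather than anything from the thread, and CachedConfiguration.getInstance(), as used above, could be passed in place of new Configuration().

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    // Placeholder Tool; in practice this is the class that configures
    // AccumuloRowInputFormat and submits the job.
    public class MyJob extends Configured implements Tool {
        @Override
        public int run(String[] args) throws Exception {
            // ... build and submit the Job here ...
            return 0;
        }

        public static void main(String[] cliArgs) throws Exception {
            // ToolRunner hands these args to GenericOptionsParser, which strips
            // the -libjars option, ships the listed jars to the cluster, and puts
            // them on the task classpath before run() sees the remaining args.
            String[] args = new String[] {
                "-libjars",
                "/path/to/accumulo-core.jar,/path/to/libthrift.jar,/path/to/zookeeper.jar"
            };
            System.exit(ToolRunner.run(new Configuration(), new MyJob(), args));
        }
    }
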
>>
>> Thanks
>>
>> Mike
>>
>>
>>
>> On Wed, Jan 16, 2013 at 3:11 PM, Billie Rinaldi <[EMAIL PROTECTED]> wrote:
>>
>>> Your job is running in "local" mode (Running job: job_local_0001).  This
>>> basically means that the hadoop configuration is not present on the
>>> classpath of the java client kicking off the job.  If you weren't planning
>>> to have the hadoop config on that machine, you might be able to get away
>>> with setting "mapred.job.tracker" and probably also "fs.default.name"
>>> on the Configuration object.
>>>
>>> Billie
>>>
>>>
>>>
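A rough sketch of that fallback, setting the two properties on the client's Configuration before submitting the job; the host names and ports are placeholders for the actual cluster addresses.

    import org.apache.hadoop.conf.Configuration;

    public class RemoteClusterConf {
        // Point the client at the cluster's JobTracker and HDFS NameNode so the
        // job is submitted remotely instead of falling back to the local runner.
        // Host names and ports below are placeholders.
        static Configuration remote() {
            Configuration conf = new Configuration();
            conf.set("mapred.job.tracker", "jobtracker_host:8021");
            conf.set("fs.default.name", "hdfs://namenode_host:8020");
            return conf;
        }
    }
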
>>> On Wed, Jan 16, 2013 at 12:07 PM, Mike Hugo <[EMAIL PROTECTED]> wrote:
>>>
>>>> Cool, thanks for the feedback John, the examples have been helpful in
>>>> getting up and running!
>>>>
>>>> Perhaps I'm not doing something quite right.  When I jar up my jobs and
>>>> deploy the jar to the server and run it via the tool.sh command on the
>>>> cluster, I see the job running in the jobtracker (servername:50030) and it
>>>> runs as I would expect.
>>>>
>>>> 13/01/16 14:39:53 INFO mapred.JobClient: Running job:
>>>> job_201301161326_0006
>>>> 13/01/16 14:39:54 INFO mapred.JobClient:  map 0% reduce 0%
>>>> 13/01/16 14:41:29 INFO mapred.JobClient:  map 50% reduce 0%
>>>> 13/01/16 14:41:35 INFO mapred.JobClient:  map 100% reduce 0%
>>>> 13/01/16 14:41:40 INFO mapred.JobClient: Job complete:
>>>> job_201301161326_0006
>>>> 13/01/16 14:41:40 INFO mapred.JobClient: Counters: 18
>>>> 13/01/16 14:41:40 INFO mapred.JobClient:   Job Counters
>>>> 13/01/16 14:41:40 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=180309
>>>> 13/01/16 14:41:40 INFO mapred.JobClient:     Total time spent by all
>>>> reduces waiting after reserving slots (ms)=0
>>>> 13/01/16 14:41:40 INFO mapred.JobClient:     Total time spent by all
>>>> maps waiting after reserving slots (ms)=0
>>>> 13/01/16 14:41:40 INFO mapred.JobClient:     Rack-local map tasks=2
>>>> 13/01/16 14:41:40 INFO mapred.JobClient:     Launched map tasks=2
>>>> 13/01/16 14:41:40 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
>>>> 13/01/16 14:41:40 INFO mapred.JobClient:   File Output Format Counters
>>>> 13/01/16 14:41:40 INFO mapred.JobClient:     Bytes Written=0
>>>> 13/01/16 14:41:40 INFO mapred.JobClient:   FileSystemCounters
>>>> 13/01/16 14:41:40 INFO mapred.JobClient:     HDFS_BYTES_READ=248
>>>> 13/01/16 14:41:40 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=60214
>>>> 13/01/16 14:41:40 INFO mapred.JobClient:   File Input Format Counters
>>>> 13/01/16 14:41:40 INFO mapred.JobClient:     Bytes Read=0
>>>> 13/01/16 14:41:40 INFO mapred.JobClient:   Map-Reduce Framework