Hadoop >> mail # user >> Running Back to Back Map-reduce jobs


Re: Running Back to Back Map-reduce jobs
You can use ControlledJob's addDependingJob to handle dependency between
multiple jobs.
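A minimal sketch with the new API (class and job names here are illustrative, and both jobs are assumed to be fully configured with mappers, reducers, and input/output paths elsewhere):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob;
import org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl;

public class ChainDriver {
  public static void main(String[] args) throws Exception {
    // Placeholder jobs: configure mapper, reducer, and paths as usual.
    Job job1 = new Job(new Configuration(), "first-pass");
    Job job2 = new Job(new Configuration(), "second-pass");

    ControlledJob cjob1 = new ControlledJob(job1.getConfiguration());
    ControlledJob cjob2 = new ControlledJob(job2.getConfiguration());
    cjob2.addDependingJob(cjob1); // job2 starts only after job1 succeeds

    JobControl control = new JobControl("back-to-back");
    control.addJob(cjob1);
    control.addJob(cjob2);

    // JobControl implements Runnable; drive it from a thread and poll.
    Thread runner = new Thread(control);
    runner.start();
    while (!control.allFinished()) {
      Thread.sleep(1000);
    }
    control.stop();
  }
}
```

JobControl tracks the dependency graph and only submits a job once everything it depends on has completed successfully.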

On Tue, Jun 7, 2011 at 4:15 PM, Adarsh Sharma <[EMAIL PROTECTED]> wrote:

> Harsh J wrote:
>
>> Yes, I believe Oozie does have Pipes and Streaming action helpers as well.
>>
>> On Thu, Jun 2, 2011 at 5:05 PM, Adarsh Sharma <[EMAIL PROTECTED]>
>> wrote:
>>
>>
>>> OK, is this valid for jobs run through Hadoop Pipes too?
>>>
>>> Thanks
>>>
>>> Harsh J wrote:
>>>
>>>
>>>> Oozie's workflow feature may be exactly what you're looking for. It
>>>> can also do much more than just chain jobs.
>>>>
>>>> Check out additional features at: http://yahoo.github.com/oozie/
>>>>
>>>> On Thu, Jun 2, 2011 at 4:48 PM, Adarsh Sharma <[EMAIL PROTECTED]>
>>>> wrote:
>>>>
> After following the points above, I am confused about the examples used
> in the documentation:
>
> http://yahoo.github.com/oozie/releases/3.0.0/WorkflowFunctionalSpec.html#a3.2.2.3_Pipes
>
> What I want to achieve is control over when a job terminates, i.e. after
> one map-reduce job completes I want to run another, which then terminates
> only after my code has executed.
> I struggled to find a simple example that demonstrates this. In the Oozie
> documentation, they are just setting parameters and using them.
>
> For example, a simple Hadoop Pipes job is executed by:
>
> int main(int argc, char *argv[]) {
>   return HadoopPipes::runTask(
>       HadoopPipes::TemplateFactory<WordCountMap, WordCountReduce>());
> }
>
> Now, if I want to run another job after this on the reduced data in HDFS,
> how could this be done? Do I need to add some code?
>
> Thanks
>
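On the Oozie side, per the 3.0.0 functional spec linked above, a Pipes job is declared inside a map-reduce action. A rough fragment (program path, directories, and node names are illustrative); chaining is then just the ok-transition pointing at the next action:

```xml
<action name="pipes-pass1">
    <map-reduce>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <pipes>
            <program>bin/wordcount</program>
        </pipes>
        <configuration>
            <property>
                <name>mapred.input.dir</name>
                <value>${inputDir}</value>
            </property>
            <property>
                <name>mapred.output.dir</name>
                <value>${pass1OutputDir}</value>
            </property>
        </configuration>
    </map-reduce>
    <!-- on success, Oozie moves on to the second job automatically -->
    <ok to="pipes-pass2"/>
    <error to="fail"/>
</action>
```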
>>>>> Dear all,
>>>>>
>>>>> I ran several map-reduce jobs in Hadoop Cluster of 4 nodes.
>>>>>
>>>>> Now I want one map-reduce job to run right after another finishes.
>>>>>
>>>>> For example, to clarify my point, suppose a wordcount job is run on the
>>>>> gutenberg file in HDFS, and after completion:
>>>>>
>>>>> 11/06/02 15:14:35 WARN mapred.JobClient: No job jar file set.  User
>>>>> classes
>>>>> may not be found. See JobConf(Class) or JobConf#setJar(String).
>>>>> 11/06/02 15:14:35 INFO mapred.FileInputFormat: Total input paths to
>>>>> process
>>>>> : 3
>>>>> 11/06/02 15:14:36 INFO mapred.JobClient: Running job:
>>>>> job_201106021143_0030
>>>>> 11/06/02 15:14:37 INFO mapred.JobClient:  map 0% reduce 0%
>>>>> 11/06/02 15:14:50 INFO mapred.JobClient:  map 33% reduce 0%
>>>>> 11/06/02 15:14:59 INFO mapred.JobClient:  map 66% reduce 11%
>>>>> 11/06/02 15:15:08 INFO mapred.JobClient:  map 100% reduce 22%
>>>>> 11/06/02 15:15:17 INFO mapred.JobClient:  map 100% reduce 100%
>>>>> 11/06/02 15:15:25 INFO mapred.JobClient: Job complete:
>>>>> job_201106021143_0030
>>>>> 11/06/02 15:15:25 INFO mapred.JobClient: Counters: 18
>>>>>
>>>>> Then another map-reduce job is started on the output (or on the original
>>>>> data again), say:
>>>>>
>>>>> 11/06/02 15:14:36 INFO mapred.JobClient: Running job:
>>>>> job_201106021143_0030
>>>>> 11/06/02 15:14:37 INFO mapred.JobClient:  map 0% reduce 0%
>>>>> 11/06/02 15:14:50 INFO mapred.JobClient:  map 33% reduce 0%
>>>>>
>>>>> Is it possible, or are there any parameters to achieve this?
>>>>>
>>>>> Please guide.
>>>>>
>>>>> Thanks
>>>>>
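If you'd rather not bring in Oozie, another option for Pipes is a small Java driver that submits the jobs one after the other via org.apache.hadoop.mapred.pipes.Submitter, which blocks until each job finishes. A sketch, with placeholder paths and executable names:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;
import org.apache.hadoop.mapred.pipes.Submitter;

public class PipesChain {
  public static void main(String[] args) throws Exception {
    JobConf first = new JobConf();
    first.setJobName("wordcount-pass1");
    Submitter.setExecutable(first, "bin/wordcount"); // C++ binary on HDFS
    FileInputFormat.setInputPaths(first, new Path("input"));
    FileOutputFormat.setOutputPath(first, new Path("pass1-out"));

    // Submitter.runJob blocks until the job completes, like JobClient.runJob.
    RunningJob r1 = Submitter.runJob(first);
    if (!r1.isSuccessful()) {
      System.err.println("First job failed; not starting the second.");
      return;
    }

    // The second job reads the reduced output of the first from HDFS.
    JobConf second = new JobConf();
    second.setJobName("wordcount-pass2");
    Submitter.setExecutable(second, "bin/secondpass");
    FileInputFormat.setInputPaths(second, new Path("pass1-out"));
    FileOutputFormat.setOutputPath(second, new Path("pass2-out"));
    Submitter.runJob(second);
  }
}
```

This is essentially what each `hadoop pipes` command-line invocation does; chaining two such invocations in a shell script achieves the same effect.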