Hadoop >> mail # user >> how to run jobs every 30 minutes?


Re: how to run jobs every 30 minutes?
That clears the confusion. Thanks.
There are just too many tools for Hadoop :-)

2010/12/14 Alejandro Abdelnur <[EMAIL PROTECTED]>

> Ed,
>
> Actually Oozie is quite different from Cascading.
>
> * Cascading allows you to write 'queries' using a Java API and they get
> translated into MR jobs.
> * Oozie allows you to compose sequences of MR/Pig/Hive/Java/SSH jobs in a DAG
> (workflow jobs) and has timer+data dependency triggers (coordinator jobs).
>
> Regards.
>
> Alejandro
>
> On Tue, Dec 14, 2010 at 1:26 PM, edward choi <[EMAIL PROTECTED]> wrote:
>
> > Thanks for the tip. I took a look at it.
> > Looks similar to Cascading I guess...?
> > Anyway thanks for the info!!
> >
> > Ed
> >
> > 2010/12/8 Alejandro Abdelnur <[EMAIL PROTECTED]>
> >
> > > Or, if you want to do it in a reliable way you could use an Oozie
> > > coordinator job.
> > >
> > > On Wed, Dec 8, 2010 at 1:53 PM, edward choi <[EMAIL PROTECTED]> wrote:
> > > > > My mistake. Come to think of it, you are right, I can just make an
> > > > infinite loop inside the Hadoop application.
> > > > Thanks for the reply.
> > > >
> > > > 2010/12/7 Harsh J <[EMAIL PROTECTED]>
> > > >
> > > >> Hi,
> > > >>
> > > > >> On Tue, Dec 7, 2010 at 2:25 PM, edward choi <[EMAIL PROTECTED]> wrote:
> > > >> > Hi,
> > > >> >
> > > >> > I'm planning to crawl a certain web site every 30 minutes.
> > > >> > How would I get it done in Hadoop?
> > > >> >
> > > > >> > In pure Java, I used Thread.sleep() method, but I guess this won't
> > > > >> > work in Hadoop.
> > > >>
> > > >> Why wouldn't it? You need to manage your post-job logic mostly, but
> > > >> sleep and resubmission should work just fine.
> > > >>
> > > >> > Or if it could work, could anyone show me an example?
> > > >> >
> > > >> > Ed.
> > > >> >
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> Harsh J
> > > >> www.harshj.com
> > > >>
> > > >
> > >
> >
>
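
For anyone landing on this thread later: the "sleep and resubmission" idea discussed above could look roughly like the following in plain Java. This is only a minimal sketch and not code posted by anyone in the thread. The class name PeriodicJobRunner, the schedule() helper and the placeholder job are hypothetical; the real job submission (building the job, submitting it and waiting for it to finish) would go where the placeholder prints a message. It uses the JDK's ScheduledExecutorService rather than a hand-written Thread.sleep() loop.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class PeriodicJobRunner {

    /**
     * Runs the given job immediately and then once per period.
     * Hypothetical helper, not part of Hadoop.
     */
    public static ScheduledFuture<?> schedule(ScheduledExecutorService scheduler,
                                              Runnable job,
                                              long period,
                                              TimeUnit unit) {
        // Catch runtime failures so that one bad run does not stop the schedule.
        Runnable guarded = () -> {
            try {
                job.run();
            } catch (RuntimeException e) {
                System.err.println("Run failed, will retry at next interval: " + e);
            }
        };
        // With a single-threaded scheduler, runs never overlap; a slow run
        // simply delays the start of the next one.
        return scheduler.scheduleAtFixedRate(guarded, 0, period, unit);
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        // Placeholder: replace the body with the actual job submission
        // and any post-job logic.
        Runnable crawlJob = () -> System.out.println("Submitting crawl job...");
        schedule(scheduler, crawlJob, 30, TimeUnit.MINUTES);
        // The scheduler thread is non-daemon, so the JVM keeps running.
    }
}
```

The wrapper catches runtime exceptions because scheduleAtFixedRate cancels all later runs once a single run throws. A loop like this only lives as long as the client JVM does, which is presumably why the Oozie coordinator was suggested above as the more reliable option.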