MapReduce, mail # user - Can Hadoop replace the use of MQ b/w processes?


Re: Can Hadoop replace the use of MQ b/w processes?
Ted Dunning 2012-08-20, 02:13
There is another much more active fork of Azkaban.  See

https://github.com/rbpark/azkaban

On Sun, Aug 19, 2012 at 6:57 PM, Lance Norskog <[EMAIL PROTECTED]> wrote:

> Cool. I'm on the sidelines of a project trying to use Oozie in a large
> Hadoop-ecology app. Oozie is the one thing marked 'to be replaced'.
>
> On Sun, Aug 19, 2012 at 6:31 PM, Russell Jurney
> <[EMAIL PROTECTED]> wrote:
> > Glad to hear about Hamake. FWIW, I've had good success with Azkaban
> > in the past for very complex, lengthy Hadoop/Pig/Streaming
> > pipelines. It even has a DAG GUI.
> >
> >
> > On Sun, Aug 19, 2012 at 5:43 PM, Lance Norskog <[EMAIL PROTECTED]> wrote:
> >>
> >> Last checkin on Azkaban was 11 months ago:
> >>
> >> https://github.com/azkaban/azkaban/commit/b105570625bcb2002de1acf4012c8d0e4388470a
> >>
> >> But, the last checkin for Hamake was June 2010. And it's still a cool
> >> little Hadoop/Pig scheduler.
> >> http://hamake.googlecode.com/
> >>
> >> On Sun, Aug 19, 2012 at 2:49 PM, Michael Segel
> >> <[EMAIL PROTECTED]> wrote:
> >> > There has been some work to replace the use of queues with HBase.
> >> > This would be used to feed processes off the queue to help balance out
> >> > the load on the cluster.
> >> >
> >> > In one specific use case, this was effective because the time spent
> >> > processing each mapper.map() iteration is a couple of orders of
> >> > magnitude greater than the time it takes to pull the data from the
> >> > 'queue' and deliver it to each node for processing.
> >> >
> >> > Again, YMMV, it is an interesting hack though....
> >> >
> >> > On Aug 19, 2012, at 11:46 AM, Robert Nicholson
> >> > <[EMAIL PROTECTED]> wrote:
> >> >
> >> >> We have an application, or rather a series of applications, that
> >> >> listen to incoming feeds and then distribute this data in XML form
> >> >> to a number of queues. Another set of processes listens to these
> >> >> queues and processes the messages. Order of processing is important
> >> >> insofar as related messages need to be processed in sequence; hence,
> >> >> today all related messages go to the same queue and are processed by
> >> >> the same queue consumer.
> >> >>
> >> >> The idea would be to replace the use of MQ with some kind of
> >> >> reliable distributed dispatch. Does Hadoop provide that?
> >> >>
> >> >>
> >> >>
> >> >>
> >> >
> >>
> >>
> >>
> >> --
> >> Lance Norskog
> >> [EMAIL PROTECTED]
> >
> >
> >
> >
> > --
> > Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
>
>
>
> --
> Lance Norskog
> [EMAIL PROTECTED]
>
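
For anyone skimming the thread: the ordering requirement in the original question (related messages always go to the same queue and the same consumer) can be sketched as key-based routing. This is only an illustration, not anything the posters described or any Hadoop, HBase, or MQ API. The names `queue_for` and `dispatch`, the `NUM_QUEUES` value, and the "order-N" correlation keys are all hypothetical, and in-memory lists stand in for real queues.

```python
import zlib
from collections import defaultdict

NUM_QUEUES = 4  # hypothetical number of queues/consumers


def queue_for(key: str, num_queues: int = NUM_QUEUES) -> int:
    """Map a correlation key to a queue index using a stable hash.

    crc32 is used instead of Python's built-in hash() because hash()
    of a str is randomized per process, so producers running in
    different processes would disagree about the target queue.
    """
    return zlib.crc32(key.encode("utf-8")) % num_queues


def dispatch(messages, num_queues: int = NUM_QUEUES):
    """Append each (key, payload) pair to the queue chosen by its key.

    Messages that share a key land in the same list in arrival order,
    which mirrors "related messages go to the same queue".
    """
    queues = defaultdict(list)
    for key, payload in messages:
        queues[queue_for(key, num_queues)].append((key, payload))
    return queues


# Hypothetical feed: three related messages for "order-1", one for "order-2".
feed = [
    ("order-1", "new"),
    ("order-2", "new"),
    ("order-1", "amend"),
    ("order-1", "cancel"),
]
queues = dispatch(feed)
```

Since each queue has exactly one consumer, per-key order is preserved without any global ordering across queues. The trade-off is that a single very busy key cannot be spread over several consumers.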