MapReduce >> mail # user >> Can Hadoop replace the use of MQ b/w processes?


Robert Nicholson 2012-08-19, 16:46
Michael Segel 2012-08-19, 21:49
Lance Norskog 2012-08-20, 00:43
Russell Jurney 2012-08-20, 01:31
Lance Norskog 2012-08-20, 01:57

Re: Can Hadoop replace the use of MQ b/w processes?
There is another much more active fork of Azkaban.  See

https://github.com/rbpark/azkaban

On Sun, Aug 19, 2012 at 6:57 PM, Lance Norskog <[EMAIL PROTECTED]> wrote:

> Cool. I'm on the sidelines of a project trying to use Oozie in a large
> Hadoop-ecology app. Oozie is the one thing marked 'to be replaced'.
>
> On Sun, Aug 19, 2012 at 6:31 PM, Russell Jurney
> <[EMAIL PROTECTED]> wrote:
> > Glad to hear about Hamake. FWIW, I've had good success with Azkaban in
> > the past for very complex, lengthy Hadoop/Pig/Streaming pipelines. It
> > even has a DAG GUI.
> >
> >
> > On Sun, Aug 19, 2012 at 5:43 PM, Lance Norskog <[EMAIL PROTECTED]> wrote:
> >>
> >> Last checkin on Azkaban was 11 months ago:
> >>
> >>
> >> https://github.com/azkaban/azkaban/commit/b105570625bcb2002de1acf4012c8d0e4388470a
> >>
> >> But, the last checkin for Hamake was June 2010. And it's still a cool
> >> little Hadoop/Pig scheduler.
> >> http://hamake.googlecode.com/
> >>
> >> On Sun, Aug 19, 2012 at 2:49 PM, Michael Segel
> >> <[EMAIL PROTECTED]> wrote:
> >> > There has been some work to replace the use of queues with HBase.
> >> > This would be used to feed processes off the queue to help balance out
> >> > the load on the cluster.
> >> >
> >> > In one specific use case, this was effective because the time spent
> >> > processing each mapper.map() iteration is a couple of orders of
> >> > magnitude greater than the time it takes to pull the data from the
> >> > 'queue' and to each node for processing.
> >> >
> >> > Again, YMMV, it is an interesting hack though....
> >> >
> >> > On Aug 19, 2012, at 11:46 AM, Robert Nicholson
> >> > <[EMAIL PROTECTED]> wrote:
> >> >
> >> >> We have an application, or a series of applications, that listen to
> >> >> incoming feeds and then distribute this data in XML form to a number
> >> >> of queues. Another set of processes listen to these queues and process
> >> >> the messages. Order of processing is important insofar as related
> >> >> messages need to be processed in sequence; hence, today all related
> >> >> messages go to the same queue and are processed by the same queue
> >> >> consumer.
> >> >>
> >> >> The idea would be to replace the use of MQ with some kind of reliable
> >> >> distributed dispatch. Does Hadoop provide that?
> >> >>
> >> >>
> >> >>
> >> >>
> >> >
> >>
> >>
> >>
> >> --
> >> Lance Norskog
> >> [EMAIL PROTECTED]
> >
> >
> >
> >
> > --
> > Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
>
>
>
> --
> Lance Norskog
> [EMAIL PROTECTED]
>
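
For readers following the original question quoted above, the ordering constraint (all related messages go to the same queue and are handled by the same consumer) can be sketched as key-based routing. This is only an illustrative sketch, not anything Hadoop, HBase, or an MQ product provides: the names NUM_QUEUES, queue_for, dispatch, and consume are hypothetical, and the "correlation key" that marks messages as related is an assumption about the feed data.

```python
import zlib
from collections import defaultdict, deque

NUM_QUEUES = 4  # hypothetical number of queues (one consumer per queue)


def queue_for(key: str, num_queues: int = NUM_QUEUES) -> int:
    """Map a correlation key to a queue index using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % num_queues


def dispatch(messages, num_queues: int = NUM_QUEUES):
    """Append each (key, payload) pair to the queue selected by its key."""
    queues = defaultdict(deque)
    for key, payload in messages:
        queues[queue_for(key, num_queues)].append((key, payload))
    return queues


def consume(queue):
    """Drain one queue in FIFO order, collecting payloads per key."""
    seen = defaultdict(list)
    while queue:
        key, payload = queue.popleft()
        seen[key].append(payload)
    return seen


if __name__ == "__main__":
    feed = [("a", 1), ("b", 1), ("a", 2), ("c", 1), ("b", 2), ("a", 3)]
    for index, queue in sorted(dispatch(feed).items()):
        print(index, dict(consume(queue)))
```

Because the queue index depends only on the key, messages that share a key stay in arrival order relative to each other, while unrelated keys can be processed in parallel by different consumers. Reliability (persistence, redelivery, consumer failover) is deliberately left out of the sketch.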
Russell Jurney 2012-08-19, 17:27
Karthik Kambatla 2012-08-19, 17:32