MapReduce >> mail # user >> Oozie apparent concurrency deadlocking


+
Kartashov, Andy 2012-11-15, 14:44
+
Matt Goeke 2012-11-15, 15:31
+
Kartashov, Andy 2012-11-15, 15:57
Re: Oozie apparent concurrency deadlocking
On Thu, Nov 15, 2012 at 9:57 AM, Kartashov, Andy <[EMAIL PROTECTED]> wrote:

>  Matt,
>
>
>
> Thank you for the prompt response.
>
>
>
> To answer your question “Are you using the fairscheduler or default
> FIFO?” I frankly have no idea. I suppose that since I have no idea, it
> must be the default?  How can I find out??
>

If you are not explicitly specifying a different scheduler, then yes, you are
using the default FIFO scheduler. In that case you will have to separate the
launcher jobs from the MR actions by queue rather than by pool. Take a look at
http://downright-amazed.blogspot.com/2012/02/configure-oozies-launcher-job.html
for specifics.
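
For reference, a minimal sketch of what an explicit scheduler setting looks
like in mapred-site.xml. This assumes a Hadoop 1.x (MRv1) cluster and is not
taken from the cluster discussed here. If the property is absent, the
JobTracker falls back to the FIFO JobQueueTaskScheduler:

```xml
<!-- mapred-site.xml on the JobTracker node (illustrative, assumes Hadoop 1.x / MRv1) -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <!-- Default FIFO scheduler; a fair-scheduler setup would name
       org.apache.hadoop.mapred.FairScheduler here instead -->
  <value>org.apache.hadoop.mapred.JobQueueTaskScheduler</value>
</property>
```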

>
>
> Do you set  ${LAUNCHER_POOL} parameter inside job.properties, similar to
> ${JT} & ${NN}? What do you set it to?
>

You will have to either set it statically in the workflow, feed it in
through the job.properties or specify it in the coordinator (if you are
using one). Remember that you can have multiple layers of inheritance but
in the end the workflow.xml needs to be able to resolve it from one of the
layers.
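
A rough sketch of that resolution chain, assuming the queue-based variant for
the FIFO case: the launcher queue is set inside the action's <configuration>
block with the oozie.launcher. prefix, and ${LAUNCHER_POOL} is then supplied
by job.properties (for example a line such as LAUNCHER_POOL=launchers, where
"launchers" is a made-up queue name) or by the coordinator. This is not
necessarily the property the original workflow used.

```xml
<!-- Goes inside the action element (e.g. <sqoop>) of workflow.xml -->
<configuration>
  <property>
    <!-- The oozie.launcher. prefix applies the setting to the launcher job only -->
    <name>oozie.launcher.mapred.job.queue.name</name>
    <!-- Resolved from job.properties or the coordinator; a static value also works -->
    <value>${LAUNCHER_POOL}</value>
  </property>
</configuration>
```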

>
>
> I have one more question I cannot find an answer to. It is about the
> "uri:oozie:sqoop-action:0.2" structure. I found no source about this. I
> take it 0.2 is a schema version? I have seen examples with 0.1, 0.2 and
> 0.3. I have seen examples where a simple <sqoop>..</sqoop> is used without
> "uri:…". Could you explain this part as well, or point me in the right
> direction where I can learn how to sensibly use "uri:…" in <workflow-app
> xmlns="uri:…" <map-reduce xmlns="uri:.." and <sqoop xmlns="uri:..", etc?
>

This all depends on which schema URIs are included in the Oozie version that
you downloaded. I haven't dug too deep into the differences between the
versions for sqoop, but there could be minor differences in the action API
depending on which URI you specify. I wouldn't worry too hard about it unless
you run into errors when actually trying to run the action. If you take a
look at the numerous examples of how to add your own custom actions to Oozie,
you will get a much better grasp of how everything is registered.
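
A minimal sketch of where the namespaces go, assuming an Oozie install that
ships the workflow 0.2 and sqoop-action 0.2 schemas. The workflow name, node
names, JDBC_URL, TABLE, and the sqoop command are hypothetical placeholders.
The version suffix in each URI names the schema that element is checked
against, so it has to be one your Oozie version knows about.

```xml
<!-- Illustrative skeleton; names and the sqoop command are placeholders -->
<workflow-app name="sqoop-wf" xmlns="uri:oozie:workflow:0.2">
  <start to="sqoop-node"/>
  <action name="sqoop-node">
    <!-- The action element carries its own namespace and schema version -->
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
      <job-tracker>${JOB_TRACKER}</job-tracker>
      <name-node>${NAME_NODE}</name-node>
      <command>import --connect ${JDBC_URL} --table ${TABLE} --target-dir /tmp/blah</command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Sqoop action failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```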

>
>
> p.s. My sqoop input totals around 800 MB of data coming from 9 tables; at
> the 64 MB default split size I end up with about what, 13 mappers total? I
> run this test on two EC2 medium-type instances, with one node running as
> NN,JT,DN,TT and another just DN,TT. With 2 cores per node I have two M|R
> slots each?
>

So my initial guess based on this is you are hitting the issue where all of
the available slots are being held by launcher actions. If you start to
scale out more and still run into this then you will want to start being
mindful of any setting that could limit your max concurrent jobs for your
user.
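
A back-of-the-envelope version of that guess, assuming 2 map slots per
TaskTracker (the Hadoop 1.x default) on the 2 nodes described, and writing L
for the number of Oozie actions running at the same time (L is not stated in
this thread):

```latex
S_{\text{map}} = 2 \text{ nodes} \times 2 \text{ slots/node} = 4,
\qquad
S_{\text{free}} = S_{\text{map}} - L
```

Each running action holds one map slot for its launcher, so under these
assumptions once L reaches 4 there is no slot left for the jobs the launchers
are waiting on.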

>
>
> Rgds,
>
> AK-47
>
>
>
> *From:* [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] *On Behalf Of *Matt
> Goeke
> *Sent:* Thursday, November 15, 2012 10:31 AM
> *To:* [EMAIL PROTECTED]
> *Subject:* Re: Oozie apparent concurrency deadlocking
>
>
>
> Andy,
>
>
>
> Are you using the fairscheduler or default FIFO? This problem can be
> partially alleviated by routing the MR actions and the Launcher jobs to
> separate queues/pools. The reason for this is that if they are both
> competing for the same resources you can run into a situation where all of
> the available slots are taken up by the launcher actions, resulting in a
> permanent deadlock. I am guessing based on the numbers you threw out
> there that your overall slot capacity is small (less than 10 mappers
> total?) but if this isn't the case then something else is probably going on
> as well. The way to specify it if you are looking to do it in a sqoop node
> is below:
>
>
>
> <action name="sqoop-node">
>
>         <sqoop xmlns="uri:oozie:sqoop-action:0.2">
>
>             <job-tracker>${JOB_TRACKER}</job-tracker>
>
>             <name-node>${NAME_NODE}</name-node>
>
>             <prepare>
>
>                 <delete path="${NAME_NODE}/tmp/blah"/>
>
>             </prepare>
>
>             <configuration>
>
>                 <property>
+
Kartashov, Andy 2012-11-15, 15:29