Re: Where do/should .jar files live?
bejoy.hadoop@... 2013-01-23, 02:54
Hi Chris
In larger clusters it is better to have an edge/client node where all the user jars reside, and to trigger your MR jobs from there. A client/edge node is a server with the Hadoop jars and conf installed but hosting no daemons.

In smaller clusters one DN might act as the client node, and you can execute your jars from there. The risk here is that this DN fills up if files are copied to HDFS from it: under the default block placement policy, the first replica of each block is written to the local DataNode, so one replica would always be on this node.

In Oozie you put your executables into HDFS, but Oozie usually comes in at the integration stage. In the initial development phase, developers put the jar on the local file system (LFS) of the client node, then execute and test their code from there.

Regards
Bejoy KS

Sent from remote device, please excuse typos

-----Original Message-----
From: Chris Embree <[EMAIL PROTECTED]>
Date: Tue, 22 Jan 2013 14:24:40
To: <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Subject: Where do/should .jar files live?

Hi List,

This should be a simple question, I think. Disclosure: I am not a Java developer. ;)

We're getting ready to build our Dev and Prod clusters. I'm pretty comfortable with HDFS and how it sits atop several local file systems on multiple servers. I'm fairly comfortable with the concept of Map/Reduce and why it's cool and we want it.

Now for the question. Where should my developers put and store their jar files? Or asked another way, what's the best entry point for submitting jobs?

We have separate physical systems for NN, Checkpoint Node (formerly 2NN), Job Tracker and Standby NN. Should I run from the JT node? Do I keep all of my finished .jars on the JT local file system? Or should I expect that jobs will be run via Oozie? Do I put jars on the local Oozie FS?

Thanks in advance.
Chris
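
As a rough illustration of the block placement risk mentioned in the reply above, here is a small Python simulation. It is only a sketch under simplifying assumptions: rack awareness and free-space checks are ignored, the node names (dn1..dn6, edge) are made up, and the functions place_replicas and simulate are hypothetical helpers, not Hadoop APIs. The only behaviour modelled is that the first replica goes to the writer's node when the writer is itself a DataNode, and to a random DataNode otherwise.

```python
import random
from collections import Counter


def place_replicas(datanodes, writer, replication, rng):
    """Pick the nodes for one block (simplified, rack-unaware model).

    Assumption: the first replica is local when the writer is a DataNode,
    otherwise a random DataNode is chosen. The remaining replicas go to
    distinct other DataNodes.
    """
    first = writer if writer in datanodes else rng.choice(datanodes)
    others = [dn for dn in datanodes if dn != first]
    return [first] + rng.sample(others, replication - 1)


def simulate(datanodes, writer, blocks, replication=3, seed=0):
    """Count how many block replicas each DataNode ends up holding."""
    rng = random.Random(seed)
    usage = Counter({dn: 0 for dn in datanodes})
    for _ in range(blocks):
        for dn in place_replicas(datanodes, writer, replication, rng):
            usage[dn] += 1
    return usage


if __name__ == "__main__":
    nodes = [f"dn{i}" for i in range(1, 7)]  # hypothetical 6-node cluster
    print("upload from dn1: ", dict(simulate(nodes, "dn1", 1000)))
    print("upload from edge:", dict(simulate(nodes, "edge", 1000)))
```

In this model, uploading from dn1 leaves a replica of every block on dn1, while uploading from a separate edge node spreads replicas roughly evenly. That is the reason for preferring a dedicated client node once data volumes grow.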