HDFS >> mail # dev >> Re: [PROPOSAL] Hadoop OSGi compliant and Apache Karaf features


Re: [PROPOSAL] Hadoop OSGi compliant and Apache Karaf features
On 08/02/12 14:25, Jean-Baptiste Onofré wrote:
> Hi folks,
>
> I'm working right now to turn Hadoop into an OSGi-compliant set of modules.
>
> I've more or less achieved the first step:
> - turn all Hadoop modules (common, annotations, hdfs, mapreduce, etc.) into
> OSGi bundles
> - provide a Karaf features descriptor to easily deploy them into the Apache
> Karaf OSGi container
>
> I will upload the patches on the different Jira related to that.
>
> The second step that I propose is to introduce Blueprint descriptors in
> order to expose some Hadoop features as OSGi services.
> It won't affect the "non-OSGi" users but will give a lot of fun and interest
> for OSGi users ;)
>

ZooKeeper would be nice too, as you could bring up a very small cluster.

As I mentioned in one of the JIRA comments:

- There are a lot of calls to System.exit() in Hadoop when it isn't
happy. You need a security manager to catch them and turn them into
exceptions. And no, the code doesn't expect exceptions everywhere.

- There are a lot of assumptions that every service (namenode, datanode,
etc.) is running in its own VM, with its own singletons. They will all
need their own classloaders, which implies separate OSGi bundles for
each public service.
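
The first point above can be sketched roughly as follows. This is only an illustration, not code from Hadoop or Karaf: the names ExitTrapSketch, NoExitSecurityManager and ExitTrappedException are made up for the example. It also assumes a JVM that still lets you install a security manager; the API is deprecated in recent JDKs, and newer ones may refuse the setSecurityManager() call at runtime, which the sketch handles by giving up.

```java
import java.security.Permission;

// Sketch only: all class names here are hypothetical, not Hadoop or Karaf APIs.
@SuppressWarnings("removal")
final class ExitTrapSketch {

    /** Thrown in place of a real JVM exit, carrying the requested status. */
    static class ExitTrappedException extends SecurityException {
        private final int status;

        ExitTrappedException(int status) {
            super("System.exit(" + status + ") intercepted");
            this.status = status;
        }

        int getStatus() {
            return status;
        }
    }

    /** Vetoes exits and permits every other checked operation. */
    static class NoExitSecurityManager extends SecurityManager {
        @Override
        public void checkExit(int status) {
            throw new ExitTrappedException(status);
        }

        @Override
        public void checkPermission(Permission perm) {
            // Deliberately empty: this sketch only cares about exits.
        }

        @Override
        public void checkPermission(Permission perm, Object context) {
            // Deliberately empty, as above.
        }
    }

    public static void main(String[] args) {
        try {
            System.setSecurityManager(new NoExitSecurityManager());
        } catch (UnsupportedOperationException e) {
            // Newer JDKs may not allow a security manager to be installed.
            System.out.println("Cannot install a security manager on this JVM");
            return;
        }
        try {
            System.exit(42); // stands in for a service that "isn't happy"
        } catch (ExitTrappedException e) {
            System.out.println("Trapped exit with status " + e.getStatus());
        }
    }
}
```

Trapping the exit is the easy half. As noted above, the calling code was written on the assumption that System.exit() never returns, so whatever catches the exception still has to decide how to shut the affected service down cleanly.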

YARN is even more interesting, as it works by deploying the application
master (such as the MR engine) on request, picking a suitable node and
executing the entry point with a classpath (somehow) set up. If you are
going to work with trunk you will need to address this. The simplest
tactic is "don't try to run YARN-based services under OSGi, just the
YARN Resource Manager and Node Managers themselves".

A more advanced option, "support OSGi-based YARN services specially",
would also be good if it could start both Application Masters and their
container applications (Task Trackers &c), and aid the
execution of things like actual tasks within the OSGi container (for
speed).
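
To make the contrast concrete, here is a rough sketch of the "execute an entry point with a classpath" model described above, using only the plain JDK. It is not YARN's launch code: EntryPointLauncherSketch, buildCommand, com.example.AppMaster and the jar names are all invented for the example. The point is that the child JVM gets a flat classpath string and nothing else, so no OSGi bundle wiring survives the launch.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: illustrates launching an entry point with a flat classpath.
final class EntryPointLauncherSketch {

    /** Builds a "java -cp <classpath> <mainClass> <args...>" command line. */
    static List<String> buildCommand(String mainClass,
                                     List<String> classpath,
                                     List<String> args) {
        List<String> cmd = new ArrayList<>();
        cmd.add(System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java");
        cmd.add("-cp");
        // The whole classpath collapses into one platform-specific string.
        cmd.add(String.join(File.pathSeparator, classpath));
        cmd.add(mainClass);
        cmd.addAll(args);
        return cmd;
    }

    public static void main(String[] args) {
        // Hypothetical entry point and jars, for illustration only.
        List<String> cmd = buildCommand(
                "com.example.AppMaster",
                Arrays.asList("app.jar", "lib/dep.jar"),
                Arrays.asList("--attempt", "1"));
        System.out.println(String.join(" ", cmd));
        // A real launcher would hand the list to a process API, e.g.
        // new ProcessBuilder(cmd).inheritIO().start();
    }
}
```

An OSGi-aware variant would need something on the target node that resolves bundles before calling the entry point, which is the "special support" the more advanced option implies.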

If you are looking at production use of this stuff, you'll need to worry
about loading the native libraries too. Otherwise this is
restricted to experimental, small-machine setups.
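
As a rough illustration of the native-library concern, the sketch below tries to load a library and reports failure instead of crashing. The class and method names (NativeLoaderSketch, tryLoad) are made up for the example, and it assumes the library is called "hadoop", i.e. libhadoop. Under OSGi the bundle would also have to declare the library (for example through a Bundle-NativeCode manifest header) before the lookup could succeed; that part is not shown here.

```java
// Sketch only: guarded native-library loading with a pure-Java fallback.
final class NativeLoaderSketch {

    /** Returns true if the named native library could be loaded. */
    static boolean tryLoad(String libName) {
        try {
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // Library is missing or not visible to this classloader.
            return false;
        }
    }

    public static void main(String[] args) {
        boolean loaded = tryLoad("hadoop"); // assumed library name
        System.out.println(loaded
                ? "native code loaded"
                : "native code unavailable, using pure-Java code paths");
    }
}
```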