Hadoop, mail # dev - [DISCUSS] Ensuring Consistent Behavior for Alternative Hadoop FileSystems + Workshop


[DISCUSS] Ensuring Consistent Behavior for Alternative Hadoop FileSystems + Workshop
Stephen Watt 2013-05-23, 23:52
Hi Folks

Hadoop's pluggable filesystem architecture lets you use an alternate filesystem with Hadoop by writing a plugin for it. Several alternate filesystems now have Hadoop FileSystem plugins, and because this topic isn't well understood, I've been working on a page on the project wiki to bring it all together: http://wiki.apache.org/hadoop/HCFS. At the same time, the Ambari project has been opening up Ambari to support any configured Hadoop FileSystem (as opposed to just HDFS) over at https://issues.apache.org/jira/browse/AMBARI-1817
 
My team (over at Red Hat) has been writing a Hadoop FileSystem plugin for the glusterfs filesystem, and we have found that some of the expected semantics of the operations in the abstract FileSystem class are a little ambiguous. With that in mind, we've joined Steve Loughran in attempting to clarify these for both the Hadoop 1.0 and the Hadoop 2.0 FileSystem class over at https://issues.apache.org/jira/browse/HADOOP-9371

It seems to me that once we have these semantics defined, it would be good for consistency of implementation if we could make sure they are well understood and properly implemented by the community of folks writing Hadoop FileSystem plugins. To that end, we might work to ensure that those semantics are tested within an exhaustive test framework that focuses on the abstract Hadoop FileSystem layer. Each FileSystem provider could then run the tests to check that their plugin's implementation and behavior are consistent with the expectations. This could perhaps be a broader extension of https://issues.apache.org/jira/browse/HADOOP-9258.
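
To make the idea concrete, here is a rough sketch of what a shared, provider-independent contract check could look like. Everything in it is an assumption made for illustration: `MiniFs` is a hypothetical, cut-down interface and not Hadoop's FileSystem API, `InMemoryFs` is a toy stand-in for a real plugin, and the expectations encoded in `runContract` are examples of semantics that would first need to be agreed on, not settled definitions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class FsContractSketch {

    /** Hypothetical minimal filesystem interface; NOT Hadoop's FileSystem API. */
    interface MiniFs {
        boolean mkdirs(String path);
        boolean create(String path, byte[] data, boolean overwrite);
        boolean exists(String path);
        boolean delete(String path, boolean recursive);
    }

    /** Toy in-memory implementation standing in for a real plugin. */
    static class InMemoryFs implements MiniFs {
        private final Set<String> dirs = new HashSet<>();
        private final Map<String, byte[]> files = new HashMap<>();

        InMemoryFs() { dirs.add("/"); }

        private static String parent(String path) {
            int i = path.lastIndexOf('/');
            return i <= 0 ? "/" : path.substring(0, i);
        }

        public boolean mkdirs(String path) {
            if (files.containsKey(path)) return false;   // a file is in the way
            if (dirs.contains(path)) return true;        // already exists
            if (!mkdirs(parent(path))) return false;     // create ancestors first
            dirs.add(path);
            return true;
        }

        public boolean create(String path, byte[] data, boolean overwrite) {
            if (dirs.contains(path)) return false;
            if (files.containsKey(path) && !overwrite) return false;
            if (!mkdirs(parent(path))) return false;
            files.put(path, data.clone());
            return true;
        }

        public boolean exists(String path) {
            return dirs.contains(path) || files.containsKey(path);
        }

        public boolean delete(String path, boolean recursive) {
            if (path.equals("/")) return false;          // never remove the root
            if (files.remove(path) != null) return true; // plain file
            if (!dirs.contains(path)) return false;      // nothing to delete
            String prefix = path + "/";
            boolean hasChildren = dirs.stream().anyMatch(d -> d.startsWith(prefix))
                    || files.keySet().stream().anyMatch(f -> f.startsWith(prefix));
            if (hasChildren && !recursive) return false;
            dirs.removeIf(d -> d.equals(path) || d.startsWith(prefix));
            files.keySet().removeIf(f -> f.startsWith(prefix));
            return true;
        }
    }

    private static void check(List<String> failures, boolean ok, String expectation) {
        if (!ok) failures.add(expectation);
    }

    /** Runs the assumed expectations against any implementation; returns the ones that failed. */
    static List<String> runContract(MiniFs fs) {
        List<String> failures = new ArrayList<>();
        byte[] data = {1, 2, 3};
        check(failures, fs.mkdirs("/t/dir"), "mkdirs should create nested directories");
        check(failures, fs.exists("/t"), "mkdirs should create missing parents");
        check(failures, fs.create("/t/dir/f", data, false), "create should succeed for a new file");
        check(failures, !fs.create("/t/dir/f", data, false), "create without overwrite should fail on an existing file");
        check(failures, fs.create("/t/dir/f", data, true), "create with overwrite should replace an existing file");
        check(failures, !fs.delete("/t/dir", false), "non-recursive delete should fail on a non-empty directory");
        check(failures, fs.delete("/t", true), "recursive delete should remove a directory tree");
        check(failures, !fs.exists("/t/dir/f"), "deleted files should no longer exist");
        check(failures, !fs.delete("/t", false), "delete of a missing path should return false");
        return failures;
    }

    public static void main(String[] args) {
        List<String> failures = runContract(new InMemoryFs());
        System.out.println(failures.size() + " contract violation(s): " + failures);
    }
}
```

The point of the shape is that `runContract` only sees the interface, so each provider could pass in their own implementation and get back the list of expectations it does not meet.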

If folks are interested in these goals, I could host a workshop/discussion/hackday in Mountain View to get local people together (perhaps with a Google Hangout for the remote folks) to keep the ball rolling on the semantics discussion and test creation. As a side note, I think this could also turn out to be quite an effective means of introducing FileSystem vendors to the ASF and getting them contributing to these aspects of the project.

Regards
Steve Watt