MapReduce, mail # dev - [Vote] Merge branch-trunk-win to trunk

Re: [Vote] Merge branch-trunk-win to trunk
Chris Nauroth 2013-02-28, 16:45
I'd like to share a few anecdotes about cross-platform development,
hopefully to address some of the concerns about adding overhead to the
development process.  By reviewing past cases of cross-platform Linux vs.
Windows bugs, we can get a sense for how the development process could look
in the future.

HADOOP-9131: TestLocalFileSystem#testListStatusWithColons cannot run on
Windows.  As part of an earlier jira, HADOOP-8962, there was a new test
committed on trunk covering the case of a local file system interaction on
a file containing a ':'.  On Windows, ':' in a path has special meaning as
part of the drive specifier (e.g. C:), so this test cannot pass when
running on Windows.  In this kind of case, the cross-platform bug is
obvious, and the fix is obvious (assumeTrue(!Shell.WINDOWS)).  Ideally,
this would get fixed pre-commit after seeing a -1 from the Windows Jenkins
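
To make the HADOOP-9131 style of fix concrete, here is a minimal sketch
in plain Java, without JUnit or Hadoop on the classpath.  The class
ColonPathGuard and its WINDOWS flag are hypothetical stand-ins: the flag
plays the role of Shell.WINDOWS, and returning "SKIPPED" plays the role
of assumeTrue(!Shell.WINDOWS) marking the test as skipped.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical stand-in for a JUnit test guarded by assumeTrue(!Shell.WINDOWS).
final class ColonPathGuard {
    // Assumption: an os.name check playing the same role as Shell.WINDOWS.
    static final boolean WINDOWS =
        System.getProperty("os.name").startsWith("Windows");

    /** Returns "SKIPPED" on Windows, else "PASSED" or "FAILED". */
    static String runColonTest() {
        if (WINDOWS) {
            // ':' is reserved for drive specifiers, so the test cannot pass here.
            return "SKIPPED";
        }
        try {
            Path dir = Files.createTempDirectory("colon-test");
            Path file = dir.resolve("he:llo");
            try {
                Files.createFile(file);
                // List the directory and look for the name containing ':'.
                try (Stream<Path> entries = Files.list(dir)) {
                    boolean listed = entries.anyMatch(
                        p -> p.getFileName().toString().equals("he:llo"));
                    return listed ? "PASSED" : "FAILED";
                }
            } finally {
                Files.deleteIfExists(file);
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runColonTest());
    }
}
```

In a real JUnit test the guard would be the single assumeTrue line at
the top of the test method, and the rest of the test would be unchanged.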

HDFS-4274: BlockPoolSliceScanner does not close verification log during
shutdown.  This caused problems for MiniDFSCluster-based tests running on
Windows.  Failure to close the verification log meant that we didn't
release file locks, so the tests couldn't delete/recreate working
directories during teardown/setup.  Arguably, this was always a bug, and
running on Windows just exposed it because of its stricter rules about file
locking.  This is a more complex fix, but it doesn't require
platform-specific knowledge.  If some future patch accidentally regresses
this, then we'll likely see +1 from Linux Jenkins and -1 from Windows
Jenkins.  Ideally, it would get fixed pre-commit, because it doesn't
require Windows-specific knowledge.  There is also the matter of impact.
Re-breaking this would re-break many test suites on Windows.
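
The general shape of the HDFS-4274 problem can be sketched as follows.
This is not the actual BlockPoolSliceScanner code: ScannerSketch, its
log file name, and runAndTearDown are hypothetical names for
illustration only.  The point is that the component owning the log
closes it on shutdown, so teardown can delete the working directory
even on a platform with strict file locking.

```java
import java.io.Closeable;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical scanner that owns a log file; not the real Hadoop class.
final class ScannerSketch implements Closeable {
    private final FileOutputStream verificationLog;

    ScannerSketch(File logFile) throws IOException {
        verificationLog = new FileOutputStream(logFile);
    }

    void recordVerification(String blockId) throws IOException {
        verificationLog.write((blockId + "\n").getBytes(StandardCharsets.UTF_8));
    }

    // Shutdown path: closing the stream releases the OS file handle.
    @Override
    public void close() throws IOException {
        verificationLog.close();
    }

    /** Simulates setup and teardown; returns true if the work dir was removed. */
    static boolean runAndTearDown() {
        try {
            File workDir = Files.createTempDirectory("scanner-test").toFile();
            File log = new File(workDir, "verification.log");
            // try-with-resources guarantees close() runs before teardown.
            try (ScannerSketch scanner = new ScannerSketch(log)) {
                scanner.recordVerification("blk_1");
            }
            // Teardown: delete the log, then the now-empty directory.
            return log.delete() && workDir.delete();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("work dir removed: " + runAndTearDown());
    }
}
```

On Linux the teardown would typically succeed even without the close,
which is why a regression of this kind tends to show up only on the
Windows build.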

HADOOP-9232: JniBasedUnixGroupsMappingWithFallback fails on Windows with
UnsatisfiedLinkError.  This was introduced by HADOOP-8712, which switched
to JniBasedUnixGroupsMappingWithFallback as the default
hadoop.security.group.mapping, but did not provide a Windows implementation
of the JNI function.  In this case, there was a strong desire to get
HADOOP-8712 into a release, fixing it on Windows required native Windows
API knowledge, and Windows users had a simple workaround available by
changing their configs back to ShellBasedUnixGroupsMapping.  I think this
is the kind of situation where we could allow HADOOP-8712 to commit despite
-1 from Windows Jenkins, with fairly quick follow-up from an engineer with
the Windows expertise to fix it.
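
For anyone unfamiliar with the failure mode in HADOOP-9232, here is a
small self-contained sketch.  It is not how the Hadoop group mapping
classes are implemented; GroupMappingSketch, getGroupsNative, and
shellGroups are hypothetical names.  It only shows that calling a
native method with no linked implementation throws
UnsatisfiedLinkError, and how a caller could fall back to a non-native
path.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not the real JniBasedUnixGroupsMappingWithFallback.
final class GroupMappingSketch {
    // Declared native but never linked to a library, so any call to it
    // throws UnsatisfiedLinkError at runtime.
    private static native String[] getGroupsNative(String user);

    // Hypothetical non-native fallback; a real one would shell out to the OS.
    static List<String> shellGroups(String user) {
        return List.of(user + "-group");
    }

    /** Tries the native lookup first and falls back if it is unavailable. */
    static List<String> getGroups(String user) {
        try {
            return Arrays.asList(getGroupsNative(user));
        } catch (UnsatisfiedLinkError e) {
            // No native implementation on this platform: use the fallback.
            return shellGroups(user);
        }
    }

    public static void main(String[] args) {
        System.out.println(getGroups("alice"));
    }
}
```

The workaround described above achieves a similar effect through
configuration, by pointing hadoop.security.group.mapping back at
ShellBasedUnixGroupsMapping.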

To summarize, I don't think cross-platform development needs to differ
greatly from our current development process.  We're all responsible for
breadth of understanding
and maintenance of the whole codebase, but we also rely on specific
individuals with deep expertise in particular areas for certain issues.
Sometimes we commit despite a -1 from Jenkins, based on the community's

Virtualization greatly simplifies cross-platform development.  I use
VirtualBox on a Mac host and run VMs for Windows and Ubuntu with a shared
drive so that they can all see the same copy of the source code.  There are
plenty of variations on this depending on your preference, such as
offloading the VMs to a separate server or cloud service to free up local
RAM.  I'm planning on submitting BUILDING.txt changes later today that
fully describe how to build on Windows.  After some initial setup, it's
nearly identical to the mvn commands that you already use today.

Hope this helps,
On Thu, Feb 28, 2013 at 3:25 AM, John Gordon <[EMAIL PROTECTED]> wrote:

> +1 (non-binding)
> I want to share my vote of confidence in this community.  If motivated to
> do so, this community can keep this project cross-platform and continue to
> rapidly innovate without breaking a sweat.
> The day we started working on this, I saw the foundations of greatness in
> the quality and volume of dev tests, the code itself, and the Apache values
> themselves.
> 1.) Hadoop's unit tests and their frameworks are very well thought out and
> the consideration and energy that went into their design is worthy of