Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
HDFS >> mail # dev >> VOTE: HDFS-347 merge


Copy link to this message
-
Re: VOTE: HDFS-347 merge
On Wed, Feb 20, 2013 at 3:31 PM, Suresh Srinivas <[EMAIL PROTECTED]> wrote:
>> Given that this is an optimization, and we have a ton of optimizations
>> which don't yet run on Windows, I don't think that should be
>> considered. Additionally, the Windows support has not yet been merged,
>> nor is it in any release, so this isn't a regression.
>>
>
> This is a critical functionality for HBase peformance and an optimization
> we consider
> very important to have.

Too bad it doesn't work in any of the normal installations... none of
the packages for Hadoop would allow it to work, given that the data
directories will be owned by HDFS and not world readable, and
tasks/HBase would run as an "hbase" user, which wouldn't have direct
access to the block files.

>>
>> I would be happy to review an addition to the HDFS-347 branch which
>> addresses this issue. But I don't think we should be maintaining two
>> codepaths for the sake of an optimization on a platform which is not
>> yet officially supported on trunk, especially when the old code path
>> is _insecure_ and thus unusable in most environments.
>
>
> I have to disagree. No where in the jira or the design it is explicitly
> stated that
> the old short circuit functionality is being removed. My assumption has been
> that it will not be removed.

I've tried this avenue in the past on other insecurities which were
fixed. Sorry if you were depending on insecure behavior. The project
should move on and not have 3+ ways of implementing the same thing.

> As regards "officially supported", we have been doing
> windows development for
> more than a year. In fact branch-1-win is being used by a lot users. Given
> merging it to branch-1 requires first making it available in trunk, we have
> been doing
> a lot of work in branch-trunk-win. It is almost ready to be merged as well.
>
> I am -1 on removing existing short circuit until an alternative short
> circuit similar
> to HDFS-347 on all the platforms.

Great -- are you committed to building this equivalent feature for
Windows, then? On what timeline? From my viewpoint, Windows isn't a
supported platform *right now*, so vetoing based on it seems
meritless.

BTW, the posix_fadvise based readahead is an important optimization
for many workloads, too, but from what I can tell looking at the
Windows branch, it doesn't support it. There are other places in the
Windows branch where performance is going to be worse - eg disabling
the pipelined crc32c implementation will be a 2-3x hit on that code
path.

No one has voted to merge Windows support, and if merging Windows
support means that, from now on, every _optimization_ must work on
Windows, I don't think I will be able to vote +1. The vast majority of
the community is _not_ running Windows, and I don't want to block
progress on the small number of developers who know how to program
against that platform.

If that's the axe hanging over our head with the Windows branch, then
I'm all for saying "good, keep it on a branch and don't merge it to
trunk".

I was hoping we could all work together a bit better here...
contentious merge votes like this just cause cases where different
distros diverge by merging different branches way ahead of the
upstream (eg yours with windows support, ours with 347, etc)

-Todd
--
Todd Lipcon
Software Engineer, Cloudera