The trick with YARN is that there are a lot more client APIs than for HDFS. In addition, they have not had as much time as the HDFS APIs to mature. If we want to lock down the client APIs, I am fine with that, because I don't really see any huge problems with the existing APIs. But I think waiting to lock them down is a good thing, at least until we can get Hamster and other non-MapReduce applications up and running and integrate any feedback they have about the APIs.
On 4/20/12 2:15 AM, "Eli Collins" <[EMAIL PROTECTED]> wrote:
On Thu, Apr 19, 2012 at 11:46 PM, Arun C Murthy <[EMAIL PROTECTED]> wrote:
> Moving to a separate thread...
> On Apr 20, 2012, at 1:24 AM, Todd Lipcon wrote:
>> On Thu, Apr 19, 2012 at 12:26 PM, Eli Collins <[EMAIL PROTECTED]> wrote:
>>> On Thu, Apr 19, 2012 at 11:45 AM, Arun C Murthy <[EMAIL PROTECTED]> wrote:
>>>> However, we should consider whether HDFS protocols are 'ready' for us to
>>>> commit to them for the foreseeable future, my sense is that it's a tad
>>>> early - particularly with auto-failover not complete.
>>> Agree that we're a little too early on the HDFS protocol side, think
>>> MR2 is probably in a similar boat wrt stability as well.
> Agreed, I didn't mean to point fingers at HDFS - it was just the most recent changes.
>> Regarding protocols:
>> +1 to _not_ locking down "cluster-internal" wire compatibility at this
>> point. i.e. we can break DN<->NN, or NN<->SBN, or Admin command -> NN
>> compatibility still.
>> +1 to locking down client wire compatibility with the release of 2.0. After
>> 2.0 is released I would like to see all 2.0.x clients continue to be
>> compatible. Now that we are protobuf-ified, I think this is doable.
>> Should we open a separate discussion thread for the above?
> Good points on separating client & internal protocols.
> My sense is that locking client-protocols is a great start, but not sufficient.
> Ideally, we should be considering things like rolling upgrades etc. which necessitate compatibility all across. I'm fully aware it might be too early for us to lock them...
Yup, we've put the mechanism into HDFS for rolling upgrades
(HDFS-2983) and filed (MR-4150) for the same in MR2, but they'll only
be useful if we lock down the protocol (and use PB to get around
differences). Agree w Todd that we're too early for those right now,
and they're much less painful breakages than client <-> server.
> Maybe we can do some hadoop-2.x-(alpha,beta) releases for a few months and then just bite the bullet as HA & YARN protocols stabilize?
Sounds good, we should probably use e.g. "alpha1", "alpha2" etc. in case
we need to do more than a single alpha or beta release.
> Arun C. Murthy
> Hortonworks Inc.