|
Andrew Purtell
2011-05-27, 19:49
Andrew Purtell
2011-05-27, 20:00
Ryan Rawson
2011-05-27, 20:10
Todd Lipcon
2011-05-27, 20:30
Andrew Purtell
2011-05-27, 21:38
Todd Lipcon
2011-05-27, 22:10
Ryan Rawson
2011-05-27, 22:05
Andrew Purtell
2011-05-27, 21:22
Gary Helmling
2011-05-27, 20:20
Stack
2011-05-28, 04:15
Joey Echeverria
2011-05-28, 02:11
Eric Yang
2011-05-31, 04:55
Stack
2011-05-31, 20:22
Ryan Rawson
2011-05-31, 20:42
Eric Yang
2011-05-31, 21:27
Gary Helmling
2011-05-31, 22:47
Lars George
2011-06-01, 13:26
Andrew Purtell
2011-06-01, 15:54
Andrew Purtell
2011-05-31, 23:09
|
-
modular build and pluggable rpc
Andrew Purtell 2011-05-27, 19:49
From IRC:

apurtell i propose we take the build modular as early as possible to deal with multiple platform targets
apurtell secure vs nonsecure
apurtell 0.20 vs 0.22 vs trunk
apurtell i understand the maintenance issues with multiple rpc engines, for example, but a lot of reflection twistiness is going to be worse
apurtell i propose we take up esammer on his offer
apurtell so branch 0.92 asap, get trunk modular and working against multiple platform targets
apurtell especially if we're going to see rpc changes coming from downstream projects...
apurtell also what about supporting secure and nonsecure clients with the same deployment?
apurtell zookeeper does this
apurtell so that is selectable rpc engine per connection, with a negotiation
apurtell we don't have or want to be crazy about it but a rolling upgrade should be possible if for example we are taking in a new rpc from fb (?) or cloudera (avro based?)
apurtell also looks like hlog modules for 0.20 vs 0.22 and successors
apurtell i think over time we can roadmap the rpc engines, if we have multiple, by deprecation
apurtell now that we're on the edge of supporting both 0.20 and 0.22, and secure vs nonsecure, let's get it as manageable as possible right away

St^Ack_ apurtell: +1

apurtell also i think there is some interest in async rpc engine

St^Ack_ we should stick this up on dev i'd say

Best regards,

- Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
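The "selectable rpc engine per connection, with a negotiation" idea from the IRC log can be sketched roughly as below. This is only an illustration under assumptions: the class RpcNegotiationSketch, the method negotiate, and the engine names "secure" and "plain" are invented for this example and are not HBase or ZooKeeper APIs. The client offers engines in preference order and the server picks the first one it also supports, so old and new clients can share one deployment during a rolling upgrade.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of per-connection RPC engine negotiation. */
final class RpcNegotiationSketch {

    /**
     * Returns the first engine in the client's preference order that the
     * server also supports, or empty if there is no common engine.
     */
    static Optional<String> negotiate(List<String> clientPreferences,
                                      Set<String> serverSupported) {
        for (String engine : clientPreferences) {
            if (serverSupported.contains(engine)) {
                return Optional.of(engine);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed setup: the server has both engines enabled.
        Set<String> server = new HashSet<>(Arrays.asList("plain", "secure"));

        // A newer client prefers the secure engine but can fall back.
        System.out.println(negotiate(Arrays.asList("secure", "plain"), server));

        // An older client only knows the plain engine.
        System.out.println(negotiate(Arrays.asList("plain"), server));
    }
}
```

An engine retired by deprecation would simply be dropped from the server's supported set, after which clients offering only that engine fail negotiation instead of misbehaving later.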
-
Re: modular build and pluggable rpc
Andrew Purtell 2011-05-27, 20:00
Also needing, perhaps later, consideration:

- HDFS-347 or not

- Lucene embedding for hbase-search, though as a coprocessor this is already pretty much handled if we have platform support (therefore a platform module) for a HDFS that can do local read shortcutting and block placement requests

- HFile v1 versus v2

Making decoupled development at several downstream sites manageable, with a home upstream for all the work, while simultaneously providing clean migration paths for users, basically.
-
Re: modular build and pluggable rpc
Ryan Rawson 2011-05-27, 20:10
I'm -1 on avro as an RPC format. Thrift is the way to go; any advantage of avro's smaller serialization is lost to the sheer complexity of avro and therefore the potential bugs.

I understand the desire to have a pluggable RPC engine, but it feels like the better approach would be to adopt a unified RPC and just be done with it. I had a look at the HsHa mechanism in thrift and it is very good; it in fact matches our 'handler' approach - async receiving/sending of data, but single threaded for processing a message.

-ryan
-
Re: modular build and pluggable rpc
Todd Lipcon 2011-05-27, 20:30
Agreed - I'm all for Thrift.

Though, I actually, contrary to Ryan, think that the existing HBaseRPC handler/client code is pretty good -- better than the equivalents from Thrift Java.

We could start by using Thrift serialization on our existing transport -- then maybe work towards contributing it upstream to the Thrift project. HDFS folks are potentially interested in doing that as well.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
-
Re: modular build and pluggable rpc
Andrew Purtell 2011-05-27, 21:38
I don't disagree with any of this but the fact is we have compile time differences if going against secure Hadoop 0.20 or non-secure Hadoop 0.20.

So either we decide to punt on integration with secure Hadoop 0.20 or we deal with the compile time differences. If dealing with them, we can do it by reflection, which is brittle and can be difficult to understand and debug, and someone would have to do this work; or we can wholesale replace RPC with something based on Thrift, and someone would have to do the work; or we take the pluggable RPC changes that Gary has already developed and modularize the build, which Eric has already volunteered to do.

- Andy
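To make the "reflection is brittle" point concrete, here is a minimal sketch of the kind of shim that reflection-based compatibility tends to produce. Everything in it is an assumption made for illustration: OldLogStream, NewLogStream, and the method names sync and hflush are stand-ins defined in the example itself, not the actual Hadoop or HBase classes. The thing to notice is that the method names are plain strings, so a rename or typo is only discovered at runtime on the platform where it breaks.

```java
import java.lang.reflect.Method;

/** Hypothetical sketch of a reflection-based platform shim. */
final class ReflectionShimSketch {

    /** Stand-in for an older platform class that exposes sync(). */
    public static class OldLogStream {
        public String sync() { return "sync"; }
    }

    /** Stand-in for a newer platform class that exposes hflush() instead. */
    public static class NewLogStream {
        public String hflush() { return "hflush"; }
    }

    /**
     * Calls whichever flush-like method the given object happens to have.
     * The compiler cannot check the candidate names below.
     */
    static String flush(Object stream) {
        for (String candidate : new String[] {"hflush", "sync"}) {
            try {
                Method m = stream.getClass().getMethod(candidate);
                return (String) m.invoke(stream);
            } catch (NoSuchMethodException e) {
                // Not this platform's method name; try the next candidate.
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Call to " + candidate + " failed", e);
            }
        }
        throw new UnsupportedOperationException(
            "No known flush method on " + stream.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(flush(new OldLogStream()));
        System.out.println(flush(new NewLogStream()));
    }
}
```

With separate build modules, each variant would instead be ordinary code compiled against its own platform version, and a missing method would show up as a compile error in that module.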
-
Re: modular build and pluggable rpc
Todd Lipcon 2011-05-27, 22:10
On Fri, May 27, 2011 at 2:38 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
> I don't disagree with any of this but the fact is we have compile time differences if going against secure Hadoop 0.20 or non-secure Hadoop 0.20.
>
> So either we decide to punt on integration with secure Hadoop 0.20 or we deal with the compile time differences. If dealing with them, we can do it by reflection, which is brittle and can be difficult to understand and debug, and someone would have to do this work; or we can wholesale replace RPC with something based on Thrift, and someone would have to do the work; or we take the pluggable RPC changes that Gary has already developed and modularize the build, which Eric has already volunteered to do.

Yes, sorry for taking this discussion off-track. I think modularizing this part of the build and fixing things like the recent async response patch to work with that modularization is the right short-term solution.

-Todd

Todd Lipcon
Software Engineer, Cloudera
-
Re: modular build and pluggable rpc
Ryan Rawson 2011-05-27, 22:05
The build modules are fine, I just wanted to voice my opinions on avro vs thrift. I don't think we should spend a lot of time attempting to build an avro vs thrift thing; we should plan to eventually move to thrift as our RPC serialization. I also concur with Todd, our server side code has had a lot of work and it isn't half bad now :-)

+1 to maven modules, they are pretty cool
-
Re: modular build and pluggable rpc
Andrew Purtell 2011-05-27, 21:22
This is all kind of non-responsive to the issues at hand?

How are we supposed to have unified security and non security RPC? You volunteering to unify them? If not then we *already* have pluggable RPC for secure and nonsecure RPC in trunk, today.

I'm just proposing we take Eric up on his offer of Maven-fu to modularize the build accordingly, and cut 0.92 real soon now so he can do it soon.

--- On Fri, 5/27/11, Ryan Rawson <[EMAIL PROTECTED]> wrote:

> I'm -1 on avro as a RPC format. Thrift is the way to go, any of the advantages of smaller serialization of avro is lost by the sheer complexity of avro and therefore the potential bugs.
>
> I understand the desire to have a pluggable RPC engine, but it feels like the better approach would be to adopt a unified RPC and just be done with it. I had a look at the HsHa mechanism in thrift and it is very good, it in fact matches our 'handler' approach - async recieving/sending of data, but single threaded for processing a message.
>
> -ryan
-
Re: modular build and pluggable rpc
Gary Helmling 2011-05-27, 20:20
+1

Using maven modules would also allow us to have a minimal hbase-client.jar, which is periodically requested on the mailing lists.

As we move to support more versions of Hadoop, being able to have separate modules built against each version seems saner than continuing to extend the current reflection based approaches, which can be brittle.

Since the security work has all been developed as loadable components, making it a separate module would make perfect sense as a means of integration. Security can then be built against secure Hadoop for those who care, while not impacting core HBase. Same goes for supporting changes across Hadoop 0.21, 0.22, trunk...

I agree we should branch 0.92 first, then get trunk modularized as soon as possible. If we all agree on that, I'm happy to help the modularization effort (with limited maven skills), and will start posting security patches for review as soon as we have the setup in place to support it.

--gh
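One way to picture "loadable components" living in their own module is a small engine interface in core, with the implementation chosen by class name from configuration. The sketch below is an assumption-laden illustration, not the actual HBase code: the interface RpcEngine as written here, the classes PlainEngine and SecureEngine, and the configuration key rpc.engine.class are all hypothetical. In a modular build, SecureEngine would sit in a separate module compiled against secure Hadoop, while core only ever compiles against the interface.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a pluggable RPC engine selected by configuration. */
final class PluggableRpcSketch {

    /** Contract that core code compiles against (illustrative only). */
    public interface RpcEngine {
        String name();
        boolean authenticates();
    }

    /** Default engine that would ship with core. */
    public static class PlainEngine implements RpcEngine {
        public String name() { return "plain"; }
        public boolean authenticates() { return false; }
    }

    /** Engine that would live in a separate, optional security module. */
    public static class SecureEngine implements RpcEngine {
        public String name() { return "secure"; }
        public boolean authenticates() { return true; }
    }

    /** Hypothetical configuration key naming the engine class. */
    static final String ENGINE_KEY = "rpc.engine.class";

    /** Instantiates the configured engine, defaulting to PlainEngine. */
    static RpcEngine loadEngine(Map<String, String> conf) {
        String className = conf.getOrDefault(ENGINE_KEY, PlainEngine.class.getName());
        try {
            // The only reflective step: after this, calls are statically typed.
            return Class.forName(className)
                    .asSubclass(RpcEngine.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load RPC engine " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(loadEngine(conf).name());

        conf.put(ENGINE_KEY, SecureEngine.class.getName());
        System.out.println(loadEngine(conf).name());
    }
}
```

Compared with probing for individual methods by name, this keeps reflection to a single class lookup at startup; everything after that is checked by the compiler in whichever module owns the implementation.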
-
Re: modular build and pluggable rpc
Stack 2011-05-28, 04:15
I already +1'd modularizing post branching of 0.92.0. When do we branch 0.92? We have a bunch of blockers and criticals filed against it still. Maybe we all can review and move stuff out that we don't think so critical or that much of a blocker?

Good stuff,
St.Ack
-
Re: modular build and pluggable rpc
Joey Echeverria 2011-05-28, 02:11
+1 on maven modules. That will simplify the native code build/integration that I'm working on for HBASE-1316.

-Joey
-
Re: modular build and pluggable rpc
Eric Yang 2011-05-31, 04:55
Maven modularization could be enhanced to have a structure that looks like this:

Super POM
+- common
+- shell
+- master
+- region-server
+- coprocessor

The software is basically grouped by process type (the role of the process) plus a shared library.

For RPC, there are several feasible options: avro, thrift, and jackson+jersey (REST). Avro may seem cumbersome because the schema is defined as a JSON string. Thrift comes with its own rpc server, and it is not trivial to add authorization and authentication to secure the rpc transport. Jackson+Jersey RPC messages are the largest of the three. All three frameworks have pros and cons, but I think Jackson+Jersey has the right balance for an rpc framework.

In most cases, the motivation for pluggable RPC can be narrowed down to two main categories:

1. Freedom to create the most efficient rpc, which is hard to integrate with everything else because it's custom made.
2. Being able to evolve message passing and versioning.

If we can see beyond the first reason and realize that the second reason is in part polymorphic serialization, then Jackson+Jersey is probably the better choice as an RPC framework, because Jackson supports polymorphic serialization and Jersey builds on the HTTP protocol. It would be easier to version and to add security on top of existing standards. The syntax and feature set seem more proper from an engineering standpoint to me.

Regards,
Eric
-
Re: modular build and pluggable rpc
Stack 2011-05-31, 20:22
On Mon, May 30, 2011 at 9:55 PM, Eric Yang <[EMAIL PROTECTED]> wrote:
> Maven modularization could be enhanced to have a structure that looks like this:
>
> Super POM
> +- common
> +- shell
> +- master
> +- region-server
> +- coprocessor
>
> The software is basically grouped by process type (the role of the process) plus a shared library.
>

I'd change the list above. shell should be client, and perhaps master and regionserver should both be inside a single 'server' submodule. We need to add security in there. Perhaps we'd have a submodule each for thrift, avro, and rest (and perhaps a rest war file)? (Is this too many submodules -- I suppose once we are submodularized, adding new ones is trivial. It's the initial move to submodules that is painful.)

> For RPC, there are several feasible options: avro, thrift, and jackson+jersey (REST). Avro may seem cumbersome because the schema is defined as a JSON string. Thrift comes with its own rpc server, and it is not trivial to add authorization and authentication to secure the rpc transport. Jackson+Jersey RPC messages are the largest of the three. All three frameworks have pros and cons, but I think Jackson+Jersey has the right balance for an rpc framework. In most cases, the motivation for pluggable RPC can be narrowed down to two main categories:
>
> 1. Freedom to create the most efficient rpc, which is hard to integrate with everything else because it's custom made.
> 2. Being able to evolve message passing and versioning.
>
> If we can see beyond the first reason and realize that the second reason is in part polymorphic serialization, then Jackson+Jersey is probably the better choice as an RPC framework, because Jackson supports polymorphic serialization and Jersey builds on the HTTP protocol. It would be easier to version and to add security on top of existing standards. The syntax and feature set seem more proper from an engineering standpoint to me.
>

I always considered http attractive but much too heavy-weight for hbase rpc; each request/response would carry a bunch of what are for the most part extraneous headers. I suppose we should just measure. Regarding JSON messages, that's interesting, but hbase is all about binary data. Does jackson/jersey do BSON?

St.Ack
-
Re: modular build and pluggable rpc
Ryan Rawson 2011-05-31, 20:42
The cost of serialization is non-trivial and a substantial expense in conveying information from regionserver -> client. I did some timings, and sending data across the wire is surprisingly slow, but attempting to compress it with various compression systems ended up taking 50-100ms in the average case (1-5 MB Result[] sets).

Originally, when conceptualizing thrift, the thought was to just send the KeyValue byte[] over thrift as an opaque blob rather than doing a whole structure thing, i.e. no KeyValue structure with a field for each of the parts of a KeyValue. On large results that cost becomes prohibitive.

While HTTP has a high overhead of headers, if one wanted to be http-oriented you could do:

http://www.chromium.org/spdy

The nice thing is that HTTP has a good set of interops and the like. The bad thing is it is too verbose.

-ryan
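A minimal sketch of the "opaque blob" idea described above, in Java. It assumes each KeyValue has already been serialized to a byte[] and simply length-prefixes the blobs without parsing their fields. The class and method names (OpaqueBlobFraming, frame, unframe) are hypothetical illustrations, not HBase or Thrift APIs.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: frames pre-serialized cells as opaque, length-prefixed blobs. */
final class OpaqueBlobFraming {

    /** Layout: [int count] then, per blob, [int length][bytes]. Blob contents are never inspected. */
    static byte[] frame(List<byte[]> blobs) {
        int size = Integer.BYTES;
        for (byte[] blob : blobs) {
            size += Integer.BYTES + blob.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putInt(blobs.size());
        for (byte[] blob : blobs) {
            buf.putInt(blob.length);
            buf.put(blob);
        }
        return buf.array();
    }

    /** Reverses frame(): returns the blobs in their original order. */
    static List<byte[]> unframe(byte[] framed) {
        ByteBuffer buf = ByteBuffer.wrap(framed);
        int count = buf.getInt();
        List<byte[]> blobs = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            byte[] blob = new byte[buf.getInt()];
            buf.get(blob);
            blobs.add(blob);
        }
        return blobs;
    }

    public static void main(String[] args) {
        List<byte[]> cells = new ArrayList<>();
        cells.add(new byte[] {1, 2, 3});        // stand-ins for serialized KeyValues
        cells.add(new byte[] {4, 5, 6, 7, 8});
        byte[] framed = frame(cells);
        List<byte[]> back = unframe(framed);
        System.out.println("framed bytes: " + framed.length + ", cells decoded: " + back.size());
    }
}
```

The only per-cell cost on the wire here is one length prefix, so the sender never walks the individual parts of a KeyValue; the receiver hands each blob to whatever KeyValue parser it already has.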
-
Re: modular build and pluggable rpc
Eric Yang 2011-05-31, 21:27
+1 on Server and Client sub-modules.

Jackson 1.7 supports BSON. Jersey 1.6 (latest) includes Jackson 1.5.5, but it can be swapped out for Jackson 1.7 with maven tricks, or we can wait for Jersey to update to the latest jackson.

Regards,
Eric
-
Re: modular build and pluggable rpc
Gary Helmling 2011-05-31, 22:47
On Tue, May 31, 2011 at 1:22 PM, Stack <[EMAIL PROTECTED]> wrote:
> On Mon, May 30, 2011 at 9:55 PM, Eric Yang <[EMAIL PROTECTED]> wrote:
> > Maven modularization could be enhanced to have a structure that looks like this:
> >
> > Super POM
> > +- common
> > +- shell
> > +- master
> > +- region-server
> > +- coprocessor
> >
> > The software is basically grouped by process type (the role of the process)
> > plus a shared library.
> >
>
> I'd change the list above. shell should be client, and perhaps master
> and regionserver should both be inside a single 'server' submodule.
> We need to add security in there. Perhaps we'd have a submodule each for
> thrift, avro, and rest (and perhaps a rest war file)? (Is this too many
> submodules -- I suppose once we are submodularized, adding new ones
> is trivial. It's the initial move to submodules that is painful.)
>

I'd be in favor of starting simply as well. Something like:

- common
- client
- server
- security

or even combine the "common" bits just into "client". I agree thrift, avro and rest would make perfect module candidates as well, but I don't feel particularly strongly about them myself. I also don't really see the coprocessor framework as a separate module. It's more like part of the server infrastructure.

HTTP/REST is one good option to have (among many) as an application interface to HBase. But I'm skeptical of its applicability as an internal RPC transport. Personally, I think we need a well defined (but still performant) serialization format to better support cross-version operation and alternate clients such as asynchbase. The actual RPC framework we use (from Hadoop) may not be perfect, but it's seen a lot of profiling and its threading model seems to perform pretty well for HBase workloads with long-lived connections.

The current framework also continues to evolve, with some recent effort to work in asynchronous handling on the server-side. And in addition we have full support for security via Kerberos and token-based DIGEST-MD5 authentication in a separate branch. I'm personally not really interested in repeating the work to incorporate security over a new HTTP based stack. I think I'd need some convincing that an HTTP transport would perform better than what we have. I'm more inclined to go an evolutionary route in improving our current stack.

--gh
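To make the "well defined serialization format to better support cross-version operation" point concrete, here is a minimal Java sketch of a version-tagged message. Everything in it is an assumption for illustration: the VersionedGet class, its fields, and the two format versions are hypothetical and do not describe HBase's actual wire format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical request whose encoding starts with a format-version byte. */
final class VersionedGet {
    final String row;
    final int maxVersions; // field added in format version 2; readers default it to 1 for version 1

    VersionedGet(String row, int maxVersions) {
        this.row = row;
        this.maxVersions = maxVersions;
    }

    /** Encodes as [byte version][int rowLength][row bytes], plus [int maxVersions] from version 2 on. */
    byte[] encode(int formatVersion) {
        if (formatVersion < 1 || formatVersion > 2) {
            throw new IllegalArgumentException("unsupported format version " + formatVersion);
        }
        byte[] rowBytes = row.getBytes(StandardCharsets.UTF_8);
        int size = 1 + Integer.BYTES + rowBytes.length + (formatVersion >= 2 ? Integer.BYTES : 0);
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.put((byte) formatVersion);
        buf.putInt(rowBytes.length);
        buf.put(rowBytes);
        if (formatVersion >= 2) {
            buf.putInt(maxVersions);
        }
        return buf.array();
    }

    /** Decodes either format version, so a newer reader still understands an older writer. */
    static VersionedGet decode(byte[] wire) {
        ByteBuffer buf = ByteBuffer.wrap(wire);
        int formatVersion = buf.get();
        if (formatVersion < 1 || formatVersion > 2) {
            throw new IllegalArgumentException("unsupported format version " + formatVersion);
        }
        byte[] rowBytes = new byte[buf.getInt()];
        buf.get(rowBytes);
        int maxVersions = formatVersion >= 2 ? buf.getInt() : 1;
        return new VersionedGet(new String(rowBytes, StandardCharsets.UTF_8), maxVersions);
    }

    public static void main(String[] args) {
        VersionedGet fromOldClient = decode(new VersionedGet("row-1", 3).encode(1));
        VersionedGet fromNewClient = decode(new VersionedGet("row-1", 3).encode(2));
        System.out.println(fromOldClient.maxVersions + " " + fromNewClient.maxVersions);
    }
}
```

The design choice being illustrated is only that the version travels with each message, which is one way mixed-version clients and servers (for example during a rolling upgrade) could keep talking to each other.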
-
Re: modular build and pluggable rpc
Lars George 2011-06-01, 13:26
I agree with Gary here.

What would "common" be anyways? Just curious.

Also, Flume is using Thrift, as do quite a few here as the gateway server to access HBase. Could we get some reports from those who have used it on whether they are happy with the Thrift RPC? It seems like they are, but maybe we should carefully check into the option.

I am also wary about using REST. Especially when you handle very small payloads, the performance is driven by the protocol overhead.

Lars
-
Re: modular build and pluggable rpc
Andrew Purtell 2011-06-01, 15:54
> From: Lars George <[EMAIL PROTECTED]>
> What would "common" be anyways? Just curious.

Maybe 'common' should be 'core'?

> Also, Flume is using Thrift, as do quite a few here as the gateway
> server to access HBase. Could we get some reports from those who have
> used it on whether they are happy with the Thrift RPC? It seems like
> they are, but maybe we should carefully check into the option. I am
> also wary about using REST. Especially when you handle very small
> payloads, the performance is driven by the protocol overhead.

Well... :-)

REST is suitable for an application interface, depending on what you want to do; lots of people have success with that architectural approach. But not for internal RPC. REST addresses individual resources per transaction, so headers are significant overhead even with HTTP pipelining. Doing otherwise is non-RESTful, so let's make a distinction between REST and HTTP.

A non-RESTful approach could use HTTP as little more than a network transport. Something like SPDY or BEEP after an initial HTTP transaction. Or Thrift-over-HTTP. But what, if any, benefit is there over the existing RPC layer or using Thrift directly?

- Andy
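The header-overhead argument above comes down to simple arithmetic: with h header bytes and p payload bytes per transaction, the header share of the wire traffic is h / (h + p). A rough Java sketch follows; the request head in main is a made-up example (the path and host are hypothetical), and the class is an illustration rather than a measurement of Stargate or any real client.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative arithmetic for per-request header overhead; not a benchmark. */
final class HeaderOverhead {

    /** Fraction of bytes on the wire that are headers rather than payload. */
    static double overheadFraction(int headerBytes, int payloadBytes) {
        return (double) headerBytes / (headerBytes + payloadBytes);
    }

    /** Size in bytes of a request head (request line, header fields, blank line). */
    static int headerBytes(String requestHead) {
        return requestHead.getBytes(StandardCharsets.US_ASCII).length;
    }

    public static void main(String[] args) {
        // Hypothetical minimal request head; real clients usually send more header fields.
        String head = "GET /table/row/cf:qual HTTP/1.1\r\n"
                + "Host: example.invalid:8080\r\n"
                + "Accept: application/octet-stream\r\n"
                + "\r\n";
        int h = headerBytes(head);
        for (int payload : new int[] {16, 1024, 1 << 20}) {
            System.out.printf("payload=%d bytes, header share=%.3f%n",
                    payload, overheadFraction(h, payload));
        }
    }
}
```

For small payloads the header share approaches 1, which is the case Lars raises; for multi-megabyte results it becomes negligible and serialization cost dominates instead.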
-
Re: modular build and pluggable rpc
Andrew Purtell 2011-05-31, 23:09
> From: Gary Helmling <[EMAIL PROTECTED]>
> [...] I'm personally not really interested in repeating the
> work to incorporate security over a new HTTP based stack.
> I think I'd need some convincing that an HTTP transport would
> perform better than what we have. I'm more inclined to go an
> evolutionary route in improving our current stack.

I am skeptical as well about HTTP based RPC transport performing anywhere in the ballpark of the current RPC, given my experience with developing Stargate (and profiling server and client) and use of commons httpclient on a crawler project.

I'd also prefer not to see Gary's work, representing several man-weeks of effort, to smartly integrate security with the existing RPC layer go to waste.

Best regards,

- Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
|