|
jie_zhou
2011-09-21, 06:40
Uma Maheswara Rao G 72686...
2011-09-21, 07:12
Koert Kuipers
2011-09-21, 20:59
GOEKE, MATTHEW
2011-09-21, 21:37
Ted Dunning
2011-09-21, 21:46
Stephan Gammeter
2011-09-23, 12:14
Koert Kuipers
2011-09-23, 13:40
Stephan Gammeter
2011-10-11, 19:44
Todd Lipcon
2011-10-11, 19:52
|
-
A question about RPC jie_zhou 2011-09-21, 06:40
Dear all,

Nice to meet you! I am a beginner with Hadoop. Recently I have been reading the source of Hadoop's RPC, and I have a question. As we know, Hadoop RPC makes use of the dynamic proxy mechanism, but why not use an IDL, such as CORBA's IDL or Android's AIDL?

Thanks in advance for your reply.

Best Regards,
jie
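To make the contrast in the question concrete, here is a minimal sketch of the dynamic-proxy style, written in Python rather than Java. It is not Hadoop's code: the names `RpcProxy`, `LoopbackTransport` and `EchoService` are hypothetical, JSON stands in for the wire format, and the "network" is an in-process call. The point is only that the client needs no generated stubs; every method call is intercepted at runtime and turned into a request.

```python
import json


class LoopbackTransport:
    """Stands in for a network connection: decodes the request and
    invokes the server object directly (assumption: no real sockets)."""

    def __init__(self, server_impl):
        self._server_impl = server_impl

    def send(self, payload: bytes) -> bytes:
        request = json.loads(payload.decode("utf-8"))
        method = getattr(self._server_impl, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode("utf-8")


class RpcProxy:
    """Turns any unknown attribute access into a remote call."""

    def __init__(self, transport):
        self._transport = transport

    def __getattr__(self, method_name):
        # Only reached for names not found normally, i.e. the remote methods.
        def invoke(*args):
            payload = json.dumps({"method": method_name, "args": list(args)})
            reply = self._transport.send(payload.encode("utf-8"))
            return json.loads(reply.decode("utf-8"))["result"]
        return invoke


class EchoService:
    """Hypothetical server-side implementation."""

    def echo(self, text):
        return text

    def add(self, a, b):
        return a + b


proxy = RpcProxy(LoopbackTransport(EchoService()))
print(proxy.add(2, 3))
print(proxy.echo("hello"))
```

With an IDL, by contrast, the `add` and `echo` signatures would live in a separate definition file and a compiler would generate the client and server classes from it.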
-
Re: A question about RPC Uma Maheswara Rao G 72686... 2011-09-21, 07:12
Hadoop has its own RPC mechanism, built mainly on Writables, to overcome some of the disadvantages of normal serialization.

For more info: http://www.lexemetech.com/2008/07/rpc-and-serialization-with-hadoop.html

Regards,
Uma
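As a rough illustration of the Writable idea, the sketch below mimics the pattern in Python: a record writes only its raw field values in a fixed order and reads them back into an existing object. `PointWritable` and its method names are hypothetical (modelled loosely on the write/readFields pattern), and `pickle` is used only as a stand-in for a general-purpose serializer that carries extra structure along with the data.

```python
import io
import pickle
import struct


class PointWritable:
    """Hypothetical record that serializes just its field values."""

    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def write(self, out):
        # Two big-endian 32-bit ints: 8 bytes, no class or field names.
        out.write(struct.pack(">ii", self.x, self.y))

    def read_fields(self, inp):
        # The reader must already know the layout; the object is reused.
        self.x, self.y = struct.unpack(">ii", inp.read(8))


buf = io.BytesIO()
PointWritable(3, -7).write(buf)
compact = buf.getvalue()

reused = PointWritable()
reused.read_fields(io.BytesIO(compact))

# Stand-in for a self-describing, general-purpose format.
generic = pickle.dumps({"x": 3, "y": -7})

print(len(compact), len(generic))
```

The trade-off is visible in `read_fields`: the format is compact, but both sides must agree on the layout in code, which is where an IDL or a versioned format would help.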
-
Re: A question about RPC Koert Kuipers 2011-09-21, 20:59
I would love an IDL, plus modern serialization frameworks such as protobuf/thrift support versioning (although I still have issues with different versions of thrift not working nicely together, argh, why is that). The only downside is perhaps that they are a little slower than Writables.
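The versioning point can be sketched with a toy tag-length-value encoding. This is not the protobuf or Thrift wire format; the layout (2-byte tag, 4-byte length, value) and the field names are assumptions made for illustration. What it shows is the general mechanism: because every field carries a numeric tag and a length, an older reader can skip fields it does not know about.

```python
import struct


def encode(fields):
    """fields: dict of int tag -> bytes value, written as tag, length, value."""
    out = bytearray()
    for tag, value in fields.items():
        out += struct.pack(">HI", tag, len(value))
        out += value
    return bytes(out)


def decode(data, known_tags):
    """Return only the fields this reader knows; skip the rest by length."""
    result = {}
    offset = 0
    while offset < len(data):
        tag, length = struct.unpack_from(">HI", data, offset)
        offset += 6  # size of the tag + length header
        if tag in known_tags:
            result[tag] = data[offset:offset + length]
        offset += length
    return result


NAME, EMAIL = 1, 2  # hypothetical field tags

# A "version 2" writer sends both fields.
v2_message = encode({NAME: b"jie", EMAIL: b"jie@example.org"})

# A "version 1" reader only knows NAME and ignores EMAIL.
v1_view = decode(v2_message, known_tags={NAME})
print(v1_view)
```

A fixed-layout format has no such header per field, so adding a field there changes what every reader must expect.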
-
RE: A question about RPC GOEKE, MATTHEW 2011-09-21, 21:37
Correct me if I am wrong but isn't Hadoop moving towards the Avro IDL for full RPC (in either 0.23 or some later version)?
Matt
-
Re: A question about RPC Ted Dunning 2011-09-21, 21:46
IDLs are nice, but old-school systems like CORBA are death when you need to change things. Avro, protobufs and Thrift are all miles better.
-
Re: A question about RPC Stephan Gammeter 2011-09-23, 12:14
I don't think protobufs are slower than Writables, actually; they do really well on speed. I wrote some RPC code in C++ for protocol buffers, plus some SWIG wrappers to have clients in Java. A simple C++ server can easily handle about 20k qps in that setup, and this is with a naive implementation where some excess data copies still happen while requests are processed.

If I have time I would like to open-source it, but I would need some help to get it running properly in other languages so that it can be truly cross-language. (Right now servers are only supported in C++; C++ has both synchronous and asynchronous clients, while Java has only synchronous clients.)
-
Re: A question about RPC Koert Kuipers 2011-09-23, 13:40
Did you build it on top of zmq? I really don't see the need for people reinventing the low-level RPC stuff over and over again. zmq comes with baked-in request-response, pub-sub, and pipeline (distributed processing) communication. Once you rely on protobuf + zmq for the RPC, it is trivial to add clients in other languages; I had Java, R and Python talking to each other with minimal effort.

For RPC the comparison of speed is a bit of a moot point, since all the latency will be in the communication, not so much in the serialization, I suspect. But once you communicate using protobuf it also becomes really tempting to store data in Hadoop using protobuf instead of Writables/SequenceFiles, and from what I have heard (I have not tested this myself) it is a good deal slower in that situation.
-
Re: A question about RPC Stephan Gammeter 2011-10-11, 19:44
When I started writing it I was not aware of zmq yet, so the connection layer is just written using Boost. It would be quite simple to replace that with zmq; I just have not gotten around to it yet.

> i really don't see the need for people reinventing the low level rpc stuff over and over again.

I totally agree! What I like about protobufs, however, is that you can write your service definition directly in the .proto file, which gets parsed for you by the protocol buffer compiler, and you can access the parsed definitions via protoc plugins, for example. With the plugins and protoc nicely integrated into a cross-language build system, it's not minimal effort to have clients in different languages, it's zero effort. It also takes a little bit of code if you want to have multiple services on the same port, be able to list all services, get service definitions, etc. Basically, I have just taken care of that part.

> for rpc the comparison of speed is a bit of a moot point, since all the latency will be in the communication, not so much in the serialization, i suspect.

Yes, the communication of course adds latency, but I am talking about throughput here. If you believe comparison of speed is a bit of a moot point, then you must be one of the fortunate people who never had to use SOAP ;) You also need to properly hide the latency and make sure you minimize copying data around in memory, etc. But I am certain ZeroMQ does a much better job there than my Boost implementation ;) I haven't benchmarked it yet, though.

> but once you communicate using protobuf it also becomes really tempting to store in hadoop using protobuf instead of writables/sequencefiles, and from what i have heard (i have not tested this myself) it is a good deal slower in that situation.

What exactly do you mean by using protobuf instead of Writables/SequenceFiles? Let's assume you just use some ProtobufToWritable adapter: I don't see how that would be much slower than using Writables. Writables and protobufs really just do the same job, do they not? Except that protobufs are available in other languages, are defined via the proto language, etc. Whether you use Writables or protobufs, you most likely can serialize faster than you can write to disk or to the network. At least that is my feeling so far from using protobufs to store stuff in HBase or raw HFiles, but I have to admit I have not properly benchmarked this. What kind of file format would you write serialized protobufs to that would make it so slow? I guess in the end one just needs to benchmark everything :)

TL;DR
.proto + protocol buffer plugins for generating RPC clients and servers is really handy. Whether Writables or protobufs are faster needs to be benchmarked, but probably both serialize faster than one can write.
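The "multiple services on the same port, list all services, get service definitions" part can be sketched as a small registry. This is a guess at the general shape, not Stephan's implementation: `ServiceRegistry`, `Search` and `Health` are hypothetical names, and the list of public methods stands in for a parsed service definition from a .proto file.

```python
class ServiceRegistry:
    """Routes calls for several named services through one entry point."""

    def __init__(self):
        self._services = {}

    def register(self, name, impl):
        self._services[name] = impl

    def list_services(self):
        return sorted(self._services)

    def describe(self, name):
        # Public callables stand in for a parsed service definition.
        impl = self._services[name]
        return sorted(
            m for m in dir(impl)
            if not m.startswith("_") and callable(getattr(impl, m))
        )

    def dispatch(self, name, method, *args):
        # A real server would decode (name, method, args) from the request.
        return getattr(self._services[name], method)(*args)


class Search:
    def query(self, text):
        return [text.upper()]


class Health:
    def ping(self):
        return "pong"


registry = ServiceRegistry()
registry.register("Search", Search())
registry.register("Health", Health())

print(registry.list_services())
print(registry.describe("Search"))
print(registry.dispatch("Health", "ping"))
```

In the protobuf setup described above, the registry contents would come from generated code rather than from hand-written classes, which is what makes adding a client in another language cheap.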
-
Re: A question about RPC Todd Lipcon 2011-10-11, 19:52
On Tue, Oct 11, 2011 at 12:44 PM, Stephan Gammeter <[EMAIL PROTECTED]> wrote:
> TL;DR
> .proto + protocol buffer plugins for generating rpc clients and servers is
> really handy. If writables or protobufs are faster needs to be benchmarked,
> but probably both serialize faster than one can write.

Looking at performance like this is a somewhat narrow view. Of course pretty much any serialization in any language can serialize faster than you can write to network. But if you take 50% of a CPU to fill the network vs 5% of a CPU to fill the network, that drastically impacts how much other work you can be getting done in other tasks at the same time.

Although many workloads are IO bound, many are not. For example, if you have a lot of RAM available for HBase, it will become CPU-bound pretty quick as you are primarily hitting cached data.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera |
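Todd's 50% versus 5% comparison is simple arithmetic, sketched below. All numbers are made up for illustration: the link speed of 125 MB/s (roughly 1 Gbit/s) and the two serializer throughputs are assumptions chosen so the result reproduces the percentages in his example; they are not measurements of Writables or protobufs.

```python
def cpu_fraction_to_fill_link(link_mb_per_s, serializer_mb_per_s_per_core):
    """Fraction of one core spent serializing in order to saturate the link."""
    return link_mb_per_s / serializer_mb_per_s_per_core


LINK = 125.0  # assumed link speed in MB/s, roughly 1 Gbit/s

# Hypothetical serializers: both are faster than the link,
# but they leave very different amounts of CPU for other work.
slow = cpu_fraction_to_fill_link(LINK, 250.0)
fast = cpu_fraction_to_fill_link(LINK, 2500.0)

print(f"slow serializer: {slow:.0%} of a core")
print(f"fast serializer: {fast:.0%} of a core")
```

Both serializers "serialize faster than one can write", yet the slower one leaves half a core less for everything else, which matters once the workload is CPU-bound.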