-
php to thrift vs java api
Jack Levin 2011-07-12, 05:22
For those who are interested, I did some load testing of Put and Get speeds using PHP -> Thrift Server -> HBase, and Java API Client -> HBase.
Writing and reading 5-10 byte cells (from cache) is 30 times faster using the Java API client. So I am going to assume that near-realtime applications like search will be better written against the Java API, since it takes a while for PHP to serialize data, send it out over the socket, and then for the Thrift server to talk to HBase.
Average read times per row were 0.5 ms with Java, and 15 ms (still fast!) with the PHP client.
I am thinking that Tomcat with a Java servlet that does a lot of work on the backend is the way to go. When we set it up, I will follow up with results. It should be just as fast, since the HTTP wrapper should not add significant latency: we are not doing multiple GETs, as most of the logic will be done on the backend.
-Jack
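For anyone who wants to reproduce per-row averages like the ones quoted above, here is a minimal timing sketch in Java. It is not the harness used for the numbers in this thread. The class name `LatencyBench`, the method `averageMillis`, and the in-memory map standing in for a cached table are all assumptions made for illustration; to measure a real client you would replace the lambda with your own Get or Put call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntConsumer;

// Hypothetical helper, not part of HBase or Thrift.
class LatencyBench {

    /** Runs op once per row index and returns the mean wall-clock time per call, in milliseconds. */
    public static double averageMillis(int rows, IntConsumer op) {
        if (rows <= 0) {
            throw new IllegalArgumentException("rows must be positive");
        }
        long start = System.nanoTime();
        for (int i = 0; i < rows; i++) {
            op.accept(i);
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / rows;
    }

    public static void main(String[] args) {
        // Stand-in for a cached table: a plain in-memory map, NOT HBase.
        Map<Integer, byte[]> fakeTable = new HashMap<>();
        for (int i = 0; i < 10_000; i++) {
            fakeTable.put(i, new byte[8]); // 8-byte cells, in the 5-10 byte range discussed
        }

        // Warm-up pass so JIT compilation does not dominate the measured pass.
        averageMillis(10_000, i -> fakeTable.get(i));

        double avg = averageMillis(10_000, i -> fakeTable.get(i));
        System.out.printf("average read: %.4f ms per row%n", avg);
    }
}
```

Timing the whole loop and dividing by the row count keeps the cost of `System.nanoTime()` out of each individual measurement, which matters when a single call takes well under a millisecond.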
-
Re: php to thrift vs java api
Jeff Whiting 2011-07-12, 16:10
Those are interesting results. Are you using the PHP Thrift extension? It is significantly faster at (de)serialization. You may want to grab the latest nightly build of Thrift, as it has quite a few bug fixes in the PHP Thrift extension.
~Jeff
-- Jeff Whiting Qualtrics Senior Software Engineer [EMAIL PROTECTED]
-
Re: php to thrift vs java api
Jack Levin 2011-07-16, 16:31
Yes, we are using the latest .so, but unfortunately it does not make any difference. I think this is just a matter of the language: PHP is stateless, whereas Java runs as a servlet inside the JVM with hot JARs. With PHP, even if IO to Thrift is not an issue in itself, a task such as a merge join of two arrays of 10000 elements each will take much, much longer than in Java, simply due to how PHP stores and accesses data structures in RAM.
-Jack
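To make the merge-join example above concrete, here is a small Java sketch of joining two sorted arrays of 10000 keys each. It only illustrates the kind of backend work being described and is not code from this thread. The class name `MergeJoinSketch`, the choice of `int` keys, and the assumption that each input is sorted ascending with no duplicate keys are all made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; assumes sorted, duplicate-free int keys.
class MergeJoinSketch {

    /** Returns the keys present in both inputs. Both arrays must be sorted ascending. */
    public static List<Integer> mergeJoin(int[] left, int[] right) {
        List<Integer> matches = new ArrayList<>();
        int i = 0;
        int j = 0;
        // Advance whichever side holds the smaller key; emit when the keys are equal.
        while (i < left.length && j < right.length) {
            if (left[i] < right[j]) {
                i++;
            } else if (left[i] > right[j]) {
                j++;
            } else {
                matches.add(left[i]);
                i++;
                j++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        int n = 10_000;
        int[] evens = new int[n];      // 0, 2, 4, ...
        int[] multiples3 = new int[n]; // 0, 3, 6, ...
        for (int k = 0; k < n; k++) {
            evens[k] = 2 * k;
            multiples3[k] = 3 * k;
        }
        List<Integer> joined = mergeJoin(evens, multiples3);
        System.out.println("matching keys: " + joined.size());
    }
}
```

Each input is walked once, so the join takes time linear in the combined length of the two arrays. The primitive `int[]` inputs are stored contiguously, which is one plausible reading of the point about how data structures are stored and accessed in RAM.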
-
Re: php to thrift vs java api
Jeff Whiting 2011-07-18, 15:39
I agree. PHP is a slow language, especially when it has to create any objects. PHP appears to be fast because so much of its code is actually in C extensions.
~Jeff