S Ahmed
2010-07-13, 04:36
Ryan Rawson
2010-07-13, 04:48
S Ahmed
2010-07-13, 09:47
Fred Zappert
2010-07-13, 12:09
S Ahmed
2010-07-13, 14:19
-
stumbleupon and hbase
S Ahmed 2010-07-13, 04:36
I realize that StumbleUpon uses HBase for su.pr and is currently using HBase for new functionality, but isn't necessarily going back and re-coding everything to fit into the HBase model. Having said that, do you guys think HBase could be used for things like:

1. When a user logs in, keeping the user session in HBase?
2. Pages like: http://www.stumbleupon.com/url/blogs.babble.com/family-kitchen/2010/06/16/hasselback-potatoes
   This involves all elements on the page. Would this be possible and, more importantly, would it make sense with HBase?
3. What sort of things/functionality do you see as NOT being suitable, in your experience?

Thanks for your insights!
-
Re: stumbleupon and hbase
Ryan Rawson 2010-07-13, 04:48
Hi,

To answer your question in broader terms:

1. Sessions are tricky; they tend to have a 1:1 read/write model. This would cause extreme version inflation in HBase without appropriate action to mitigate it. Generally speaking though, I believe in HBase as a high-performance, low-latency data store, so it should fit nicely with any system that requires those things.

2. If you can achieve the right level of performance in data retrieval, why not?

3. Very complex relational data might not be appropriate to store in HBase. For example, indexes in HBase are not free and take a fair amount of extra work to build and maintain.

Infrequently written but heavily read things might not make sense in HBase, i.e., things you might use a CDN for: software downloads, very large media. You might be able to use HBase as the source for things that are cached in a layer like Varnish or Akamai.

-ryan
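To make the "indexes are not free" point concrete, here is a minimal sketch in Python. It is a toy model, not the HBase client API: two plain dicts stand in for a primary table and a hand-maintained index table, and the names put_user, find_by_email, users and email_index are hypothetical. The assumption being illustrated is that a key-value store keyed by row only answers lookups by that key, so any other lookup path has to be written and kept in sync by the application.

```python
def put_user(users, email_index, user_id, email, name):
    """Write a user row and keep the hand-built email index in sync."""
    old = users.get(user_id)
    if old is not None and old["email"] != email:
        # The stale index entry has to be removed explicitly.
        email_index.pop(old["email"], None)
    users[user_id] = {"email": email, "name": name}  # write 1: primary row
    email_index[email] = user_id                     # write 2: index row


def find_by_email(users, email_index, email):
    """Look up by email: one read on the index, then one on the primary table."""
    user_id = email_index.get(email)
    return None if user_id is None else users[user_id]


users, email_index = {}, {}
put_user(users, email_index, "u1", "a@example.com", "Ann")
put_user(users, email_index, "u1", "ann@example.com", "Ann")  # email changed
print(find_by_email(users, email_index, "ann@example.com"))
print(find_by_email(users, email_index, "a@example.com"))
```

Every logical write becomes two or three physical writes, and every indexed read becomes two reads. In a real cluster the primary row and the index row are separate rows, and since HBase only guarantees atomicity within a single row, the application also has to cope with a failure between the two writes.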
-
Re: stumbleupon and hbase
S Ahmed 2010-07-13, 09:47
>> This would cause extreme version inflation in HBase without
>> appropriate action to mitigate it.

Can you please expand on this? What exactly can be done to mitigate it?
-
Re: stumbleupon and hbase
Fred Zappert 2010-07-13, 12:09
Ahmed,

HBase versions all rows in a keyed entry, up to a limit that you specify and manage through some form of garbage collection. A keyed entry is a set of columns or a column family.

So it's not well suited to a distributed session cache, for which other options are available. Ryan's point was that in the course of a session you would be creating lots of versions, i.e., the inflation.

For a distributed and persistent cache, look at the Resin app server, which is in fact used for session management in a cluster.

Fred
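The version inflation discussed above, and the two usual ways to bound it, can be sketched with a toy model in Python. This is not the HBase API: VersionedCellStore, simulate_session and the row/column names are hypothetical, and the store prunes on every write, whereas HBase discards surplus versions later, during compaction. The max_versions and ttl_seconds parameters are meant to mirror the per-column-family version limit and time-to-live that HBase lets you configure.

```python
from collections import defaultdict


class VersionedCellStore:
    """Toy in-memory model of a versioned cell store (not the HBase API)."""

    def __init__(self, max_versions=None, ttl_seconds=None):
        self.max_versions = max_versions  # None means "keep every version"
        self.ttl_seconds = ttl_seconds    # None means "never expire"
        # (row, column) -> list of (timestamp, value), newest first
        self._cells = defaultdict(list)

    def put(self, row, column, value, timestamp):
        versions = self._cells[(row, column)]
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)
        self._prune(versions, now=timestamp)

    def _prune(self, versions, now):
        if self.ttl_seconds is not None:
            # Drop versions older than the time-to-live.
            versions[:] = [tv for tv in versions if now - tv[0] <= self.ttl_seconds]
        if self.max_versions is not None:
            # Keep only the newest max_versions entries.
            del versions[self.max_versions:]

    def get(self, row, column):
        versions = self._cells.get((row, column), [])
        return versions[0][1] if versions else None

    def version_count(self, row, column):
        return len(self._cells.get((row, column), []))


def simulate_session(store, requests=1000):
    """One write per page request, one request per second: the 1:1 pattern."""
    for t in range(requests):
        store.put("session:abc", "s:last_seen", t, timestamp=t)
    return store.version_count("session:abc", "s:last_seen")


unbounded = simulate_session(VersionedCellStore())
capped = simulate_session(VersionedCellStore(max_versions=1))
expiring = simulate_session(VersionedCellStore(ttl_seconds=60))
print(unbounded, capped, expiring)  # prints: 1000 1 61
```

With no limit, a single session cell accumulates one version per request. Capping the version count at 1 or setting a short time-to-live keeps the stored history small, at the cost of losing older values, which a session record presumably does not need.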
-
Re: stumbleupon and hbase
S Ahmed 2010-07-13, 14:19
Just to be clear, when I say 'session' I mean a GUID + timestamp + userID value stored in the system.

So whenever someone requests a page that requires authentication, the application compares the GUID stored in the user's cookie with the value in HBase. If the GUID matches and the timestamp hasn't expired, the user is considered authenticated.
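The session check described above might look like the following minimal Python sketch. It rests on several assumptions: a plain dict stands in for an HBase table keyed by GUID, and create_session, authenticate and the 30-minute SESSION_LIFETIME_SECONDS are hypothetical names and values chosen for illustration.

```python
import time
import uuid

SESSION_LIFETIME_SECONDS = 30 * 60  # hypothetical 30-minute expiry


def create_session(store, user_id, now=None):
    """On login: store GUID -> (user ID, expiry) and return the GUID for the cookie."""
    now = time.time() if now is None else now
    guid = uuid.uuid4().hex
    store[guid] = {"user_id": user_id, "expires_at": now + SESSION_LIFETIME_SECONDS}
    return guid


def authenticate(store, cookie_guid, now=None):
    """On each request: return the user ID if the GUID is known and not expired."""
    now = time.time() if now is None else now
    record = store.get(cookie_guid)
    if record is None:
        return None  # unknown GUID
    if now >= record["expires_at"]:
        return None  # session expired
    return record["user_id"]


sessions = {}
guid = create_session(sessions, "user-42", now=1000.0)
print(authenticate(sessions, guid, now=1060.0))
print(authenticate(sessions, guid, now=1000.0 + SESSION_LIFETIME_SECONDS))
```

In this form the record is written once at login and only read afterwards, so it produces few versions. If the timestamp were instead refreshed on every request (a sliding expiry), each page view would become a write, which is the 1:1 read/write pattern raised earlier in the thread.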