HBase >> mail # user >> stumbleupon and hbase


Re: stumbleupon and hbase
Just to be clear, when I say 'session' I mean a GUID + timestamp + userID
value stored in the system.  So whenever someone requests a page that
requires authentication, the GUID value stored in the user's cookie is
compared with the value stored in HBase. If the GUID matches and the
timestamp hasn't expired, the user is considered authenticated.
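
To make that check concrete, here is a minimal sketch of what I mean. It is
only an illustration: the dict stands in for an HBase table keyed by GUID, and
the names (sessions, create_session, authenticate, SESSION_TTL_SECONDS) and the
30 minute lifetime are my own assumptions, not anything from an existing system.

```python
import time
import uuid

# Assumed session lifetime; the real value is an application decision.
SESSION_TTL_SECONDS = 30 * 60

# Stand-in for an HBase table keyed by GUID. A real deployment would
# issue a Get against the table instead of a dict lookup.
sessions = {}


def create_session(user_id, now=None):
    """Store GUID -> (userID, timestamp) and return the GUID for the cookie."""
    now = time.time() if now is None else now
    guid = uuid.uuid4().hex
    sessions[guid] = {"user_id": user_id, "timestamp": now}
    return guid


def authenticate(cookie_guid, now=None):
    """Return the user ID if the GUID matches an unexpired session, else None."""
    now = time.time() if now is None else now
    record = sessions.get(cookie_guid)
    if record is None:
        return None  # no stored session for this GUID
    if now - record["timestamp"] > SESSION_TTL_SECONDS:
        return None  # GUID matched, but the session has expired
    return record["user_id"]
```

Note that in this sketch a page request is a pure read. The write-per-read
pattern only appears if the timestamp is refreshed on every request.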

On Tue, Jul 13, 2010 at 8:09 AM, Fred Zappert <[EMAIL PROTECTED]> wrote:

> Ahmed,
>
> HBase versions all rows in a keyed entry, up to a limit that you specify and
> manage through some form of garbage collection.
>
> A keyed entry is a set of columns or column family.  So, it's not suitable
> for a distributed session cache, where there are other options available.
> Ryan's point was that in the course of a session, you would be creating lots
> of versions, i.e., the inflation.
>
> For a distributed and persistent cache, look at the Resin app server, which
> is in fact used for session management in a cluster.
>
> Fred
>
> On Tue, Jul 13, 2010 at 2:47 AM, S Ahmed <[EMAIL PROTECTED]> wrote:
>
> > >>This would cause extreme version inflation in HBase without
> > >>appropriate action to mitigate it.
> > Can you please expand on this?  What exactly can be done to mitigate this?
> >
> >
> > On Tue, Jul 13, 2010 at 12:48 AM, Ryan Rawson <[EMAIL PROTECTED]> wrote:
> >
> > > Hi,
> > >
> > > To answer your question in broader terms:
> > >
> > > 1. Sessions are tricky; they tend to have a 1:1 read/write model.
> > > This would cause extreme version inflation in HBase without
> > > appropriate action to mitigate it.  Generally speaking though, I
> > > believe in HBase as a high performance low latency data store, so it
> > > should fit nicely with any system that requires those things.
> > >
> > > 2. If you can achieve the right level of performance in data retrieval,
> > > why not?
> > >
> > > 3. Very complex relational-based data might not be appropriate to
> > > store in HBase. For example, indexes in HBase are not free and require
> > > a bunch of things to make them happen.
> > >
> > > Infrequently written but heavily read things might not make sense in
> > > HBase, i.e., things you might use a CDN for: software downloads, very
> > > large media.  You might be able to use HBase as the source for things
> > > that are cached in a layer like varnish or akamai.
> > >
> > > -ryan
> > >
> > > On Mon, Jul 12, 2010 at 9:36 PM, S Ahmed <[EMAIL PROTECTED]> wrote:
> > > > I realize that stumbleupon uses hbase for su.pr, and is currently using
> > > > hbase for new functionality but isn't necessarily going back and re-coding
> > > > everything to fit into the hbase model.
> > > >
> > > > Having said that, do you guys think hbase could very well be used for things
> > > > like:
> > > >
> > > > 1. when a user logs in, keep the user session in hbase?
> > > > 2. for pages like:
> > > >
> > > > http://www.stumbleupon.com/url/blogs.babble.com/family-kitchen/2010/06/16/hasselback-potatoes
> > > >
> > > > So this involves all elements on the page, would this be possible and more
> > > > importantly make sense with hbase?
> > > >
> > > > 3.  What sort of things/functionality do you see NOT being suitable in your
> > > > experiences?
> > > >
> > > > Thanks for your insights!
> > > >
> > >
> >
>
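
For anyone reading the thread later, the "version inflation" point and the
version limit Fred mentions can be illustrated with a toy model. This is a
sketch under stated assumptions, not HBase code: VersionedStore, put, get and
version_count are hypothetical names, and the store prunes on every write for
simplicity. In HBase itself, as far as I understand it, the limit is a column
family setting and excess versions are physically removed during compaction
rather than at write time.

```python
from collections import defaultdict


class VersionedStore:
    """Toy store keeping timestamped versions per (row, column) cell."""

    def __init__(self, max_versions):
        self.max_versions = max_versions
        # (row, column) -> list of (timestamp, value), newest first
        self._cells = defaultdict(list)

    def put(self, row, column, value, ts):
        versions = self._cells[(row, column)]
        versions.append((ts, value))
        versions.sort(key=lambda v: v[0], reverse=True)
        # Drop everything beyond the configured limit (oldest versions).
        del versions[self.max_versions:]

    def get(self, row, column):
        """Return the newest value for the cell, or None if it was never written."""
        versions = self._cells.get((row, column))
        return versions[0][1] if versions else None

    def version_count(self, row, column):
        return len(self._cells.get((row, column), []))
```

If a session timestamp is rewritten on every page view, a store built with a
large max_versions accumulates one version per request, while one built with
max_versions=1 keeps only the latest value for the cell.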