HBase, mail # user - question about multi-transaction queries


RE: question about multi-transaction queries
Jonathan Gray 2010-12-17, 21:21
I'm not sure exactly what your requirements are, but what exactly is your client interface?  Is there no persistent process anywhere serving client requests?
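
For readers following the thread: the idea raised in the quoted messages below (one long-running JVM that accepts a batch of row keys over TCP and answers in a single round trip) could look roughly like the sketch that follows. Everything in it is an assumption made for illustration: the class name `MultiGetServer`, the one-line comma-separated wire format, port 9090, and the in-memory map that stands in for an HBase table. In a real service the lookup inside `handleLine` would be replaced by one batched read through the HBase Java client (as far as I know, `get(List<Get>)` on the table class); that call is left out so the sketch runs without a cluster.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical long-running multi-get service; the map stands in for an HBase table. */
public class MultiGetServer {
    // Stand-in for the real store (assumption). A real service would issue
    // one batched read against HBase here instead of map lookups.
    private final Map<String, String> store;

    public MultiGetServer(Map<String, String> store) {
        this.store = new ConcurrentHashMap<>(store);
    }

    /** Answers one request line: comma-separated row keys in, comma-separated values out. */
    public String handleLine(String line) {
        List<String> values = new ArrayList<>();
        for (String key : line.split(",")) {
            // A missing row yields an empty field so positions still line up with the keys.
            values.add(store.getOrDefault(key.trim(), ""));
        }
        return String.join(",", values);
    }

    /** Accepts connections until the process is stopped; the JVM is started once, not per query. */
    public void serve(int port) throws IOException {
        try (ServerSocket listener = new ServerSocket(port)) {
            while (true) {
                Socket client = listener.accept();
                new Thread(() -> handleClient(client)).start();
            }
        }
    }

    // Each connection may send many request lines; each line is answered with one line.
    private void handleClient(Socket client) {
        try (Socket c = client;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(c.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(c.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(handleLine(line));
            }
        } catch (IOException e) {
            // The client went away; there is nothing else to clean up in this sketch.
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> demo = Map.of("row1", "a", "row2", "b");
        new MultiGetServer(demo).serve(9090); // port is an arbitrary choice
    }
}
```

A frontend would keep one socket open and send a line such as `row1,row2` per page render. The toy format breaks if values contain commas or newlines, so a real protocol would need proper framing.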

> -----Original Message-----
> From: Jack Levin [mailto:[EMAIL PROTECTED]]
> Sent: Friday, December 17, 2010 12:44 PM
> To: [EMAIL PROTECTED]
> Subject: Re: question about multi-transaction queries
>
> Do you happen to know if anyone has written or is using something like that
> as open source? I would imagine this being super useful.  There is a question
> of interface too; I assume it would be TCP.  Is there some sort of Jetty plugin
> available?  Now I somewhat realize that I am just describing the existing REST
> interface, but AFAIK it does not support multi-get.
>
> -Jack
>
> On Fri, Dec 17, 2010 at 11:57 AM, Jonathan Gray <[EMAIL PROTECTED]> wrote:
> > Yes, some kind of running JVM.  I would not recommend starting a JVM
> > for each query :)
> >
> >> -----Original Message-----
> >> From: Jack Levin [mailto:[EMAIL PROTECTED]]
> >> Sent: Friday, December 17, 2010 11:28 AM
> >> To: [EMAIL PROTECTED]
> >> Subject: Re: question about multi-transaction queries
> >>
> >> Ok, does it mean though we would incur Java startup cost?  Or do you
> >> propose we write some sort of java server that has the JVM running
> >> and is able to get multi-get queries?
> >>
> >> Thanks.
> >>
> >> -Jack
> >>
> >> On Fri, Dec 17, 2010 at 11:15 AM, Jonathan Gray <[EMAIL PROTECTED]> wrote:
> >> > All of my experience doing something like this was with straight Java.
> >> >
> >> > There are MultiGet and MultiPut capabilities in the Java client
> >> > that will help you out significantly.
> >> >
> >> > I played with Jython and HBase a couple years ago and back then the
> >> > performance was horrible.  I never looked back but I have no idea if
> >> > it's gotten better in the meantime.
> >> >
> >> > JG
> >> >
> >> >> -----Original Message-----
> >> >> From: Jack Levin [mailto:[EMAIL PROTECTED]]
> >> >> Sent: Friday, December 17, 2010 11:01 AM
> >> >> To: [EMAIL PROTECTED]
> >> >> Subject: Re: question about multi-transaction queries
> >> >>
> >> >> Let's just say it's one row key with two columns.  Non-contiguous
> >> >> records.  We want to read as fast as possible.  So we did some
> >> >> tests, and with MongoDB the random reads of 1000 records take about
> >> >> 80ms, while HBase with Jython takes 400ms or so.
> >> >> The question is, as we develop our applications, what is the best
> >> >> method to retrieve many rows as fast as possible?  We are talking
> >> >> about 1 client here, not many clients.  For many clients, REST seems
> >> >> to be appropriate, but here we have a frontend server rendering
> >> >> content quickly and we need to reduce the query overhead for HBase
> >> >> and get data fast.
> >> >>
> >> >> -Jack
> >> >>
> >> >> On Sat, Dec 11, 2010 at 10:55 AM, Stack <[EMAIL PROTECTED]> wrote:
> >> >> > How many columns?  It's columns, right, and not column families?
> >> >> >
> >> >> > Are the 1k rows contiguous?  Can you Scan?  For insert of 1k
> >> >> > rows, you know how to do that now, right?  Will they be
> >> >> > substantial rows -- 10s to 100s of ks -- or just small?  Do you
> >> >> > have multiput available in the REST interface?  I don't recall.
> >> >> >
> >> >> > Try REST since you know that interface.  Jython might be faster,
> >> >> > though a test done more than a year ago had Jython as slow
> >> >> > (http://ryantwopointoh.blogspot.com/2009/01/performance-of-hbase-importing.html),
> >> >> > but a bunch has changed since then -- HBase-wise, and
> >> >> > Jython has probably gotten a lot better.  If you go the Jython
> >> >> > route, make sure you keep the interpreter afloat rather than
> >> >> > launching it per request (so yes, fastcgi would make sense).
> >> >> >
> >> >> > St.Ack
> >> >> >
> >> >> > On Fri, Dec 10, 2010 at 9:59 PM, Jack Levin <[EMAIL PROTECTED]> wrote:
> >> >> >> Hello.   We plan to run a set of queries on tables with
> >> >> >> multiple columns.  What is the most efficient method to say,