HBase >> mail # dev >> keyvalue cache


Re: keyvalue cache
random afterthought: a delete-row command could cause problems for a simple
HashSet style cache, so you would need a sorted implementation or a more
elaborate nested structure.
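
To make that concrete, here is a minimal Python sketch of the sorted variant.
It is not HBase code: the class name SortedKVCache and its methods are made up
for illustration, and it assumes row keys, families and qualifiers are plain
comparable strings. Entries are keyed on (row, family, qualifier) without the
timestamp, so a put overwrites the previous version, and because the keys are
kept sorted, all cells of a row sit next to each other and a delete-row can
drop them as one range.

```python
import bisect


class SortedKVCache:
    """Toy point-get cache keyed on (row, family, qualifier), no timestamp."""

    def __init__(self):
        self._keys = []    # sorted list of (row, family, qualifier) tuples
        self._values = {}  # key tuple -> latest cached value

    def put(self, row, family, qualifier, value):
        key = (row, family, qualifier)
        if key not in self._values:
            bisect.insort(self._keys, key)  # keep the key list sorted
        self._values[key] = value           # overwrite any previous version

    def get(self, row, family, qualifier):
        # Point get only: every coordinate must be specified.
        return self._values.get((row, family, qualifier))

    def delete_row(self, row):
        # (row,) sorts before every (row, family, qualifier) tuple, so this
        # finds the first cached cell of the row, if there is one.
        lo = bisect.bisect_left(self._keys, (row,))
        hi = lo
        while hi < len(self._keys) and self._keys[hi][0] == row:
            hi += 1
        for key in self._keys[lo:hi]:
            del self._values[key]
        del self._keys[lo:hi]
        return hi - lo  # number of cells evicted
```

With a plain hash-based set there is no such range: a delete-row would have to
walk every cached key to find the ones belonging to that row, or else leave
stale cells behind.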
On Wed, Apr 4, 2012 at 3:55 PM, Enis Söztutar <[EMAIL PROTECTED]> wrote:

> I think you are right that if you replicate the row MVCC semantics in the
> cache, then you can get a consistent view. I was referring to a more
> client-side approach.
>
> I guess the takeaways are:
>  - forget about scans, and shoot for point gets, which I agree with
>  - per-kv cache overhead might be huge, but it is still worth trying out.
>  - can also be architected on top of coprocessors
>  - might be complex to implement, but some use cases would still benefit a lot.
>  - row cache / family cache / kv cache
>
> Thanks for all the input!
>
> Enis
>
> On Wed, Apr 4, 2012 at 3:28 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
>
> > A client-side memcached setup can stay pretty consistent if you send all of
> > your puts and deletes through it before sending them to hbase, but yeah, I
> > guess you lose strict consistency under heavy read/write from multiple
> > simultaneous clients.  But, like Andy is saying, if you route the requests
> > through the regionserver and it talks to memcached/hazelcast, couldn't that
> > be fully consistent?
> >
> >
> > On Wed, Apr 4, 2012 at 3:09 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
> >
> > > I thought about trying this out once with a coprocessor, hooking the Gets,
> > > with an embedded Hazelcast. That would just be a proof of concept. The idea
> > > is to scale the KV cache independent of regionserver limits (maybe we're
> > > only giving 1 GB per RS to the value cache and a 10 GB region is hot) and
> > > the next step could be modifying the client to spread read load over
> > > replicas (HBASE-2357). This doesn't consider scans either.
> > >
> > >
> > > Best regards,
> > >
> > >     - Andy
> > >
> > >
> > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > > (via Tom White)
> > >
> > >
> > >
> > > >________________________________
> > > > From: Matt Corgan <[EMAIL PROTECTED]>
> > > >To: [EMAIL PROTECTED]
> > > >Sent: Wednesday, April 4, 2012 2:46 PM
> > > >Subject: Re: keyvalue cache
> > > >
> > > > >It could act like a HashSet of KeyValues keyed on the
> > > > >rowKey+family+qualifier but not including the timestamp.  As writes come in
> > > > >it would evict or overwrite previous versions (read-through vs
> > > > >write-through).  It would only service point queries where the
> > > > >row+fam+qualifier are specified, returning the latest version.  Wouldn't be
> > > > >able to do a typical rowKey-only Get (scan behind the scenes) because it
> > > > >wouldn't know if it contained all the cells in the row, but if you could
> > > > >specify all your row's qualifiers up-front it could work.
> > > >
> > > >
> > > > >On Wed, Apr 4, 2012 at 2:30 PM, Vladimir Rodionov <[EMAIL PROTECTED]> wrote:
> > > >
> > > > >> 1. 2KB can be too large for some applications. For example, some of our
> > > > >> k-v sizes are < 100 bytes combined.
> > > > >> 2. These tables (from 1.) do not benefit from block cache at all (we did
> > > > >> not try 100 B block size yet :)
> > > > >> 3. And Matt is absolutely right: small block size is expensive
> > > > >>
> > > > >> How about doing point queries on the K-V cache and bypassing the K-V cache
> > > > >> on all Scans (when someone really needs this)?
> > > > >> Implement the K-V cache as a coprocessor application?
> > > > >>
> > > > >> Invalidation of a K-V entry is not necessary if all upsert operations go
> > > > >> through the K-V cache first, if it sits in front of the MemStore.
> > > > >> There will be no "stale or invalid" data situation in this case. Correct?
> > > > >> No need for data to be sorted and no need for data to be merged
> > > > >> into a scan (we do not use K-V cache for Scans)
> > > >>
> > > >>
> > > >> Best regards,
> > > >> Vladimir Rodionov
> > > >> Principal Platform Engineer