RE: incrementColumnValue, WAL and timestamp
I think there are some hacky fixes in there to prevent duplicate timestamps.

With the new seqid, this should not be an issue.  I think we should then be able to generate a single now() value and use it everywhere (MemStore, HLog).  Duplicate timestamps won't be a problem, as the seqid ordering will take care of it, and we can still use the same logic to clean up old values.
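
A minimal sketch of that single-timestamp path, reusing the names from the quoted code below (row, family, qualifier, result, writeToWAL, regionInfo, this.log, store); the store.add(newKv) call at the end is an assumption standing in for whatever writes a prepared KeyValue straight into the memstore:

    // One now() value shared by the memstore KeyValue and the HLog append.
    // With the hbase-2856 seqid, duplicate timestamps are ordered by
    // sequence number instead of by bumping the timestamp.
    long now = EnvironmentEdgeManager.currentTimeMillis();
    KeyValue newKv = new KeyValue(row, family, qualifier, now,
        Bytes.toBytes(result));

    if (writeToWAL) {
      WALEdit walEdit = new WALEdit();
      walEdit.add(newKv);
      this.log.append(regionInfo, regionInfo.getTableDesc().getName(),
          walEdit, now);
    }

    // Assumed memstore write: put the identical KeyValue into the memstore
    // rather than letting updateColumnValue pick a timestamp from what is
    // already there.
    long size = store.add(newKv);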

A separate issue is how "atomic" we want counters to be.  Currently the multi-column Increment operation does not utilize any of the RWCC stuff.  I guess here we should just use seqid now?  Is it one seqid per KV or per "row transaction"?
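
For reference, a hedged example of the client-side multi-column Increment in question; the HTable named table and the byte[] row/family variables are assumptions, not from the thread:

    // One multi-column Increment on a single row.  "One seqid per row
    // transaction" would mean a reader sees both counters move together;
    // "one seqid per KV" could expose a state where only one has moved.
    Increment inc = new Increment(row);
    inc.addColumn(family, Bytes.toBytes("hits"), 1);
    inc.addColumn(family, Bytes.toBytes("bytes"), 512);
    table.increment(inc);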

JG

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Stack
> Sent: Wednesday, January 12, 2011 11:54 AM
> To: HBase Dev List
> Subject: incrementColumnValue, WAL and timestamp
>
> This is how incrementColumnValue looks currently.  I have a few questions
> on it (I'm trying to add a sequence number to KeyValue -- hbase-2856).
>
>   public long incrementColumnValue(byte [] row, byte [] family,
>       byte [] qualifier, long amount, boolean writeToWAL)
>   throws IOException {
>     ...
>     try {
>       ...
>         // Get the old value:
>         Get get = new Get(row);
>         get.addColumn(family, qualifier);
>
> ..
>         // build the KeyValue now:
>         KeyValue newKv = new KeyValue(row, family,
>             qualifier, EnvironmentEdgeManager.currentTimeMillis(),
>             Bytes.toBytes(result));
>
>         // Now log it:
>         if (writeToWAL) {
>           long now = EnvironmentEdgeManager.currentTimeMillis();
>           WALEdit walEdit = new WALEdit();
>           walEdit.add(newKv);
>           this.log.append(regionInfo, regionInfo.getTableDesc().getName(),
>             walEdit, now);
>         }
>
>         // Now request the ICV to the store, this will set the timestamp
>         // appropriately depending on if there is a value in memcache or not.
>         // returns the change in the size of the memstore from operation
>         long size = store.updateColumnValue(row, family, qualifier, result);
>
> ....
>       } finally {
>         releaseRowLock(lid);
>       }
>     } finally {
>       closeRegionOperation();
>     }
>
> ....
>
>     return result;
>   }
>
> Ignoring stuff like the double
> EnvironmentEdgeManager.currentTimeMillis() call -- I'll fix that -- the thing
> that is of interest to me is the allowance that what is written to the WAL is
> not necessarily the edit that makes it out to the persisted store file,
> because the timestamp for the ICV is set later, up in the memstore, and may
> not be 'now' but rather the largest timestamp currently in the memstore.
>
> This seems off to me.  What do you lot see as repercussions of my writing
> the newKv we made above into the memstore?  Are increments going to come in
> out of order -- is that what we're trying to protect against?  Or are we
> trying to protect against some errant edit that has an inflated
> timestamp up in memstore?
>
> Thanks,
> St.Ack
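
A hedged illustration of the divergence described above, going only by Stack's reading of updateColumnValue (largestTsInMemstore is a hypothetical stand-in for the biggest timestamp already stored for that column):

    // Timestamp stamped into the WAL edit:
    long walTimestamp = now;
    // Timestamp the memstore may end up using for the ICV, per the comment
    // in the quoted code ("depending on if there is a value in memcache"):
    long memstoreTimestamp = Math.max(now, largestTsInMemstore);
    // Whenever largestTsInMemstore > now, the edit that reaches the store
    // file carries a different timestamp than the edit that was logged,
    // which is the allowance being questioned.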