Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Plain View
HBase >> mail # dev >> IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic question


+
Jean-Marc Spaggiari 2013-08-09, 23:00
+
Ted Yu 2013-08-09, 23:03
+
Stack 2013-08-11, 05:36
Copy link to this message
-
Re: IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic question
File a jira, JMS?
Looks like to simple fix, can pull into 0.94.11 (which I'll spin early next week, promised).
-- Lars

________________________________
 From: Stack <[EMAIL PROTECTED]>
To: HBase Dev List <[EMAIL PROTECTED]>
Sent: Saturday, August 10, 2013 10:36 PM
Subject: Re: IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() logic question
 

In Store.canSplit, we check for references.  If references, we should not
split a region.

  public boolean canSplit() {

    this.lock.readLock().lock();

    try {

      // Not split-able if we find a reference store file present in the
store.

      boolean result = !hasReferences();

      if (!result && LOG.isDebugEnabled()) {

        LOG.debug("Cannot split region due to reference files being there");

      }

      return result;

    } finally {

      this.lock.readLock().unlock();

    }

  }
So, yeah, we should keep going through stores in case one is unsplittlable.

Good on you JMS,

St.Ack
On Fri, Aug 9, 2013 at 4:00 PM, Jean-Marc Spaggiari <[EMAIL PROTECTED]
> wrote:

> Hi,
>
> In IncreasingToUpperBoundRegionSplitPolicy, w.e loop over all the stores
> and we look at 2 things.
>
> First, if the store can't be split, then we return immediatly with false
> without checking the other stores. Then if the store can be split and is
> big enought, we return immediatly (break) with true.
>
> The logic seems to be wrong to be. In the 2nd case, should we not continue
> to loop over the other stores and see if one can't be split? Else we will
> trigger a split on a region where some stores can't be split. I will sply
> remove the "break".
>
> Does anyone have another opinion on that? Else I will open a JIRA and send
> a patch...
>
> JM
>
>   @Override
>   protected boolean shouldSplit() {
>     if (region.shouldForceSplit()) return true;
>     boolean foundABigStore = false;
>     // Get count of regions that have the same common table as this.region
>     int tableRegionsCount = getCountOfCommonTableRegions();
>     // Get size to check
>     long sizeToCheck = getSizeToCheck(tableRegionsCount);
>     LOG.debug("sizeToCheck = " + sizeToCheck);
>
>     for (Store store : region.getStores().values()) {
>       // If any of the stores is unable to split (eg they contain reference
> files)
>       // then don't split
>       if ((!store.canSplit())) {
>         return false;
>       }
>
>       // Mark if any store is big enough
>       long size = store.getSize();
>       if (size > sizeToCheck) {
>         LOG.debug("ShouldSplit because " + store.getColumnFamilyName() +
>           " size=" + size + ", sizeToCheck=" + sizeToCheck +
>           ", regionsWithCommonTable=" + tableRegionsCount);
>         foundABigStore = true;
>         break;
>       }
>     }
>
>     return foundABigStore;
>   }
>
+
Jean-Marc Spaggiari 2013-08-11, 16:04
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB