Re: why HTableDescriptor.getFamiliesKeys is so lag?
Hi Jack,

From the code...

// method 1 will call
  /**
   * Returns an array of all the {@link HColumnDescriptor} of the column
   * families of the table.
   *
   * @return Array of all the HColumnDescriptors of the current table
   *
   * @see #getFamilies()
   */
  public HColumnDescriptor[] getColumnFamilies() {
    return getFamilies().toArray(new HColumnDescriptor[0]);
  }

where getFamilies() returns
Collections.unmodifiableCollection(this.families.values());

// method 2 will call
  /**
   * Returns all the column family names of the current table. The map of
   * HTableDescriptor contains mapping of family name to HColumnDescriptors.
   * This returns all the keys of the family map which represents the column
   * family names of the table.
   *
   * @return Immutable sorted set of the keys of the families.
   */
  public Set<byte[]> getFamiliesKeys() {
    return Collections.unmodifiableSet(this.families.keySet());
  }

// method 3 will call
  /**
   * Returns an unmodifiable collection of all the {@link HColumnDescriptor}
   * of all the column families of the table.
   *
   * @return Immutable collection of {@link HColumnDescriptor} of all the
   * column families.
   */
  public Collection<HColumnDescriptor> getFamilies() {
    return Collections.unmodifiableCollection(this.families.values());
  }
So method 1 and 3 are almost the same thing. 1 is a wrapper around 3.
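
To make the "wrapper" relationship concrete, here is a minimal sketch that mirrors the two methods on a plain TreeMap. The class WrapperVsView and the String values standing in for HColumnDescriptor are assumptions for illustration, not HBase code. It suggests that the only extra work in method 1 is copying the view into a new array.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.TreeMap;

// Hypothetical stand-in for HTableDescriptor: Strings replace HColumnDescriptor.
class WrapperVsView {
    private final TreeMap<String, String> families = new TreeMap<>();

    WrapperVsView() {
        families.put("cf1", "descriptor-1");
        families.put("cf2", "descriptor-2");
    }

    // Mirrors method 3: a read-only view backed by the map.
    Collection<String> getFamilies() {
        return Collections.unmodifiableCollection(families.values());
    }

    // Mirrors method 1: copies the view into a fresh array.
    String[] getColumnFamilies() {
        return getFamilies().toArray(new String[0]);
    }

    void addFamily(String name, String descriptor) {
        families.put(name, descriptor);
    }

    public static void main(String[] args) {
        WrapperVsView table = new WrapperVsView();
        Collection<String> view = table.getFamilies();
        String[] snapshot = table.getColumnFamilies();

        table.addFamily("cf3", "descriptor-3");

        System.out.println(view.size());     // 3: the view follows the map
        System.out.println(snapshot.length); // 2: the array is a copy
    }
}
```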

So let's look at the difference between 2 and 3. They both do almost the
same thing, but one wraps keySet() and the other wraps values(). Both
of them call those methods on families, which is a TreeMap. So it sounds
like TreeMap.values() is faster than TreeMap.keySet().

Looking into the TreeMap code (we are no longer in HBase here):
    public Collection<V> values() {
        Collection<V> vs = values;
        return (vs != null) ? vs : (values = new Values());
    }
values() will just return the internal values object if it exists (which is
most probably the case), while keySet() does almost the same thing but
has to call another method too:

    /**
     * Returns a {@link Set} view of the keys contained in this map.
     * The set's iterator returns the keys in ascending order.
     * The set is backed by the map, so changes to the map are
     * reflected in the set, and vice-versa.  If the map is modified
     * while an iteration over the set is in progress (except through
     * the iterator's own <tt>remove</tt> operation), the results of
     * the iteration are undefined.  The set supports element removal,
     * which removes the corresponding mapping from the map, via the
     * <tt>Iterator.remove</tt>, <tt>Set.remove</tt>,
     * <tt>removeAll</tt>, <tt>retainAll</tt>, and <tt>clear</tt>
     * operations.  It does not support the <tt>add</tt> or <tt>addAll</tt>
     * operations.
     */
    public Set<K> keySet() {
        return navigableKeySet();
    }

    /**
     * @since 1.6
     */
    public NavigableSet<K> navigableKeySet() {
        KeySet<K> nks = navigableKeySet;
        return (nks != null) ? nks : (navigableKeySet = new KeySet(this));
    }
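
As a rough illustration of what the two snippets above do, the sketch below calls keySet() and values() twice on a plain TreeMap and compares the returned references. The class name TreeMapViewDemo and the sample data are assumptions. Returning the same cached instance is an implementation detail of the JDK code quoted above, not something the Map contract promises, so the identity results may differ on other JDKs. That the views are backed by the map is documented behaviour.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeMap;

class TreeMapViewDemo {
    /** Builds a small map standing in for the family map (hypothetical data). */
    static TreeMap<String, String> sampleFamilies() {
        TreeMap<String, String> families = new TreeMap<>();
        families.put("cf1", "descriptor-1");
        families.put("cf2", "descriptor-2");
        return families;
    }

    public static void main(String[] args) {
        TreeMap<String, String> families = sampleFamilies();

        // The first call creates the view object; later calls can reuse it.
        Set<String> keys1 = families.keySet();
        Set<String> keys2 = families.keySet();
        Collection<String> vals1 = families.values();
        Collection<String> vals2 = families.values();

        System.out.println("same keySet view: " + (keys1 == keys2));
        System.out.println("same values view: " + (vals1 == vals2));

        // Both views are backed by the map, so later changes show through.
        families.put("cf3", "descriptor-3");
        System.out.println("keys after put: " + keys1);
    }
}
```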
So now there are two possibilities:

1) If you run each of your methods twice, the second time they will most
probably all be equally fast.
2) The navigableKeySet() call from keySet() really costs 100 ms, which would
surprise me, since I would expect the compiler to optimize that call away.
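
One way to check the first possibility is to time each call several times in the same JVM. The sketch below does this on a plain TreeMap; the class FirstCallTiming and the helper timeNanos are hypothetical names, and System.nanoTime() around a single call is only a crude measurement (class loading and JIT warm-up dominate the first run), so treat the numbers as indicative. A harness such as JMH would give more reliable results.

```java
import java.util.TreeMap;
import java.util.function.Supplier;

class FirstCallTiming {
    /** Runs the call once and returns the elapsed time in nanoseconds. */
    static long timeNanos(Supplier<?> call) {
        long start = System.nanoTime();
        call.get();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        TreeMap<String, String> families = new TreeMap<>();
        families.put("cf1", "descriptor-1");

        // Repeat the measurement so the first (cold) run can be compared
        // with later (warm) runs.
        for (int run = 1; run <= 3; run++) {
            long keys = timeNanos(families::keySet);
            long values = timeNanos(families::values);
            System.out.printf("run %d: keySet=%d ns, values=%d ns%n", run, keys, values);
        }
    }
}
```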

Lastly, I'm not sure why those 100 ms are important to you, but if it is
because you need to call this method many times, then just cache the
result on the client side.
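
A minimal sketch of such a client-side cache, assuming the family names can be represented as a Set<String> keyed by table name. FamilyNameCache and its loader function are hypothetical; in a real client the loader would wrap whatever HBase call fetches the descriptor. The cache never expires entries on its own, so invalidate() would have to be called after a schema change.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper: remembers the family names per table after the first lookup.
class FamilyNameCache {
    private final Map<String, Set<String>> cache = new ConcurrentHashMap<>();
    private final Function<String, Set<String>> loader;

    /** @param loader stands in for the expensive lookup (an assumption, not an HBase API) */
    FamilyNameCache(Function<String, Set<String>> loader) {
        this.loader = loader;
    }

    /** Returns the cached names, invoking the loader only on a cache miss. */
    Set<String> familiesOf(String tableName) {
        return cache.computeIfAbsent(tableName, loader);
    }

    /** Drops a cached entry, for example after the table schema changes. */
    void invalidate(String tableName) {
        cache.remove(tableName);
    }

    public static void main(String[] args) {
        FamilyNameCache cache = new FamilyNameCache(table -> {
            System.out.println("loading families for " + table);
            return new TreeSet<>(Arrays.asList("cf1", "cf2"));
        });

        // The loader message is printed once; the second call hits the cache.
        System.out.println(cache.familiesOf("mytable"));
        System.out.println(cache.familiesOf("mytable"));
    }
}
```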

HTH.

JM

On Thursday, October 17, 2013, Jack Chan wrote:

> Hi all~
>     I need to get all column families from a specified table. When I looked
> into the class "org.apache.hadoop.hbase.HTableDescriptor", I found that
> there are at least three methods that can be used.
>     See the code below: method1, method2 and method3 all do the same
> thing:
>
> /*___________code begin___________*/
>
> HTable table = new HTable(config, "mytable");
> HTableDescriptor htd = table.getTableDescriptor();