Hive, mail # dev - Review Request 17687: HIVE-6256 add batch dropping of partitions to Hive metastore (as well as to dropTable)


Re: Review Request 17687: HIVE-6256 add batch dropping of partitions to Hive metastore (as well as to dropTable)
Ashutosh Chauhan 2014-02-08, 15:26

This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17687/#review34020

metastore/if/hive_metastore.thrift
<https://reviews.apache.org/r/17687/#comment63956>

    Should this be list<String> partitionSpec? It seems the client (if ever) will only be interested in the names of the partitions that were dropped. The Partition object is quite bulky.

metastore/if/hive_metastore.thrift
<https://reviews.apache.org/r/17687/#comment63955>

    Should this default to false? Otherwise, we will send the list of Partition objects back to the client, which could be huge, and in most cases the client is not interested in them.
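
    To illustrate the two suggestions above, here is a minimal sketch, not the code in the patch. DropRequest and dropPartitions are hypothetical names invented for this example; only the setNeedResult name is taken from the review. It assumes the result flag defaults to false and that, when set, only partition names are returned rather than full Partition objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class DropPartitionsSketch {
    /** Hypothetical request object; needResult stays off unless the caller opts in. */
    static final class DropRequest {
        final List<String> partitionNames;
        private boolean needResult = false; // the default suggested in the review

        DropRequest(List<String> partitionNames) {
            this.partitionNames = new ArrayList<>(partitionNames);
        }

        DropRequest setNeedResult(boolean needResult) {
            this.needResult = needResult;
            return this;
        }

        boolean isNeedResult() {
            return needResult;
        }
    }

    /** Stand-in for the drop call: returns partition names only when requested. */
    static List<String> dropPartitions(DropRequest req) {
        // The actual metastore and filesystem work would happen here.
        if (!req.isNeedResult()) {
            return Collections.emptyList();
        }
        return new ArrayList<>(req.partitionNames);
    }

    public static void main(String[] args) {
        DropRequest req = new DropRequest(Arrays.asList("ds=2014-02-07", "ds=2014-02-08"));
        System.out.println(dropPartitions(req).size());             // nothing sent back by default
        System.out.println(dropPartitions(req.setNeedResult(true))); // names only, on request
    }
}
```

    With this shape, a caller that does not care about the result pays nothing for it, and a caller that does gets a list of short strings.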

metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
<https://reviews.apache.org/r/17687/#comment63957>

    I think there is a way to do a bulk delete from the filesystem. If we fire 10K requests at the NN to delete dirs, this will take a while. Plus, we will swamp the NN unnecessarily. Further, a bulk delete might give us better atomicity guarantees than the current approach.
    If there is no such API on the NN, then let's file a JIRA on HDFS to request one.
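
    As a rough sketch of the call pattern I have in mind, assuming such an API existed: BulkDeleter below is a hypothetical interface made up for this example, not an existing HDFS or NN call, and the batch size is arbitrary. The point is only that the number of round trips drops from one per directory to one per batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BulkDeleteSketch {
    /** Hypothetical bulk-delete API; NOT an existing HDFS interface. */
    interface BulkDeleter {
        void deleteAll(List<String> dirs);
    }

    /**
     * Splits dirs into batches of at most batchSize and issues one call per batch.
     * Returns the number of calls made.
     */
    static int deleteInBatches(List<String> dirs, int batchSize, BulkDeleter deleter) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int calls = 0;
        for (int start = 0; start < dirs.size(); start += batchSize) {
            int end = Math.min(start + batchSize, dirs.size());
            deleter.deleteAll(new ArrayList<>(dirs.subList(start, end)));
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList("/t/ds=1", "/t/ds=2", "/t/ds=3", "/t/ds=4", "/t/ds=5");
        int calls = deleteInBatches(dirs, 2, batch -> System.out.println("delete " + batch));
        System.out.println(calls + " calls instead of " + dirs.size());
    }
}
```

    Batching alone says nothing about atomicity; that would depend on what the underlying call guarantees.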

metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
<https://reviews.apache.org/r/17687/#comment63958>

    setNeedResult(false) ?

metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
<https://reviews.apache.org/r/17687/#comment63963>

    Looks like this import is unneeded. Also, it seems that bulk deletion is currently supported only via ORM? Are you planning to add bulk deletion via direct SQL in a subsequent JIRA?
    Asking because perf is the major driver for this work, and in the past we have seen a major difference between direct SQL and ORM perf.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
<https://reviews.apache.org/r/17687/#comment63959>

    Shall this commented-out line be deleted?

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
<https://reviews.apache.org/r/17687/#comment63960>

    It would be good to add a comment here explaining what detach means and why it is required.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
<https://reviews.apache.org/r/17687/#comment63961>

    Shall this commented-out line be deleted?

ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
<https://reviews.apache.org/r/17687/#comment63962>

    Do we need both of these boolean variables?
- Ashutosh Chauhan
On Feb. 7, 2014, 10:55 p.m., Sergey Shelukhin wrote: