Hive, mail # user - Enhancing Query Join to speed up Query


Re: Enhancing Query Join to speed up Query
Navis류승우 2013-06-16, 02:53
Yes, it's identical, as expected.

2013/6/16 Naga Vijay <[EMAIL PROTECTED]>:
> Hi,
>
> Thanks for all the responses!
>
> ------------------------------
>
> Here's output of "explain" for query option 1 ...
>
> ------------------------------
>
> ABSTRACT SYNTAX TREE:
>   (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME a)) (TOK_TABREF
> (TOK_TABNAME b)) (= (. (TOK_TABLE_OR_COL a) item_id) (. (TOK_TABLE_OR_COL b)
> item_id)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT
> (TOK_SELEXPR (. (TOK_TABLE_OR_COL a) item_id)) (TOK_SELEXPR (.
> (TOK_TABLE_OR_COL a) create_dt))) (TOK_WHERE (AND (= (. (TOK_TABLE_OR_COL a)
> item_id) 'I501') (= (. (TOK_TABLE_OR_COL a) category_name) 'C1')))))
>
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 is a root stage
>
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Alias -> Map Operator Tree:
>         b
>           TableScan
>             alias: b
>             GatherStats: false
>             Filter Operator
>               isSamplingPred: false
>               predicate:
>                   expr: (item_id = 'I501')
>                   type: boolean
>               Sorted Merge Bucket Map Join Operator
>                 condition map:
>                      Inner Join 0 to 1
>                 condition expressions:
>                   0 {item_id} {create_dt}
>                   1
>                 handleSkewJoin: false
>                 keys:
>                   0 [Column[item_id]]
>                   1 [Column[item_id]]
>                 outputColumnNames: _col0, _col3
>                 Position of Big Table: 1
>                 Select Operator
>                   expressions:
>                         expr: _col0
>                         type: string
>                         expr: _col3
>                         type: string
>                   outputColumnNames: _col0, _col1
>                   File Output Operator
>                     compressed: false
>                     GlobalTableId: 0
>                     directory: hdfs://sandbox:8020/tmp/hive-root/hive_2013-06-14_11-01-17_851_562334803109383952/-ext-10001
>                     NumFilesPerFileSink: 1
>                     Stats Publishing Key Prefix: hdfs://sandbox:8020/tmp/hive-root/hive_2013-06-14_11-01-17_851_562334803109383952/-ext-10001/
>                     table:
>                         input format: org.apache.hadoop.mapred.TextInputFormat
>                         output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                         properties:
>                           columns _col0,_col1
>                           columns.types string:string
>                           escape.delim \
>                           serialization.format 1
>                     TotalFiles: 1
>                     GatherStats: false
>                     MultiFileSpray: false
>       Needs Tagging: false
>       Path -> Alias:
>         hdfs://sandbox:8020/apps/hive/warehouse/b/create_dt=2013-06-11 [b]
>         hdfs://sandbox:8020/apps/hive/warehouse/b/create_dt=2013-06-12 [b]
>         hdfs://sandbox:8020/apps/hive/warehouse/b/create_dt=2013-06-13 [b]
>         hdfs://sandbox:8020/apps/hive/warehouse/b/create_dt=2013-06-14 [b]
>         hdfs://sandbox:8020/apps/hive/warehouse/b/create_dt=2013-06-15 [b]
>         hdfs://sandbox:8020/apps/hive/warehouse/b/create_dt=2013-06-16 [b]
>         hdfs://sandbox:8020/apps/hive/warehouse/b/create_dt=2013-06-17 [b]
>         hdfs://sandbox:8020/apps/hive/warehouse/b/create_dt=2013-06-18 [b]
>         hdfs://sandbox:8020/apps/hive/warehouse/b/create_dt=2013-06-19 [b]
>         hdfs://sandbox:8020/apps/hive/warehouse/b/create_dt=2013-06-20 [b]
>         hdfs://sandbox:8020/apps/hive/warehouse/b/create_dt=2013-06-21 [b]
>         hdfs://sandbox:8020/apps/hive/warehouse/b/create_dt=2013-06-22 [b]
>         hdfs://sandbox:8020/apps/hive/warehouse/b/create_dt=2013-06-23 [b]
>         hdfs://sandbox:8020/apps/hive/warehouse/b/create_dt=2013-06-24 [b]