Re: [jira] [Work started] (DRILL-47) Generate Logical Plans for TPC-H Queries
Hi J,

- The goal was to come up with manually generated tpc-h logical
queries.  We'll use these to validate the output of sql parser.

I am doing the latter: feeding the TPC-H queries to the SQL parser to produce a logical plan, then verifying it manually or by feeding it into the execution engine.

- DrQL parser is not currently being used.
I only realized that later.

- Why are you creating pojos for anything?
The TPC-H data set is in PSV (pipe-separated values) files.
POJOs make this workflow easy:
PSV -> POJOs -> JSON for now, and any other format later.
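
A minimal sketch of that PSV -> POJO -> JSON flow, for illustration only. The class name PsvToJson, the LineItem POJO, and the cut-down three-column row (l_returnflag|l_linestatus|l_quantity) are assumptions made for this example; a real lineitem row has more columns, and a JSON library would normally replace the hand-written serialization.

```java
public class PsvToJson {

    // Hypothetical POJO for a cut-down lineitem row (assumption: only three columns).
    static final class LineItem {
        final String returnFlag;
        final String lineStatus;
        final double quantity;

        LineItem(String returnFlag, String lineStatus, double quantity) {
            this.returnFlag = returnFlag;
            this.lineStatus = lineStatus;
            this.quantity = quantity;
        }
    }

    // PSV -> POJO: split on '|' and map the fields by position.
    // The -1 limit keeps trailing empty fields, so a trailing '|' is harmless.
    static LineItem parseLine(String line) {
        String[] fields = line.split("\\|", -1);
        if (fields.length < 3) {
            throw new IllegalArgumentException("expected at least 3 fields: " + line);
        }
        return new LineItem(fields[0], fields[1], Double.parseDouble(fields[2]));
    }

    // Escape backslashes and double quotes so string values stay valid JSON.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // POJO -> JSON: hand-written to keep the sketch dependency-free.
    static String toJson(LineItem item) {
        return "{\"l_returnflag\":\"" + escape(item.returnFlag) + "\","
                + "\"l_linestatus\":\"" + escape(item.lineStatus) + "\","
                + "\"l_quantity\":" + item.quantity + "}";
    }

    public static void main(String[] args) {
        // Sample row in the assumed three-column layout, with a trailing '|'.
        System.out.println(toJson(parseLine("N|O|17|")));
    }
}
```

Keeping the POJO in the middle means only toJson would need a sibling method when another output format is added later.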

In the end, we can publish a data set together with the SQL queries and their respective logical and physical plans, for Drill users to play with and refer to.

V

________________________________
 From: Jacques Nadeau <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]; Sree V <[EMAIL PROTECTED]>
Sent: Friday, July 26, 2013 10:59 AM
Subject: Re: [jira] [Work started] (DRILL-47) Generate Logical Plans for TPC-H Queries
 

Some thoughts (not in any particular order):

- The goal was to come up with manually generated tpc-h logical
queries.  We'll use these to validate the output of sql parser.
- DrQL parser is not currently being used.
- Why are you creating pojos for anything?

J

> [Sree Vaddi:] It seems I should be using the 'sqlparser' project.  Any samples or thoughts?
>
>
> 3.
> How do I apply the parsed SQL from 2. above to the data in 1. above, to output the Logical Plan?
>
>
> Please advise.
>
>
> Thanking you.
> With Regards
> Sree
>
>
>
> Supporting code for 2. above and debug info:
>
>     @Test
>     public void testTPCHSql1() {
>         String drqlQueryText = "select " +
>             "l_returnflag, l_linestatus, " +
>             "sum(l_quantity) as sum_qty, " +
>             "sum(l_extendedprice) as sum_base_price, " +
>             "sum(l_extendedprice * (1 - l_discount)) as sum_disc_price, " +
>             "sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge, " +
>             "avg(l_quantity) as avg_qty, " +
>             "avg(l_extendedprice) as avg_price, " +
>             "avg(l_discount) as avg_disc, " +
>             "count(*) as count_order " +
>         "from " +
>             "lineitem " +
>         "where " +
>             "l_shipdate <= date '1998-12-01' - interval ':1' day (3) " +
>         "group by " +
>             "l_returnflag, " +
>             "l_linestatus " +
>         "order by " +
>             "l_returnflag, " +
>             "l_linestatus;";
>
>         DrqlParser parser = new AntlrParser();
>         SemanticModelReader query = parser.parse(drqlQueryText);
>
>         System.out.println(query.getFromClause());
>         System.out.println(query.getGroupByClause());
>         System.out.println(query.getJoinOnClause());
>         System.out.println(query.getjustATable());
>         System.out.println(query.getLimitClause());
>         System.out.println(query.getOrderByClause());
>         System.out.println(query.getResultColumnList().size());
>         System.out.println(query.getWhereClause());
>         /*
> setup debug info:
> line#2299 DrqlAntlrParser
> 2320
> 3682
> 4884
> 5363
>
> 392
> 6664
>
> #1207 DrqlAntlrLexer.mDiv()
> part of the sql parsing:
> // l_shipdate <= date '1998-12-01' - interval ':1' day (3)
> variable value: (parsing location in sql i.e the location of letter 'd' in date)
> [@125,378:379='<=',<52>,1:378]
>
> looks like the 'date' is interpreted as 'div' ?!
>
> test method console output:
> line 1:382 mismatched character 'A' expecting ' '
> line 1:416 mismatched character 'A' expecting ' '
>
> [org.apache.drill.parsers.impl.drqlantlr.SemanticModel@3a86edfe]
> []
> null
> null
> null
> []
> 10
> null
>
>          */
>     }
>
> ________________________________
>  From: Sree Vaddi (JIRA) <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Thursday, July 25, 2013 7:07 AM
> Subject: [jira] [Work started] (DRILL-47) Generate Logical Plans for TPC-H Queries
>
>
>
>      [ https://issues.apache.org/jira/browse/DRILL-47?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]