Re: [jira] [Work started] (DRILL-47) Generate Logical Plans for TPC-H Queries
Hi J,

- The goal was to come up with manually generated tpc-h logical
queries.  We'll use these to validate the output of sql parser.

I am doing the latter: feeding the TPC-H queries to the SQL parser to produce a logical plan, then verifying it manually or by feeding it into the execution engine.

- DrQL parser is not currently being used.
I realized it later.

- Why are you creating pojos for anything?
The TPC-H data set is in PSV (pipe-separated values) files.
POJOs make this workflow easy:
PSV -> POJOs -> JSON for now, and any other format later.
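
To illustrate the PSV -> POJO -> JSON step, here is a minimal sketch. It assumes dbgen-style lineitem rows where l_orderkey, l_quantity, l_returnflag and l_linestatus sit in columns 0, 4, 8 and 9. The class name PsvToJson, the LineItem POJO, and the sample row are hypothetical and not part of the Drill code base. Only four columns are kept to keep it short.

```java
import java.math.BigDecimal;

// Hypothetical helper, not part of Drill: PSV row -> POJO -> JSON string.
final class PsvToJson {

    /** Illustrative POJO holding a subset of the lineitem columns. */
    static final class LineItem {
        final long orderKey;
        final BigDecimal quantity;
        final String returnFlag;
        final String lineStatus;

        LineItem(long orderKey, BigDecimal quantity, String returnFlag, String lineStatus) {
            this.orderKey = orderKey;
            this.quantity = quantity;
            this.returnFlag = returnFlag;
            this.lineStatus = lineStatus;
        }
    }

    /** Parses one pipe-separated row; column positions are an assumption. */
    static LineItem parse(String psvLine) {
        // limit -1 keeps trailing empty fields (rows may end with '|')
        String[] f = psvLine.split("\\|", -1);
        if (f.length < 10) {
            throw new IllegalArgumentException("expected at least 10 fields, got " + f.length);
        }
        return new LineItem(Long.parseLong(f[0]), new BigDecimal(f[4]), f[8], f[9]);
    }

    /** Hand-rolled JSON; no string escaping, which is fine for single-letter flags only. */
    static String toJson(LineItem item) {
        return "{\"l_orderkey\":" + item.orderKey
                + ",\"l_quantity\":" + item.quantity.toPlainString()
                + ",\"l_returnflag\":\"" + item.returnFlag + "\""
                + ",\"l_linestatus\":\"" + item.lineStatus + "\"}";
    }

    public static void main(String[] args) {
        // Made-up sample row in the assumed layout.
        String row = "1|155190|7706|1|17|21168.23|0.04|0.02|N|O|1996-03-13|";
        System.out.println(toJson(parse(row)));
    }
}
```

A real conversion would more likely use a JSON library for the last step; the string concatenation here is only to keep the sketch free of dependencies.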

In the end, we can publish a data set together with the SQL queries and their respective logical and physical plans, for Drill users to play with and refer to.

V

________________________________
 From: Jacques Nadeau <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]; Sree V <[EMAIL PROTECTED]>
Sent: Friday, July 26, 2013 10:59 AM
Subject: Re: [jira] [Work started] (DRILL-47) Generate Logical Plans for TPC-H Queries
 

Some thoughts (not in any particular order):

- The goal was to come up with manually generated tpc-h logical
queries.  We'll use these to validate the output of sql parser.
- DrQL parser is not currently being used.
- Why are you creating pojos for anything?

J

> [Sree Vaddi:] It seems I should be using the 'sqlparser' project.  Any samples or thoughts?
>
>
> 3.
> How do I apply the parsed SQL from 2. above to the data in 1. above, to output the
> Logical Plan?
>
>
> Please advise.
>
>
> Thanking you.
> With Regards
> Sree
>
>
>
> Supporting code for 2. above and debug info:
>
>     @Test
>     public void testTPCHSql1() {
>         String drqlQueryText = "select " +
>             "l_returnflag, l_linestatus, " +
>             "sum(l_quantity) as sum_qty, " +
>             "sum(l_extendedprice) as sum_base_price, " +
>             "sum(l_extendedprice * (1 - l_discount)) as sum_disc_price, " +
>             "sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge, " +
>             "avg(l_quantity) as avg_qty, " +
>             "avg(l_extendedprice) as avg_price, " +
>             "avg(l_discount) as avg_disc, " +
>             "count(*) as count_order " +
>         "from " +
>             "lineitem " +
>         "where " +
>             "l_shipdate <= date '1998-12-01' - interval ':1' day (3) " +
>         "group by " +
>             "l_returnflag, " +
>             "l_linestatus " +
>         "order by " +
>             "l_returnflag, " +
>             "l_linestatus;";
>
>         DrqlParser parser = new AntlrParser();
>         SemanticModelReader query = parser.parse(drqlQueryText);
>
>         System.out.println(query.getFromClause());
>         System.out.println(query.getGroupByClause());
>         System.out.println(query.getJoinOnClause());
>         System.out.println(query.getjustATable());
>         System.out.println(query.getLimitClause());
>         System.out.println(query.getOrderByClause());
>         System.out.println(query.getResultColumnList().size());
>         System.out.println(query.getWhereClause());
>         /*
> setup debug info:
> line#2299 DrqlAntlrParser
> 2320
> 3682
> 4884
> 5363
>
> 392
> 6664
>
> #1207 DrqlAntlrLexer.mDiv()
> part of the sql parsing:
> // l_shipdate <= date '1998-12-01' - interval ':1' day (3)
> variable value: (parsing location in sql i.e the location of letter 'd' in date)
> [@125,378:379='<=',<52>,1:378]
>
> looks like the 'date' is interpreted as 'div' ?!
>
> test method console output:
> line 1:382 mismatched character 'A' expecting ' '
> line 1:416 mismatched character 'A' expecting ' '
>
> [org.apache.drill.parsers.impl.drqlantlr.SemanticModel@3a86edfe]
> []
> null
> null
> null
> []
> 10
> null
>
>          */
>     }
>
> ________________________________
>  From: Sree Vaddi (JIRA) <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Thursday, July 25, 2013 7:07 AM
> Subject: [jira] [Work started] (DRILL-47) Generate Logical Plans for TPC-H Queries
>
>
>
>      [ https://issues.apache.org/jira/browse/DRILL-47?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]