Drill, mail # dev - Optiq: Adding sugared syntax and allowing ANY in aggregate expressions...


Jacques Nadeau 2013-08-09, 15:47
Re: Optiq: Adding sugared syntax and allowing ANY in aggregate expressions...
Julian Hyde 2013-08-09, 18:31
Jacques,

All of this work should be done as extensions to Optiq's type inference algorithm, which is part of its SQL validator. (If by "preprocessing step" you mean a step that runs before the SQL validator is called, then I disagree. Without type information you'd inevitably run into ambiguity problems. An analogy is the C preprocessor: if you implement inline functions as macros using the C preprocessor, you can't have different implementations based on the types of the arguments.)
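
To illustrate the ambiguity a pre-validator rewrite runs into, here is a minimal Python sketch (it is not Optiq code). The regex, the `SCHEMA` dictionary, the table and column names, and both function names are hypothetical. The point is only that a text-level rewrite cannot tell a declared column from a field that lives in `_MAP`, while a rewrite that can see the schema can.

```python
import re

# Hypothetical schema: table name -> set of declared columns.
SCHEMA = {
    "emp": {"name", "_MAP"},  # one declared column plus a _MAP column
    "dept": {"id"},           # no _MAP column at all
}

# Matches table.field references (illustrative only, not a SQL parser).
QUALIFIED = re.compile(r"\b(\w+)\.(\w+)\b")


def rewrite_blind(sql):
    """Text-only rewrite: every table.field becomes table._MAP['field']."""
    return QUALIFIED.sub(r"\1._MAP['\2']", sql)


def rewrite_with_schema(sql, schema):
    """Rewrite only when the table has _MAP and the field is not declared."""
    def repl(match):
        table, field = match.group(1), match.group(2)
        cols = schema.get(table, set())
        if field in cols or "_MAP" not in cols:
            return match.group(0)  # leave declared columns untouched
        return f"{table}._MAP['{field}']"
    return QUALIFIED.sub(repl, sql)


sql = "select emp.name, emp.age, dept.id from emp, dept"
print(rewrite_blind(sql))
print(rewrite_with_schema(sql, SCHEMA))
```

The blind version rewrites `emp.name` and `dept.id` even though both are declared columns; the schema-aware version rewrites only `emp.age`.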

I can add hooks to the validator based on (a) whether a table has a special "_MAP" field, (b) whether a field has a special "ANY" type. I don't like changing a large and complex piece of code, but in this case it is appropriate, and we'd easily be able to find where that functionality is triggered by searching for uses of _MAP and ANY.

You list 2 cases where syntactic sugar is needed. Some more are:
 * Converting "select x from t" to "select t._MAP['x'] from t" (similar to your case, but no table alias is used)
 * "select x from t1, t2" is ambiguous if t1 and t2 both have a _MAP column
 * Need to resolve x[y] to either (Cast(x as MAP))[y] or (Cast(x as ARRAY))[y] based on whether y is a string or numeric
 * If y is ANY then (yuck) we'd generate a call to an ITEM(ANY, ANY) operator that would only get resolved at runtime
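
The cases in this list could be sketched roughly as follows. This is a hypothetical Python illustration, not Optiq's validator: types are represented as plain strings, and `resolve_column`, `resolve_item`, `AmbiguousColumn`, and the rule that a declared column takes priority over `_MAP` are all assumptions made for the example.

```python
class AmbiguousColumn(Exception):
    """Raised when an unqualified name could belong to more than one table."""


def resolve_column(name, tables):
    """Resolve an unqualified column against the tables in the FROM clause.

    `tables` maps table name -> set of declared column names.
    Assumption for this sketch: a declared column wins over _MAP.
    """
    declared = [t for t, cols in tables.items() if name in cols]
    if len(declared) == 1:
        return f"{declared[0]}.{name}"
    if len(declared) > 1:
        raise AmbiguousColumn(f"{name} is declared in {declared}")

    # No declared column: fall back to tables that carry a _MAP column.
    with_map = [t for t, cols in tables.items() if "_MAP" in cols]
    if len(with_map) == 1:
        return f"{with_map[0]}._MAP['{name}']"
    if len(with_map) > 1:
        raise AmbiguousColumn(f"{name} could come from _MAP of {with_map}")
    raise KeyError(name)


STRING_TYPES = {"CHAR", "VARCHAR"}
NUMERIC_TYPES = {"INTEGER", "BIGINT"}


def resolve_item(x, y, y_type):
    """Pick an expansion for x[y] from the static type of y."""
    if y_type in STRING_TYPES:
        return f"(CAST({x} AS MAP))[{y}]"
    if y_type in NUMERIC_TYPES:
        return f"(CAST({x} AS ARRAY))[{y}]"
    if y_type == "ANY":
        # Type unknown until runtime: defer to a generic operator.
        return f"ITEM({x}, {y})"
    raise TypeError(f"cannot index with type {y_type}")


print(resolve_column("x", {"t": {"_MAP"}}))
print(resolve_item("x", "y", "VARCHAR"))
print(resolve_item("x", "y", "ANY"))
```

With two `_MAP` tables, `resolve_column("x", {"t1": {"_MAP"}, "t2": {"_MAP"}})` raises `AmbiguousColumn`, which corresponds to the "select x from t1, t2" case.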

Some ambiguities will arise, because SQL uses overloading. For example, if you apply MIN to a column with the following values:

row 1: {c: 9}
row 2: {c: "10"}

Should Optiq resolve this to MIN(<int>), MIN(<varchar>) or MIN(<any>)? If the latter, what will Drill do?
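
A small Python sketch shows why the choice matters: the two rows above yield a different minimum depending on which overload is picked. Nothing here reflects what Optiq or Drill actually does; `infer_type` and its string type names are hypothetical.

```python
rows = [{"c": 9}, {"c": "10"}]
values = [row["c"] for row in rows]


def infer_type(vals):
    """Hypothetical inference: a single concrete type, else ANY."""
    kinds = {type(v) for v in vals}
    if kinds == {int}:
        return "INTEGER"
    if kinds == {str}:
        return "VARCHAR"
    return "ANY"


# MIN(<int>): compare numerically after converting every value.
min_as_int = min(int(v) for v in values)

# MIN(<varchar>): compare as strings, so "10" sorts before "9".
min_as_varchar = min(str(v) for v in values)

print(infer_type(values), min_as_int, repr(min_as_varchar))
```

Numeric comparison picks 9 and string comparison picks "10". Python itself refuses the untyped comparison: `min(values)` raises `TypeError` on the mixed list, which is one possible answer to the MIN(<any>) question.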

Julian

On Aug 9, 2013, at 8:47 AM, Jacques Nadeau <[EMAIL PROTECTED]> wrote:

> Sorry about the cross-post, wasn't sure where this discussion should go...
>
> As I was working through issues with the SQL parser I had a couple of questions.
>
> - I want to get started on adding sugared syntax for Apache Drill in
> Optiq to get rid of the table._MAP['fieldname'] stuff and just make it
> table.fieldname.  Any thoughts on the right way to approach this?  It
> seems like a simple fix may be to just add the wrapper as a
> preprocessing step before we go to the schema for all of Drill's
> tables.
>
> - Right now, I believe if you try to do an aggregate function against
> an ANY type, you get an exception.  How can we make it so that Optiq
> allows this and we throw the exception lower down only if we realize
> that we can't aggregate once the type is materialized?
>
> Thanks,
> Jacques