Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Drill >> mail # dev >> hangout


Neat notes Jinfeng. Thanks.
Let me go through the points and pick up few tasks.
Will keep you posted.
________________________________________
From: Jinfeng Ni [[EMAIL PROTECTED]]
Sent: Thursday, November 21, 2013 10:42 AM
To: [EMAIL PROTECTED]
Subject: Re: hangout

Hi Yash,

I took a quick look at your github repo. Here are my thoughts.

1. We need implement a cast function.  This cast function will be used 1)
as the explicit cast function used by drill user, 2)to support the implicit
cast used internally within Drill.

For instance, in OperatorFunctionResolver.getBestMatch(), once find the
best matched function implementation, if the parameter type is not same as
argument type, we need inject a cast function as : CAST(
argument_logical_expression   as parameter_type), replace this cast
expression as the argument to the function call.

2. OperatorFunctionResolver.getBestMatch() currently did not check type
match; it simply computes the distance between the param type and argument
type in the TypePrecedeence map, and returns the function implementation
with the smallest total distance (i.e mininum cost).  This seems not right,
since for implicit cast,  at least we need make sure the argument type is
legal to cast to param type.  This is not always true.  For instance, if we
have operator :   DATE op DATE.  Now if we have an expression  float8 op
float 8. Are we implicitly casting from float8 into DATE?  ( Let's say op
is only defined for DATE type, so it's the best match available for
expression float8 op float 8).

Here, I think we probably miss the part to check if one type is CASTABLE
from another type.

3.  For the above reason, internally we need implement an isCastable()
mechanism.  We probably can partly re-use optiq code, which uses a
Map<SqlTypeName, Set<SqlTypeName>> to check if one type is castable to
another.

4. Null expression is castable to any data type. It's right that you put in
NULL as the 1st one in TypePrecedence map.  When function argument(s) is
NULL, once we find the best match function implementation, we could either
replace it with a target_type NULL constant (As Jacques suggested), or
replace NULL with CAST(NULL as target_type) expression.

For instance,  1 + NULL : it matches Int + Int, and NULL is replaced
either int NULL constant, or CAST(NULL as int).

 NULL + NULL : it matches TinyInt + TinyInt (using the getBestMatch
method).  Both NULL operands will be replaced with TinyINT null
constant,
or CAST(NULL as TinyInt).

5. Nullable vs Non-nullable. I think implicit cast probably need
support cast from Non-nullable exp

Nullable exp. Otherwise, for each operator and each type, we at least
have to implement 4 versions:

     1) Nullable  int +  Nullable int

     2) Nullable  int + Non-Nullalbe int

     3) Non-Nullable int + Nullable int

     4) Non-Nullable int + Non-Nullalbe int.

By implicit cast Non-nullable exp into Nullable exp, we only need
define one version: Nullable int + Nullable int.
. How to handle the null input could be specified in
function/operator's implementation.

6. Output type. Current code uses OutputTypeDeterminer to decide the
output type. In my view, once we

find the best match function implementation ( based on the argument
types vs parameter types), we would

use the matched function implementations' output type. That is,
function resolver would 1) find the best match

function implementation, 2) decide the output type. These two tasks
will be done in one step.
On Wed, Nov 20, 2013 at 2:52 PM, Jinfeng Ni <[EMAIL PROTECTED]> wrote:

> Hi Yash,
>
> I'll take a look at your github repo, and start from there.
>
> To get null expression and pass optiq's parsing, you may try this:
>
> SELECT 1 + _MAP['NON_EXISTS_COLUMN'] from ...
>
> When the column does not exists, it will return with
> NullExpression.INSTANCE. This way, you can see how the operator "+" is
> binded and the code generator handles the code generation for "+" operator,
> when input  is 1 and a null expression.

________________________________
NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.