Drill, mail # dev - Pseudo proposal for Drill typecast plan


Re: Pseudo proposal for Drill typecast plan
Jinfeng Ni 2013-10-27, 00:31
1. Implicit casting for operators with two operands.

Implicit casting can apply to operators with two operands, e.g. comparison
operators (<, >, etc.) and arithmetic operators (+, -, *, /, etc.). For such
cases, the concept of data type precedence can be used to describe the
rule for when implicit casting is applied.

*"When an operator combines two expressions of different data types, the
rules for data type precedence specify that the data type with the lower
precedence is converted to the data type with the higher precedence."*

For the complete list of data type precedence on Microsoft SQL Server, see:
http://msdn.microsoft.com/en-us/library/ms190309.aspx

For the complete list of allowed type casts on Microsoft SQL Server, see:
http://msdn.microsoft.com/en-us/library/ms187928.aspx

Using this data type precedence rule:

2 + 2.3 : the int 2 would be implicitly cast to decimal, since decimal (2.3)
has higher precedence than int, giving a result of 4.3.

2 + '2.3' : the string would be implicitly cast to a numeric type (on DB2, a
string is implicitly cast to DECFLOAT(34)).
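
To make the rule concrete, here is a minimal sketch in Python. The precedence
list, the type names, and the helpers common_type and implicit_add are all
hypothetical and only loosely follow the MS SQL ordering; this is not Drill's
actual list, just an illustration of "the lower precedence type is converted
to the higher one".

```python
from decimal import Decimal

# Hypothetical precedence list, lowest to highest. Not Drill's actual list.
PRECEDENCE = ["varchar", "int", "bigint", "decimal", "float"]

# Hypothetical converters from a Python value to each illustrative type.
CONVERTERS = {
    "varchar": str,
    "int": int,
    "bigint": int,
    "decimal": lambda v: Decimal(str(v)),
    "float": float,
}


def common_type(left_type, right_type):
    """Return whichever of the two types has the higher precedence."""
    return max(left_type, right_type, key=PRECEDENCE.index)


def implicit_add(left, left_type, right, right_type):
    """Cast both operands to the higher-precedence type, then add them."""
    target = common_type(left_type, right_type)
    convert = CONVERTERS[target]
    return target, convert(left) + convert(right)
```

With this list, implicit_add(2, "int", Decimal("2.3"), "decimal") picks
decimal as the target type and yields 4.3, matching the first example. A
string operand such as '2.3' would need a numeric target wide enough to hold
it, which this toy list does not try to model.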

2. Implicit casting with assignment operators.
When the source type is not the same as the target type, implicit casting
could apply.

e.g.:
String_var = 123; /* cast 123 to '123' */
Num_var = '123'; /* cast '123' to 123 */
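
The assignment case could be sketched the same way, again in Python. The
table of allowed casts and the assign helper are hypothetical; which
(source, target) pairs Drill should allow is exactly what still needs to be
specified.

```python
# Hypothetical table of allowed implicit casts: (source, target) -> converter.
IMPLICIT_CASTS = {
    ("int", "varchar"): str,
    ("varchar", "int"): int,
}


def assign(value, source_type, target_type):
    """Return the value as it would be stored in a target_type variable."""
    if source_type == target_type:
        return value  # same type, no cast needed
    try:
        convert = IMPLICIT_CASTS[(source_type, target_type)]
    except KeyError:
        raise TypeError(
            f"no implicit cast from {source_type} to {target_type}"
        ) from None
    return convert(value)
```

Here assign(123, "int", "varchar") gives '123' and
assign('123', "varchar", "int") gives 123, mirroring the two assignments
above. A pair missing from the table is rejected instead of being cast
silently.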

For Drill, as Julian suggested, we probably need to specify the precedence
list for all data types supported in Drill, and describe which types can
be implicitly cast to which types, as the MS SQL docs do.

On Thu, Oct 24, 2013 at 11:44 AM, Julian Hyde <[EMAIL PROTECTED]> wrote:

> Your proposal very quickly dives into implementation details. For
> type-casting to work, the rules need to be simple to understand for
> end-users. I think you should describe those rules first, and only then
> describe how the rules should be implemented.
>
> One particular case that needs to be handled is aggregate functions.
> Suppose that you write "min(c)", and c contains the values {10, ‘9’}. The
> system needs to know whether to invoke the numeric or string variant of
> “min”, before it has looked at any data.
>
> There is also the issue that the type of a bind variable needs to be
> inferred from its context, say “select * from emp where x + ? > 10”.
>
> Optiq’s validator has very extensive support for type-derivation, and that
> work is consistent with the SQL standard. (SQL leaves some of the
> type-derivation rules to the implementor, and Optiq has chosen sensible
> defaults. We argued long and hard about what should be the precision and
> scale of the return from dividing two decimals.) See this package:
> http://www.hydromatic.net/optiq/optiq-core/apidocs/org/eigenbase/sql/type/package-summary.html.
> Even if you don’t use it, it will give you an idea of some of the concepts
> at play.
>
> Julian