Drill >> mail # dev >> Query regarding Scalar Functions implementation


Re: Query regarding Scalar Functions implementation
I haven't run the test on MSSQL yet, but reading this suggests that there
int/int == int, as opposed to Oracle, where int/int == float4

http://technet.microsoft.com/en-us/library/ms175009.aspx

We should probably pick one and stick to it.  I personally prefer MS but
Oracle is more prevalent.  If I remember correctly, Optiq is modeled more
after one of the two and we should probably continue that trend.  Maybe
Julian can comment here...
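
To make the two conventions concrete, here is a minimal Java sketch of what each one would return for the same integer inputs. The class and method names are made up for illustration and are not Drill or Optiq APIs, and double is only a stand-in for whatever fractional type would actually be chosen.

```java
// Sketch only: names are illustrative, not Drill or Optiq APIs.
final class DivisionSemantics {
    // SQL Server-style convention: int / int stays an int, fractional part dropped.
    static int divideAsInt(int left, int right) {
        return left / right;
    }

    // Oracle-style convention: int / int yields a fractional result.
    // double is an assumed stand-in for the real result type.
    static double divideAsDecimal(int left, int right) {
        return (double) left / right;
    }

    public static void main(String[] args) {
        System.out.println(divideAsInt(5, 2));      // prints 2
        System.out.println(divideAsDecimal(5, 2));  // prints 2.5
    }
}
```

Whichever convention is picked, the other result stays reachable through an explicit cast; the open question is only which one users get by default.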

J
On Wed, Oct 2, 2013 at 1:42 AM, Harri Kinnunen
<[EMAIL PROTECTED]> wrote:

> Hi,
>
> I'm not sure if we're talking about implementation specifics or what would
> be visible to end user.
>
> But if this is about end user experience, I'd say the functionality should
> reflect the one in Oracle:
> SELECT 5/2 FROM DUAL;
> Returns
> 2.5
>
> Even:
> select CAST(10 AS INTEGER)/CAST(3 AS INTEGER) FROM DUAL
> returns:
> 3.33333333333333
>
> Doing:
> select DUMP(CAST(10 AS INTEGER)/CAST(2 AS INTEGER)) FROM DUAL
> Reports the datatype as "NUMBER(precision, scale)".
>
> So ... I guess we don't have the luxury of "NUMBER(p,s)". But somehow we
> should not force users to do explicit casts to achieve the "obvious
> results" (as defined above :) ).
>
> Cheers,
> Harri
>
>
> -----Original Message-----
> From: Jason Altekruse [mailto:[EMAIL PROTECTED]]
> Sent: 2 October 2013 1:57
> To: drill-dev
> Subject: Re: Query regarding Scalar Functions implementation
>
> Hello All,
>
> I would assume we would want to follow the conventions of most programming
> languages. If users are interested in a decimal result, they would have to
> explicitly cast one of the arguments to a float or float8.
>
> With regard to mismatched types, there are two ways I can think of doing it.
> We could define a bunch of overloaded methods for each combination, but it
> seems like we would have to define each twice for the different arrangements
> of the types, such as with mult(float, float8) and mult(float8, float).
>
> I think the way we will want to do it is add additional logic to the code
> generation portion of the query, rather than define a bunch of different
> functions.
>
> For example, as new batches arrive at an operator, if they have a new
> schema we generate code to process the particular types of value vectors
> involved in the operation. I think at this step we should be able to add a
> cast to one of the parameters to direct to a function that defines an
> operation between two operands of the same type.
>
> Example:
> incoming types int, float
> - cast first parameter to a float
>
> Deciding which one to cast seems to be pretty standard, as seen here in
> the SQL Server documentation. They just define a strict hierarchy of types.
>
> http://technet.microsoft.com/en-us/library/ms190309.aspx
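
The strict-hierarchy rule described above can be sketched in a few lines of Java. NumericType and its ordering are hypothetical stand-ins chosen for illustration, not Drill's actual type system; the only assumption is that the types are declared from lowest to highest precedence and that the lower-precedence operand is the one that gets cast.

```java
// Sketch only: NumericType is a hypothetical enum, not part of Drill.
enum NumericType {
    INT, BIGINT, FLOAT4, FLOAT8;   // declared from lowest to highest precedence

    // The common type is whichever operand type ranks higher;
    // the other operand would be cast to it before the function call.
    static NumericType commonType(NumericType left, NumericType right) {
        return left.ordinal() >= right.ordinal() ? left : right;
    }
}

final class ImplicitCastDemo {
    public static void main(String[] args) {
        // incoming types int, float: the int operand is cast to float
        System.out.println(NumericType.commonType(NumericType.INT, NumericType.FLOAT4)); // prints FLOAT4
    }
}
```

With a rule like this, the code generator would only need functions for operands of the same type, plus one cast per pair of adjacent types.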
>
> The only problem I could see with this approach is that the Drill Funcs
> take the value holders as parameters, so we will have to define casting
> rules between the various types. Not sure what this will do for code
> inlining. A major goal of the templates and code generation was allowing
> UDFs while keeping the whole system fast.
>
> It would also be possible to define additional methods on the various
> value vectors to allow extraction of values directly into different types,
> such as a double extraction method on the float vectors. This might aid
> inlining, as we handle a bit more of the logic while dealing with
> primitives (rather than pulling out a value, sticking it in a holder object
> and then casting the holder to a different object type).
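
As a rough illustration of the extraction idea above, the sketch below adds a widening accessor to a float-backed vector so the caller gets a primitive double without going through a holder object. Float4VectorSketch and getAsDouble are hypothetical names and do not reflect Drill's value vector API.

```java
// Sketch only: a hypothetical stand-in for a float value vector, not Drill's API.
final class Float4VectorSketch {
    private final float[] values;

    Float4VectorSketch(float[] values) {
        this.values = values.clone();
    }

    // Native accessor: returns the stored 4-byte float.
    float get(int index) {
        return values[index];
    }

    // Widening accessor: returns a double directly, with no holder object in between.
    double getAsDouble(int index) {
        return values[index];
    }

    public static void main(String[] args) {
        Float4VectorSketch v = new Float4VectorSketch(new float[] {1.5f, 2.25f});
        double sum = v.getAsDouble(0) + v.getAsDouble(1);
        System.out.println(sum); // prints 3.75
    }
}
```

Because the accessor works on primitives only, it may be easier for the JIT to inline than a path that allocates and converts holder objects, though that would need to be measured.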
>
> -Jason
>
>
> On Tue, Oct 1, 2013 at 1:06 AM, Yash Sharma <[EMAIL PROTECTED]
> >wrote:
>
> > Hi Team,
> > I had two questions regarding the implementation of Scalar Functions.
> >
> > 1. What would be the Output type of Division func (given: Input types
> > are all Integers)
> >
> > Currently I have provided an implementation of the DIVISION func which
> > has input/output params as :
> >         @Param  IntHolder left;
> >         @Param  IntHolder right;
> >         @Output IntHolder out;