Drill >> mail # dev >> Query regarding Scalar Functions implementation


Re: Query regarding Scalar Functions implementation
I haven't run the test yet on MSSQL, but reading this suggests that in MSSQL
int/int == int, as opposed to Oracle, where int/int == float4

http://technet.microsoft.com/en-us/library/ms175009.aspx
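
To make the two conventions concrete, here is a minimal Java sketch of each. The class and method names are made up for illustration and are not part of Drill; the Oracle-style version uses a double only as a stand-in for whatever decimal type would actually be chosen.

```java
public class DivisionSemantics {
    // SQL Server style: integer operands give an integer quotient,
    // truncated toward zero.
    public static int divideAsInt(int left, int right) {
        return left / right;
    }

    // Oracle style (approximated with double): the operands are promoted
    // before dividing, so the fractional part is kept.
    public static double divideAsDecimal(int left, int right) {
        return (double) left / right;
    }

    public static void main(String[] args) {
        System.out.println(divideAsInt(5, 2));      // prints 2
        System.out.println(divideAsDecimal(5, 2));  // prints 2.5
    }
}
```

Whichever convention is picked, the other result stays reachable through an explicit cast; the choice only decides what users get by default.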

We should probably pick one and stick to it.  I personally prefer MS but
Oracle is more prevalent.  If I remember correctly, Optiq is modeled more
after one of the two and we should probably continue that trend.  Maybe
Julian can comment here...

J
On Wed, Oct 2, 2013 at 1:42 AM, Harri Kinnunen
<[EMAIL PROTECTED]> wrote:

> Hi,
>
> I'm not sure if we're talking about implementation specifics or what would
> be visible to end user.
>
> But if this is about end user experience, I'd say the functionality should
> reflect the one in Oracle:
> SELECT 5/2 FROM DUAL;
> Returns
> 2.5
>
> Even:
> select CAST(10 AS INTEGER)/CAST(3 AS INTEGER) FROM DUAL
> returns:
> 3.33333333333333
>
> Doing:
> select DUMP(CAST(10 AS INTEGER)/CAST(2 AS INTEGER)) FROM DUAL
> Reports the datatype as "NUMBER(precision, scale)".
>
> So ... I guess we don't have the luxury of "NUMBER(p,s)". But somehow we
> should not force users to do explicit casts to achieve the "obvious
> results" (as defined above :) ).
>
> Cheers,
> Harri
>
>
> -----Original Message-----
> From: Jason Altekruse [mailto:[EMAIL PROTECTED]]
> Sent: October 2, 2013 1:57
> To: drill-dev
> Subject: Re: Query regarding Scalar Functions implementation
>
> Hello All,
>
> I would assume we would want to follow the conventions of most programming
> languages. If users are interested in a decimal result, they would have to
> explicitly cast one of the arguments to a float or float8.
>
> With regard to mismatched types, there are two ways I can think of doing it.
> We could define a bunch of overloaded methods for each combination, but it
> seems like we have to define each twice for different arrangements of the
> types, such as with mult(float, float8) and mult(float8, float).
>
> I think the way we will want to do it is add additional logic to the code
> generation portion of the query, rather than define a bunch of different
> functions.
>
> For example, as new batches arrive at an operator, if they have a new
> schema we generate code to process the particular types of value vectors
> involved in the operation. I think at this step we should be able to add a
> cast to one of the parameters to direct to a function that defines an
> operation between two operands of the same type.
>
> Example:
> incoming types int, float
> - cast first parameter to a float
>
> Deciding which one to cast seems to be pretty standard, as seen here in
> the SQL Server documentation, which simply defines a strict hierarchy of types.
>
> http://technet.microsoft.com/en-us/library/ms190309.aspx
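
The strict-hierarchy rule described above could be sketched roughly as follows. This is only an illustration: the enum, its ordering, and the method names are assumptions made for the example, not Drill's actual type system or code generation API.

```java
public class ImplicitCastSketch {
    // Hypothetical precedence order, lowest to highest.
    // Assumed for illustration; not Drill's real type list.
    public enum NumericType { INT, BIGINT, FLOAT4, FLOAT8 }

    // The operand lower in the hierarchy is cast up to the higher one,
    // so the common type is simply the higher of the two.
    public static NumericType commonType(NumericType left, NumericType right) {
        return left.compareTo(right) >= 0 ? left : right;
    }

    // Describes the cast that code generation would add, if any.
    public static String planCast(NumericType left, NumericType right) {
        if (left == right) {
            return "no cast needed";
        }
        NumericType target = commonType(left, right);
        String which = (left != target) ? "first" : "second";
        return "cast " + which + " parameter to " + target;
    }

    public static void main(String[] args) {
        // Mirrors the "incoming types int, float" example above.
        System.out.println(planCast(NumericType.INT, NumericType.FLOAT4));
    }
}
```

With a rule like this, only same-type functions such as mult(float, float) need to exist; the mixed-type combinations are resolved before the function is called.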
>
> The only problem I could see with this approach is that the Drill Funcs
> take the value holders as parameters, so we will have to define casting
> rules between the various types. Not sure what this will do for code
> inlining. A major goal of the templates and code generation was allowing
> UDFs while keeping the whole system fast.
>
> It would also be possible to define additional methods on the various
> value vectors to allow extraction of values directly into different types,
> such as a double extraction method on the float vectors. This might aid
> inlining, as we handle a bit more of the logic while dealing with
> primitives (rather than pulling out a value, sticking it in a holder object
> and then casting the holder to a different object type).
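
A rough sketch of the direct-extraction idea, under heavy assumptions: the two vector classes below are simplified stand-ins written for this example, and getAsDouble is a hypothetical method name, not something the real value vectors are known to provide.

```java
public class VectorExtractionSketch {
    // Hypothetical stand-in for a 4-byte float value vector.
    public static final class Float4VectorSketch {
        private final float[] values;
        public Float4VectorSketch(float... values) { this.values = values.clone(); }
        public float get(int index) { return values[index]; }
        // Widening read: returns a double primitive directly,
        // with no intermediate holder object to cast.
        public double getAsDouble(int index) { return values[index]; }
    }

    // Hypothetical stand-in for an 8-byte float value vector.
    public static final class Float8VectorSketch {
        private final double[] values;
        public Float8VectorSketch(double... values) { this.values = values.clone(); }
        public double get(int index) { return values[index]; }
    }

    // The operation is defined only for operands of the same type.
    public static double multiply(double left, double right) {
        return left * right;
    }

    // Multiplies a float4 column by a float8 column, widening at read time.
    public static double[] multiplyColumns(Float4VectorSketch a, Float8VectorSketch b, int count) {
        double[] out = new double[count];
        for (int i = 0; i < count; i++) {
            out[i] = multiply(a.getAsDouble(i), b.get(i));
        }
        return out;
    }
}
```

Because the loop touches only primitives, it is the kind of code a JIT compiler can usually inline easily, which is the benefit hoped for above.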
>
> -Jason
>
>
> On Tue, Oct 1, 2013 at 1:06 AM, Yash Sharma <[EMAIL PROTECTED]
> >wrote:
>
> > Hi Team,
> > I had two questions regarding the implementation of Scalar Functions.
> >
> > 1. What would be the Output type of Division func (given: Input types
> > are all Integers)
> >
> > Currently I have provided an implementation of the DIVISION func which
> > has input/output params as :
> >         @Param  IntHolder left;
> >         @Param  IntHolder right;
> >         @Output IntHolder out;