Re: Query regarding Scalar Functions implementation
Hello All,

I would assume we would want to follow the conventions of most programming
languages. If users are interested in a decimal result, they would have to
explicitly cast one of the arguments to a float or float8.
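
To illustrate the convention I mean, here is a minimal sketch in plain
Java (not Drill code; the class and method names are made up for this
example). Dividing two ints truncates, and casting one operand first
gives the decimal result:

```java
// Illustrative only: plain Java arithmetic, not Drill's function API.
final class DivisionDemo {
    // Both operands are ints, so the quotient is truncated toward zero.
    static int intDivide(int left, int right) {
        return left / right;
    }

    // Casting one operand promotes the whole expression to double.
    static double castDivide(int left, int right) {
        return (double) left / right;
    }

    public static void main(String[] args) {
        System.out.println(intDivide(7, 2));   // prints 3
        System.out.println(castDivide(7, 2));  // prints 3.5
    }
}
```

The same reasoning would apply to a SQL-level cast on one of the
arguments before it reaches the division function.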

In regards to mismatched types, there are two ways I can think of doing it.
We could define a bunch of overloaded methods for each combination, but it
seems like we would have to define each twice for the different arrangements
of the types, such as with mult(float, float8) and mult(float8, float).
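
A rough sketch of what the overload route looks like, using plain Java
primitives as stand-ins (float for float4, double for float8; the class
name is hypothetical and this is not how Drill functions are declared).
Even with only two types, every mixed pair needs both arrangements:

```java
// Illustrative only: Java primitives stand in for Drill's float4/float8.
final class MultOverloads {
    // Same-type operations: one per type.
    static float mult(float left, float right) {
        return left * right;
    }

    static double mult(double left, double right) {
        return left * right;
    }

    // Mixed-type operations: one per ordered pair of distinct types.
    static double mult(float left, double right) {
        return mult((double) left, right);
    }

    static double mult(double left, float right) {
        return mult(left, (double) right);
    }
}
```

With n types this grows to n * n signatures per operator, and the mixed
ones do nothing except cast and delegate.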

I think the way we will want to do it is to add additional logic to the
code generation portion of the query, rather than define a bunch of
different overloaded methods.

For example, as new batches arrive at an operator, if they have a new
schema we generate code to process the particular types of value vectors
involved in the operation. I think at this step we should be able to add a
cast to one of the parameters to direct to a function that defines an
operation between two operands of the same type.

incoming types int, float
- cast first parameter to a float

Deciding which one to cast seems to be pretty standard, as seen in the
SQL Server documentation. They just define a strict hierarchy of types.
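
As a minimal sketch of that idea, assuming a hypothetical four-type
hierarchy (the enum, its ordering, and the method names below are
invented for illustration and are not Drill's type system), picking the
operand to cast comes down to comparing positions in the hierarchy:

```java
// Illustrative only: a made-up precedence list, lowest to highest.
final class TypePromotion {
    enum NumericType { INT, BIGINT, FLOAT4, FLOAT8 }

    // The common type is whichever operand type sits higher in the hierarchy.
    static NumericType commonType(NumericType left, NumericType right) {
        return left.compareTo(right) >= 0 ? left : right;
    }

    // Index of the operand that needs a cast: 0 = first, 1 = second,
    // -1 = neither (the types already match).
    static int operandToCast(NumericType left, NumericType right) {
        int cmp = left.compareTo(right);
        if (cmp == 0) {
            return -1;
        }
        return cmp < 0 ? 0 : 1;
    }
}
```

For incoming types (INT, FLOAT4) this selects the first operand and
FLOAT4 as the target, matching the int, float case above. The generated
code would then only ever call same-type functions.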


The only problem I could see with this approach is that the Drill Funcs
take the value holders as parameters, so we will have to define casting
rules between the various types. Not sure what this will do for code
inlining. A major goal of the templates and code generation was allowing
UDFs while keeping the whole system fast.

It would also be possible to define additional methods on the various value
vectors to allow extraction of values directly into different types, such
as a double extraction method on the float vectors. This might aid
inlining, as we handle a bit more of the logic while dealing with
primitives (rather than pulling out a value, sticking it in a holder object
and then casting the holder to a different object type).
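
Here is a rough sketch of the two routes side by side. Everything in it
is a local stand-in written for this example (F4Holder, F8Holder,
Float4Column and their methods are hypothetical, not Drill's holder or
value vector classes):

```java
// Illustrative only: stand-in classes, not Drill's vectors or holders.
final class VectorSketch {
    static final class F4Holder { float value; }
    static final class F8Holder { double value; }

    static final class Float4Column {
        private final float[] data;

        Float4Column(float... data) {
            this.data = data.clone();
        }

        // Holder route: copy the value into a holder object.
        void get(int index, F4Holder out) {
            out.value = data[index];
        }

        // Direct route: widen the primitive on the way out, no holder involved.
        double getAsDouble(int index) {
            return data[index];
        }
    }

    // Holder-to-holder cast, needed only on the holder route.
    static F8Holder cast(F4Holder in) {
        F8Holder out = new F8Holder();
        out.value = in.value;
        return out;
    }
}
```

The direct route touches only primitives, which is why I suspect it
would be friendlier to inlining, though I have not measured it.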

On Tue, Oct 1, 2013 at 1:06 AM, Yash Sharma <[EMAIL PROTECTED]> wrote:

> Hi Team,
> I had two questions regarding the implementation of Scalar Functions.
> 1. What would be the Output type of Division func (given: Input types are
> all Integers)
> Currently I have provided an implementation of the DIVISION func which has
> input/output params as :
>         @Param  IntHolder left;
>         @Param  IntHolder right;
>         @Output IntHolder out;
> now, the issue is the data type of the output field:
> the output type will be integer if left & right are divisible integers,
> while the output type would be decimal if left & right are non-divisible
> integers (i.e. have a remainder)
> So my question is,
> Do I have to provide 3 overloaded methods for division with different
> @Output types (IntHolder, Float4Holder, Float8Holder)?
> Or shall I have a Float8 output type irrespective of the inputs?
> Other functions like add/multiply & subtract won't have this issue.
> It's only an issue with division.
> 2. What would be the input type for any Scalar func (given: Input types
> might not always be Integers).
> Inputs would also be of different data types as Float4Holder &
> Float8Holder, so would we have to provide overloaded methods for different
> combinations of input types?
> This would be the case with all scalar functions (+, -, *, /).
> Any Suggestions?
> Thanks,
> Yash Sharma