Drill, mail # dev - Query regarding Scalar Functions implementation


Re: Query regarding Scalar Functions implementation
Jacques Nadeau 2013-10-02, 15:46
Jason brings up an important point.  We don't yet have implicit casting at
any level.  We need to incorporate it.
On Tue, Oct 1, 2013 at 3:57 PM, Jason Altekruse <[EMAIL PROTECTED]> wrote:

> Hello All,
>
> I would assume we would want to follow the conventions of most programming
> languages. If users are interested in a decimal result, they would have to
> explicitly cast one of the arguments to a float or float8.
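
To make the convention above concrete, here is a minimal sketch in plain Java, assuming the division function keeps a fixed output type per signature. The class and method names (DivisionSketch, divInt, divFloat8) are hypothetical and are not part of Drill; they only show that an all-integer division truncates, and that a fractional result requires widening an operand first.

```java
// Hypothetical helper, not Drill code: contrasts the two fixed output types.
final class DivisionSketch {
    private DivisionSketch() {}

    // int / int stays an int; Java truncates the quotient toward zero.
    static int divInt(int left, int right) {
        return left / right;
    }

    // Widening one operand before dividing yields a fractional result,
    // which is what an explicit cast to float8 would request.
    static double divFloat8(int left, int right) {
        return (double) left / right;
    }
}
```

With inputs 7 and 2, divInt returns 3 and divFloat8 returns 3.5. The output type depends only on the declared signature, never on whether the particular values happen to divide evenly.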
>
> With regard to mismatched types, there are two ways I can think of doing it.
> We could define a bunch of overloaded methods for each combination, but it
> seems like we would have to define each twice for the different arrangements
> of the types, such as mult(float, float8) and mult(float8, float).
>
> I think the way we will want to do it is add additional logic to the code
> generation portion of the query, rather than define a bunch of different
> functions.
>
> For example, as new batches arrive at an operator, if they have a new
> schema we generate code to process the particular types of value vectors
> involved in the operation. I think at this step we should be able to add a
> cast to one of the parameters to direct to a function that defines an
> operation between two operands of the same type.
>
> Example:
> incoming types int, float
> - cast first parameter to a float
>
> Deciding which one to cast seems to be pretty standard, as seen here in the
> SQL Server documentation. They just define a strict hierarchy of types.
>
> http://technet.microsoft.com/en-us/library/ms190309.aspx
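
The strict-hierarchy idea can be sketched as follows. This is an illustration under stated assumptions, not Drill's code generator: the enum NumericType, its three members and their ranks, and the class ImplicitCastPlanner are all hypothetical names. The sketch picks the higher-ranked type as the common type and reports which operand would need a cast before calling a same-type function.

```java
// Hypothetical type hierarchy; a higher rank wins when operand types differ.
enum NumericType {
    INT(0), FLOAT4(1), FLOAT8(2);

    final int rank;

    NumericType(int rank) {
        this.rank = rank;
    }
}

// Hypothetical planner: decides where a cast would be injected.
final class ImplicitCastPlanner {
    private ImplicitCastPlanner() {}

    // The common type is the operand type with the higher rank.
    static NumericType commonType(NumericType left, NumericType right) {
        return left.rank >= right.rank ? left : right;
    }

    // Renders the call that would be made after the cast is inserted.
    static String plan(String fn, NumericType left, NumericType right) {
        NumericType target = commonType(left, right);
        String l = left == target ? "left" : "cast(left as " + target + ")";
        String r = right == target ? "right" : "cast(right as " + target + ")";
        return fn + "(" + l + ", " + r + ")";
    }
}
```

For the int and float example above, plan("mult", NumericType.INT, NumericType.FLOAT4) produces mult(cast(left as FLOAT4), right), so only one same-type mult(float, float) function has to exist. Because the rule is symmetric, the reversed argument order needs no second overload.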
>
> The only problem I could see with this approach is that the Drill Funcs
> take the value holders as parameters, so we will have to define casting
> rules between the various types. Not sure what this will do for code
> inlining. A major goal of the templates and code generation was allowing
> UDFs while keeping the whole system fast.
>
> It would also be possible to define additional methods on the various value
> vectors to allow extraction of values directly into different types, such
> as a double extraction method on the float vectors. This might aid
> inlining, as we handle a bit more of the logic while dealing with
> primitives (rather than pulling out a value, sticking it in a holder object
> and then casting the holder to a different object type).
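
The two routes described above (cast between holders versus extract directly into the wider primitive) might look roughly like this. All classes here are toy stand-ins with hypothetical names (ToyFloat4Vector, ToyFloat4Holder, ToyFloat8Holder, HolderCasts) and do not reflect the actual value vector or holder APIs in Drill.

```java
// Toy holders: simple mutable wrappers around one primitive value.
final class ToyFloat4Holder { float value; }
final class ToyFloat8Holder { double value; }

// Toy vector of 4-byte floats backed by a plain array.
final class ToyFloat4Vector {
    private final float[] data;

    ToyFloat4Vector(float... data) {
        this.data = data.clone();
    }

    // Holder route: copy the value into a holder of the vector's own type.
    void get(int index, ToyFloat4Holder holder) {
        holder.value = data[index];
    }

    // Direct route: widen to double while still working with primitives.
    double getAsDouble(int index) {
        return data[index];
    }
}

// Casting rule between holders, needed only on the holder route.
final class HolderCasts {
    private HolderCasts() {}

    static void cast(ToyFloat4Holder in, ToyFloat8Holder out) {
        out.value = in.value;
    }
}
```

Both routes yield the same double, but the direct route skips the intermediate holder and the extra cast call. Whether that actually helps inlining in generated code is the open question raised above and would need to be measured.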
>
> -Jason
>
>
> On Tue, Oct 1, 2013 at 1:06 AM, Yash Sharma <[EMAIL PROTECTED]> wrote:
>
> > Hi Team,
> > I had two questions regarding the implementation of scalar functions.
> >
> > 1. What would be the output type of the division func (given that the
> > input types are all integers)?
> >
> > Currently I have provided an implementation of the DIVISION func which has
> > input/output params as:
> >         @Param  IntHolder left;
> >         @Param  IntHolder right;
> >         @Output IntHolder out;
> >
> > Now, the issue is the data type of the output field:
> > the output type would be integer if left is evenly divisible by right, while
> > the output type would be decimal if it is not (i.e. the division leaves a
> > remainder).
> >
> > So my question is,
> > Do I have to provide 3 overloaded methods for division with different
> > @Output types (IntHolder, Float4Holder, Float8Holder)?
> > Or shall I have a Float8 output type irrespective of the inputs?
> >
> > Other functions like add, multiply, and subtract won't have this issue.
> > It's only an issue with division.
> >
> >
> > 2. What would be the input type for any scalar func (given that the input
> > types might not always be integers)?
> >
> > Inputs could also be of different data types, such as Float4Holder and
> > Float8Holder, so would we have to provide overloaded methods for different
> > combinations of input types?
> > This would be the case with all scalar functions (+, -, *, /).
> >
> > Any Suggestions?
> > Thanks,
> > Yash Sharma