Drill, mail # dev - B[yi]teSize execwork tasks someone could potentially help out with...

Re: B[yi]teSize execwork tasks someone could potentially help out with...
Timothy Chen 2013-04-26, 04:51
So if no one picks anything up you will be done with all the work in the next couple of days? :)

Would like to help out, but I'm traveling to LA over the weekend.

I'll sync with you Monday to see how I can help then.



On Apr 25, 2013, at 9:06 PM, Jacques Nadeau <[EMAIL PROTECTED]> wrote:

> I'm working on the execwork stuff and if someone would like to help out,
> here are a couple of things that need doing.  I figured I'd drop them here
> and see if anyone wants to work on them in the next couple of days.  If so,
> let me know; otherwise I'll be picking them up soon.
> *RPC*
> - RPC Layer Handshakes: Currently, I haven't implemented the handshake that
> should happen in either the User <> Bit or the Bit <> Bit layer.  The plan
> was to use an additional inserted event handler that removed itself from
> the event pipeline after a successful handshake or disconnected the channel
> on a failed handshake (with appropriate logging).  The main validation at
> this point will be simply confirming that both endpoints are running on the
> same protocol version.   The only other information that is currently
> needed is that, in the Bit <> Bit communication, the client should
> inform the server of its DrillEndpoint so that the server can then map that
> for future communication in the other direction.
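
A minimal sketch of the handshake plan described above, written in Python for brevity rather than against the actual RPC layer. Every name here (`Channel`, `HandshakeHandler`, `Handshake`, `PROTOCOL_VERSION`, the endpoint string) is hypothetical and only illustrates the idea: a handler that sits first in the pipeline, checks the protocol version, records the client's endpoint, and then removes itself, or closes the channel with a log entry on failure.

```python
from dataclasses import dataclass
from typing import Optional

PROTOCOL_VERSION = 1  # hypothetical version number


@dataclass
class Handshake:
    version: int
    endpoint: Optional[str] = None  # the client's DrillEndpoint (Bit <> Bit only)


class Channel:
    """Toy stand-in for a network channel with an ordered handler pipeline."""

    def __init__(self):
        self.pipeline = []
        self.open = True
        self.log = []

    def close(self):
        self.open = False

    def receive(self, message):
        if not self.open:
            return
        # Iterate over a copy: a handler may remove itself while running.
        for handler in list(self.pipeline):
            message = handler.handle(self, message)
            if message is None or not self.open:
                break


class HandshakeHandler:
    """Validates the first message, then removes itself from the pipeline."""

    def __init__(self, expected_version, endpoints):
        self.expected_version = expected_version
        self.endpoints = endpoints  # server-side map: endpoint -> channel

    def handle(self, channel, message):
        ok = isinstance(message, Handshake) and message.version == self.expected_version
        if not ok:
            channel.log.append(f"handshake failed: {message!r}")
            channel.close()
            return None
        if message.endpoint is not None:
            # Remember the client so the server can reach it later.
            self.endpoints[message.endpoint] = channel
        channel.pipeline.remove(self)
        return None  # the handshake message is consumed here


class Recorder:
    """Stand-in for the normal message handlers further down the pipeline."""

    def __init__(self):
        self.seen = []

    def handle(self, channel, message):
        self.seen.append(message)
        return message
```

After a successful handshake the pipeline contains only the regular handlers, so later messages pay no handshake cost. A version mismatch leaves the channel closed with one log entry.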
> *DataTypes*
> - General Expansion: Currently, we have a hodgepodge of datatypes within
> the org.apache.drill.common.expression.types.DataType.  We need to clean
> this up.  There should be types that map to standard sql types.  My
> thinking is that we should actually have separate types for each of
> nullable, non-nullable and repeated (optional, required and repeated in
> protobuf vernacular) since we'll generally operate with those values
> completely differently (and that each type should reveal which it is).  We
> should also have a relationship mapping from each to the other (e.g. how to
> convert a signed 32 bit int into a nullable signed 32 bit int).
> - Map Types: We don't need nullable but we will need different map types:
> inline and fieldwise.  I think these will be useful for the execution engine
> and will be leveraged depending on the particular needs.  For example,
> fieldwise will be a natural fit where we're operating on columnar data and
> doing an explode or other fieldwise nested operation and inline will be
> useful when we're doing things like sorting a complex field.  Inline will
> also be appropriate where we have extremely sparse record sets.  We'll just
> need transformation methods between the two variations.  In the case of a
> fieldwise map type field, the field is virtual and only exists to contain
> its child fields.
> - Non-static DataTypes: We need types that don't fit the static data
> type model above.  Examples include fixed width types (e.g. 10 byte
> string), polymorphic (inline encoded) types (number or string depending on
> record) and repeated nested versions of our other types.  These are a
> little more gnarly as we need to support canonicalization of these.  Optiq
> has some methods for how to handle this kind of type system so it probably
> makes sense to leverage that system.
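
A rough sketch of two of the DataTypes ideas above, in Python for brevity: a type that reveals its own mode (required, optional or repeated) with conversions between modes, and a transformation between inline and fieldwise map layouts. The names `DataMode`, `DataType`, `convert`, `inline_to_fieldwise` and `fieldwise_to_inline` are hypothetical and do not refer to existing Drill classes. The map functions assume that an absent key and a null value mean the same thing, which a real implementation may not want.

```python
from dataclasses import dataclass
from enum import Enum


class DataMode(Enum):
    REQUIRED = "required"  # non-nullable
    OPTIONAL = "optional"  # nullable
    REPEATED = "repeated"


@dataclass(frozen=True)
class DataType:
    name: str       # hypothetical base type name, e.g. "INT32"
    mode: DataMode  # each type reveals which mode it is

    def with_mode(self, mode):
        return DataType(self.name, mode)


def convert(value, source, target):
    """Convert a value between modes of the same base type."""
    if source.name != target.name:
        raise TypeError(f"cannot convert {source.name} to {target.name}")
    if source.mode is target.mode:
        return value
    if target.mode is DataMode.REPEATED:
        # A scalar becomes a one-element list; a null becomes an empty list.
        return [] if value is None else [value]
    if source.mode is DataMode.REPEATED:
        raise ValueError("cannot narrow a repeated value to a scalar")
    if target.mode is DataMode.REQUIRED and value is None:
        raise ValueError("null value in a required field")
    return value


def inline_to_fieldwise(records, fields):
    """Row-oriented maps -> one column per child field (missing keys become None)."""
    return {f: [r.get(f) for r in records] for f in fields}


def fieldwise_to_inline(columns):
    """One column per child field -> row-oriented maps (None entries are dropped)."""
    length = len(next(iter(columns.values()), []))
    return [
        {name: col[i] for name, col in columns.items() if col[i] is not None}
        for i in range(length)
    ]
```

In this sketch the fieldwise form has no storage of its own beyond its child columns, which mirrors the point that a fieldwise map field is virtual. The inline form stays compact for sparse records because absent fields take no space.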
> *Expression Type Materialization*
> - LogicalExpression type materialization: Right now, LogicalExpressions
> include support for late type binding.  As part of the record batch
> execution path, these need to get materialized with correct casting, etc.,
> based on the actual found schema.  As such, we need to have a function
> which takes a LogicalExpression tree, applies a materialized BatchSchema
> and returns a new LogicalExpression tree with full type settings.  As part
> of this process, all types need to be cast as necessary and full validation
> of the tree should be done.  Timothy has pending validation work in a
> pull request that would be a good piece of code to leverage for that
> need.  We also have a visitor model for the expression tree
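
One possible shape for the materialization function described above, sketched in Python. It takes a late-bound expression tree plus a schema and returns a new tree with every type filled in and casts inserted where operand types differ. The node classes, the type names, the `WIDENING` order and the dict-based schema are all assumptions made for illustration. They are not the actual LogicalExpression or BatchSchema APIs, and the sketch uses plain `isinstance` dispatch rather than a visitor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldRef:
    name: str
    type: Optional[str] = None  # unknown until bound to a schema


@dataclass(frozen=True)
class Literal:
    value: object
    type: str


@dataclass(frozen=True)
class Cast:
    child: object
    type: str


@dataclass(frozen=True)
class Add:
    left: object
    right: object
    type: Optional[str] = None


# Hypothetical numeric widening order, narrowest first.
WIDENING = ["INT32", "INT64", "FLOAT64"]


def _cast(expr, target):
    """Wrap expr in a Cast only if its type differs from the target."""
    return expr if expr.type == target else Cast(expr, target)


def materialize(expr, schema):
    """Return a new, fully typed tree; the input tree is left untouched."""
    if isinstance(expr, Literal):
        return expr
    if isinstance(expr, FieldRef):
        if expr.name not in schema:
            raise ValueError(f"unknown field: {expr.name}")
        return FieldRef(expr.name, schema[expr.name])
    if isinstance(expr, Add):
        left = materialize(expr.left, schema)
        right = materialize(expr.right, schema)
        for side in (left, right):
            if side.type not in WIDENING:
                raise TypeError(f"non-numeric operand of type {side.type}")
        # Pick the wider of the two operand types and cast the other to it.
        target = max(left.type, right.type, key=WIDENING.index)
        return Add(_cast(left, target), _cast(right, target), target)
    raise TypeError(f"unsupported expression: {expr!r}")
```

For example, materializing `Add(FieldRef("a"), Literal(1, "INT32"))` against a schema where `a` is `INT64` yields an `Add` typed `INT64` whose right operand is wrapped in a `Cast`. Validation failures (an unknown field, a non-numeric operand) surface as exceptions during the same pass.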