Drill >> mail # dev >> B[yi]teSize execwork tasks someone could potentially help out with...


Re: B[yi]teSize execwork tasks someone could potentially help out with...
Hi Jacques

I can take the RPC stuff.
Have you made any progress in Bit<>Bit comms?

Best
David

On Apr 25, 2013, at 11:06 PM, Jacques Nadeau <[EMAIL PROTECTED]> wrote:

> I'm working on the execwork stuff and if someone would like to help out,
> here are a couple of things that need doing.  I figured I'd drop them here
> and see if anyone wants to work on them in the next couple of days.  If so,
> let me know; otherwise I'll be picking them up soon.
>
> *RPC*
> - RPC Layer Handshakes: Currently, I haven't implemented the handshake that
> should happen in either the User <> Bit or the Bit <> Bit layer.  The plan
> was to use an additional inserted event handler that removed itself from
> the event pipeline after a successful handshake or disconnected the channel
> on a failed handshake (with appropriate logging).  The main validation at
> this point will be simply confirming that both endpoints are running on the
> same protocol version.  The only other information that is currently
> needed is that in the Bit <> Bit communication, the client should
> inform the server of its DrillEndpoint so that the server can then map that
> for future communication in the other direction.
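
The self-removing handshake handler described above can be sketched roughly as follows. This is a minimal, library-free illustration, not Drill's RPC code: `Channel`, `Handler`, `Handshake`, `HandshakeHandler`, and the version constant are all hypothetical stand-ins for whatever the real event pipeline provides, and the endpoint is modeled as a plain string rather than a DrillEndpoint.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an event pipeline; none of these names come from Drill.
final class HandshakeSketch {
    static final int PROTOCOL_VERSION = 1; // assumed version number

    /** Handshake message: protocol version plus the sender's endpoint (Bit <> Bit only). */
    static final class Handshake {
        final int version;
        final String endpoint; // stands in for DrillEndpoint; may be null for User <> Bit
        Handshake(int version, String endpoint) {
            this.version = version;
            this.endpoint = endpoint;
        }
    }

    interface Handler {
        void onMessage(Channel ch, Object msg);
    }

    static final class Channel {
        final List<Handler> pipeline = new ArrayList<>();
        boolean open = true;
        String remoteEndpoint; // recorded so the server can later talk back to the client

        /** Delivers a message to the handler at the head of the pipeline. */
        void receive(Object msg) {
            if (open && !pipeline.isEmpty()) {
                pipeline.get(0).onMessage(this, msg);
            }
        }

        void close() {
            open = false;
        }
    }

    /** Sits at the head of the pipeline until the handshake succeeds or fails. */
    static final class HandshakeHandler implements Handler {
        @Override
        public void onMessage(Channel ch, Object msg) {
            if (msg instanceof Handshake && ((Handshake) msg).version == PROTOCOL_VERSION) {
                ch.remoteEndpoint = ((Handshake) msg).endpoint;
                ch.pipeline.remove(this); // success: step out of the way
            } else {
                System.err.println("Handshake failed, closing channel");
                ch.close(); // failure: log and disconnect
            }
        }
    }

    /** Example application handler that only sees post-handshake traffic. */
    static final class RecordingHandler implements Handler {
        final List<Object> seen = new ArrayList<>();
        @Override
        public void onMessage(Channel ch, Object msg) {
            seen.add(msg);
        }
    }
}
```

Once the handshake handler has removed itself, later messages go straight to the next handler, so the version check costs nothing on the steady-state path.
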
>
> *DataTypes*
> - General Expansion: Currently, we have a hodgepodge of datatypes within
> the org.apache.drill.common.expression.types.DataType.  We need to clean
> this up.  There should be types that map to standard sql types.  My
> thinking is that we should actually have separate types for each of
> nullable, non-nullable and repeated (optional, required and repeated in
> protobuf vernacular) since we'll generally operate with those values
> completely differently (and that each type should reveal which it is).  We
> should also have a relationship mapping from each to the other (e.g. how to
> convert a signed 32 bit int into a nullable signed 32 bit int).
>
> - Map Types: We don't need nullable but we will need different map types:
> inline and fieldwise.  I think these will be useful for the execution engine
> and will be leveraged depending on the particular needs: for example,
> fieldwise will be a natural fit where we're operating on columnar data and
> doing an explode or other fieldwise nested operation and inline will be
> useful when we're doing things like sorting a complex field.  Inline will
> also be appropriate where we have extremely sparse record sets.  We'll just
> need transformation methods between the two variations.  In the case of a
> fieldwise map type field, the field is virtual and only exists to contain
> its child fields.
>
> - Non-static DataTypes: We need types that don't fit the static data
> type model above.  Examples include fixed width types (e.g. 10 byte
> string), polymorphic (inline encoded) types (number or string depending on
> record) and repeated nested versions of our other types.  These are a
> little more gnarly as we need to support canonicalization of these.  Optiq
> has some methods for how to handle this kind of type system so it probably
> makes sense to leverage that system.
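
The General Expansion item above asks for separate nullable, non-nullable and repeated variants of each type, with each type revealing which it is and a mapping between the variants. One possible shape is sketched below. The enum names and the two base types are assumptions made for illustration; they are not the contents of `org.apache.drill.common.expression.types.DataType`.

```java
// Hypothetical sketch: one enum constant per (base type, mode) pair.
final class DataModeSketch {
    /** Mirrors protobuf's required / optional / repeated. */
    enum DataMode { REQUIRED, OPTIONAL, REPEATED }

    enum SketchType {
        INT32("INT32", DataMode.REQUIRED),
        NULLABLE_INT32("INT32", DataMode.OPTIONAL),
        REPEATED_INT32("INT32", DataMode.REPEATED),
        VARCHAR("VARCHAR", DataMode.REQUIRED),
        NULLABLE_VARCHAR("VARCHAR", DataMode.OPTIONAL),
        REPEATED_VARCHAR("VARCHAR", DataMode.REPEATED);

        final String base;   // the underlying SQL-like type
        final DataMode mode; // each type reveals which variant it is

        SketchType(String base, DataMode mode) {
            this.base = base;
            this.mode = mode;
        }

        /** Maps this type to its sibling with the requested mode. */
        SketchType withMode(DataMode target) {
            for (SketchType t : values()) {
                if (t.base.equals(base) && t.mode == target) {
                    return t;
                }
            }
            throw new IllegalStateException("No " + target + " variant of " + base);
        }
    }
}
```

For example, `SketchType.INT32.withMode(DataMode.OPTIONAL)` yields `NULLABLE_INT32`, which is the "signed 32 bit int into a nullable signed 32 bit int" mapping mentioned in the message. Converting the values themselves would need a companion routine per pair of modes.
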
>
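
For the inline and fieldwise map types in the Map Types item above, the transformation methods might look something like this. The sketch assumes one reading of the message: "inline" keeps each record's map entries together, while "fieldwise" stores each child field as its own column and the map field itself is only a virtual container. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of converting between the two map layouts.
final class MapLayoutSketch {
    /** Inline (one map per record) to fieldwise (one column per child field). */
    static Map<String, List<Object>> toFieldwise(List<Map<String, Object>> inline) {
        Map<String, List<Object>> columns = new LinkedHashMap<>();
        for (int row = 0; row < inline.size(); row++) {
            for (Map.Entry<String, Object> e : inline.get(row).entrySet()) {
                List<Object> column = columns.get(e.getKey());
                if (column == null) {
                    // New child field: pre-fill with nulls for rows that lack it.
                    column = new ArrayList<>(Collections.<Object>nCopies(inline.size(), null));
                    columns.put(e.getKey(), column);
                }
                column.set(row, e.getValue());
            }
        }
        return columns;
    }

    /** Fieldwise back to inline; null cells are treated as "field absent". */
    static List<Map<String, Object>> toInline(Map<String, List<Object>> columns, int rowCount) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (int row = 0; row < rowCount; row++) {
            Map<String, Object> record = new LinkedHashMap<>();
            for (Map.Entry<String, List<Object>> e : columns.entrySet()) {
                Object value = e.getValue().get(row);
                if (value != null) {
                    record.put(e.getKey(), value);
                }
            }
            rows.add(record);
        }
        return rows;
    }
}
```

Note that this sketch cannot distinguish an explicit null value from a missing field. It also shows why inline suits very sparse record sets: the fieldwise form pays for a null cell in every row that lacks a field.
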
> *Expression Type Materialization*
> - LogicalExpression type materialization: Right now, LogicalExpressions
> include support for late type binding.  As part of the record batch
> execution path, these need to get materialized with correct casting, etc
> based on the actual found schema.  As such, we need to have a function
> which takes a LogicalExpression tree, applies a materialized BatchSchema
> and returns a new LogicalExpression tree with full type settings.  As part
> of this process, all types need to be cast as necessary and full validation
> of the tree should be done.  Timothy has pending validation work on a
> pull request that would be a good piece of code to leverage for that
> need.  We also have a visitor model for the expression tree
> that should be able to aid in the updated LogicalExpression construction.
> - LogicalExpression to Java expression conversion: We need to be able to
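
The type materialization item above (a late-bound expression tree plus a concrete schema in, a new fully typed tree with casts out) could be approached with a visitor along these lines. Everything here is a simplified assumption for illustration: `Expr`, `FieldRef`, `Add`, `Cast`, and the two-value `Type` enum are hypothetical and are not Drill's LogicalExpression or BatchSchema classes, and the schema is modeled as a plain map from field name to type.

```java
import java.util.Map;

// Hypothetical sketch of type materialization over a tiny expression tree.
final class MaterializeSketch {
    enum Type { INT, FLOAT }

    interface Expr {
        Type type(); // null while the type is still late-bound
        <T> T accept(Visitor<T> v);
    }

    interface Visitor<T> {
        T visitField(FieldRef f);
        T visitAdd(Add a);
        T visitCast(Cast c);
    }

    static final class FieldRef implements Expr {
        final String name;
        final Type type;
        FieldRef(String name, Type type) { this.name = name; this.type = type; }
        @Override public Type type() { return type; }
        @Override public <T> T accept(Visitor<T> v) { return v.visitField(this); }
    }

    static final class Add implements Expr {
        final Expr left;
        final Expr right;
        final Type type;
        Add(Expr left, Expr right, Type type) { this.left = left; this.right = right; this.type = type; }
        @Override public Type type() { return type; }
        @Override public <T> T accept(Visitor<T> v) { return v.visitAdd(this); }
    }

    static final class Cast implements Expr {
        final Expr input;
        final Type to;
        Cast(Expr input, Type to) { this.input = input; this.to = to; }
        @Override public Type type() { return to; }
        @Override public <T> T accept(Visitor<T> v) { return v.visitCast(this); }
    }

    /** Rebuilds the tree bottom-up, binding field types and inserting casts. */
    static final class Materializer implements Visitor<Expr> {
        private final Map<String, Type> schema;
        Materializer(Map<String, Type> schema) { this.schema = schema; }

        @Override public Expr visitField(FieldRef f) {
            Type t = schema.get(f.name);
            if (t == null) {
                throw new IllegalArgumentException("Unknown field: " + f.name); // validation
            }
            return new FieldRef(f.name, t);
        }

        @Override public Expr visitAdd(Add a) {
            Expr l = a.left.accept(this);
            Expr r = a.right.accept(this);
            // Widen to FLOAT if either side is FLOAT.
            Type result = (l.type() == Type.FLOAT || r.type() == Type.FLOAT) ? Type.FLOAT : Type.INT;
            return new Add(castTo(l, result), castTo(r, result), result);
        }

        @Override public Expr visitCast(Cast c) {
            return new Cast(c.input.accept(this), c.to);
        }

        private static Expr castTo(Expr e, Type target) {
            return e.type() == target ? e : new Cast(e, target);
        }
    }

    static Expr materialize(Expr tree, Map<String, Type> schema) {
        return tree.accept(new Materializer(schema));
    }
}
```

Given a schema where `a` is INT and `b` is FLOAT, materializing `a + b` returns a new FLOAT-typed `Add` whose left operand is wrapped in a `Cast`, while the input tree is left untouched.
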