Drill >> mail # dev >> Initial pass at data types

Jacques Nadeau 2013-06-05, 17:27
Re: Initial pass at data types
That's a really long list! Do we really need, e.g., MONEY? Getting the whole list working will require a lot of QA.

Which levels of the system do these apply to? I could imagine several logical types mapping to the same physical type (e.g. timestamp and uint8 both mapping to the physical type uint8).

Some of the user types could be parameterized (e.g. DECIMAL(10, 3)) and mapped onto physical types (e.g. uint8 or binary(5)). Then we avoid having to support every fixed-width variant (decimal4, decimal8, etc.).
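To make the parameterized idea concrete, here is a rough sketch of how a logical DECIMAL(precision, scale) could be mapped onto a physical width. The class name, the method names and the "int4" / "int8" / "binary(n)" labels are all made up for illustration, and I am assuming signed two's-complement storage of the unscaled value; none of this comes from the posted type list.

```java
import java.math.BigInteger;

/** Hypothetical sketch: map a logical DECIMAL(precision, scale) to a physical width. */
final class DecimalMapping {

    /** Fewest bytes that can hold any signed unscaled value of `precision` decimal digits. */
    static int minBytes(int precision) {
        if (precision < 1) {
            throw new IllegalArgumentException("precision must be >= 1");
        }
        // Largest magnitude is 10^precision - 1; add one bit for the sign.
        BigInteger max = BigInteger.TEN.pow(precision).subtract(BigInteger.ONE);
        int bits = max.bitLength() + 1;
        return (bits + 7) / 8; // round up to whole bytes
    }

    /** Picks an illustrative physical type name; the scale does not affect the width. */
    static String physicalType(int precision, int scale) {
        if (scale < 0 || scale > precision) {
            throw new IllegalArgumentException("scale must be in [0, precision]");
        }
        int bytes = minBytes(precision);
        if (bytes <= 4) {
            return "int4";
        }
        if (bytes <= 8) {
            return "int8";
        }
        return "binary(" + bytes + ")";
    }

    public static void main(String[] args) {
        System.out.println(minBytes(10));         // 5
        System.out.println(physicalType(10, 3));  // int8
        System.out.println(physicalType(38, 10)); // binary(16)
    }
}
```

Under these assumptions DECIMAL(10, 3) needs 5 bytes, so it fits either in an 8-byte integer or in a 5-byte binary, and the scale only has to be carried as metadata on the logical type.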

For the user type system, please strongly consider making it a superset of the SQL types [see http://docs.oracle.com/javase/7/docs/api/constant-values.html#java.sql.Types.ARRAY], both in terms of semantics and in terms of the enum codes. The SQL system has space above 1000 for extensions.

Then simply implement the same subset of the SQL standard that another database implements.
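As a sketch of what "superset of the SQL types" could look like at the enum level: the standard entries below reuse the java.sql.Types constants directly, and the extension entries get codes of their own. The enum name, the UINT8 and MAP entries, and the 3001/3002 codes are purely hypothetical. Note that java.sql.Types itself already uses some values above 1000 (OTHER is 1111, ARRAY is 2003), so this sketch starts extensions at 3000 to stay clear of them.

```java
import java.sql.Types;

/** Hypothetical sketch: user type codes as a superset of java.sql.Types. */
enum UserType {
    // Standard SQL types keep the JDBC codes unchanged.
    INTEGER(Types.INTEGER),
    BIGINT(Types.BIGINT),
    DECIMAL(Types.DECIMAL),
    VARCHAR(Types.VARCHAR),
    TIMESTAMP(Types.TIMESTAMP),
    ARRAY(Types.ARRAY),

    // Illustrative extensions; these names and codes are assumptions, not a proposal.
    UINT8(3001),
    MAP(3002);

    private final int code;

    UserType(int code) {
        this.code = code;
    }

    int code() {
        return code;
    }

    /** True for the made-up extension range used in this sketch. */
    boolean isExtension() {
        return code >= 3000;
    }

    /** Looks a type up by its numeric code. */
    static UserType fromCode(int code) {
        for (UserType t : values()) {
            if (t.code == code) {
                return t;
            }
        }
        throw new IllegalArgumentException("unknown type code: " + code);
    }

    public static void main(String[] args) {
        System.out.println(fromCode(Types.TIMESTAMP)); // TIMESTAMP
        System.out.println(UINT8.isExtension());       // true
        System.out.println(DECIMAL.isExtension());     // false
    }
}
```

The point of sharing the codes is that anything speaking JDBC can pass a standard type through without a translation table; only the extension codes need special handling.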

Note that SQL's timestamp semantics are different from, say, Java's. A Java timestamp is relative to the UTC epoch. A SQL timestamp has no timezone -- the interpretation is left to the reader/writer. JDBC does its best to translate on the way in and out based on the JVM's timezone. I don't claim that SQL's system is better or worse.
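A small illustration of that difference, using java.time (Java 8 and later) rather than anything in Drill. I am using LocalDateTime as a stand-in for a zone-less SQL timestamp and Instant for an epoch-relative one; the class and method names are invented for the example, and the -7 hour offset is just an arbitrary reader's zone.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

/** Hypothetical sketch: one zone-less timestamp, two different instants. */
final class TimestampSemantics {

    /** Resolves a zone-less wall-clock value to a point on the UTC timeline. */
    static Instant interpret(LocalDateTime sqlTimestamp, ZoneId readerZone) {
        return sqlTimestamp.atZone(readerZone).toInstant();
    }

    public static void main(String[] args) {
        // The stored value carries no zone, as with a SQL TIMESTAMP.
        LocalDateTime stored = LocalDateTime.of(2013, 6, 5, 17, 27);

        Instant readAsUtc = interpret(stored, ZoneOffset.UTC);
        Instant readAsMinus7 = interpret(stored, ZoneOffset.ofHours(-7));

        System.out.println(readAsUtc);    // 2013-06-05T17:27:00Z
        System.out.println(readAsMinus7); // 2013-06-06T00:27:00Z
        System.out.println(Duration.between(readAsUtc, readAsMinus7).toHours()); // 7
    }
}
```

The same stored value lands seven hours apart on the timeline depending on who reads it, which is the ambiguity a type system has to either accept (SQL style) or remove by fixing the zone (epoch style).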

On Jun 5, 2013, at 10:27 AM, Jacques Nadeau <[EMAIL PROTECTED]> wrote:

> I did an initial pass at data types.  I've posted it here: http://bit.ly/15JO9bC
> Note that the variable-length fields are incorrect here (which is why they are
> in red).  Will be working on updating.
> I'm using this as a foundation for the types of value vectors we will be
> working on.  Ben should be sending out the value vector design draft
> shortly and it's going to leverage this.
> Jacques