Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Drill, mail # dev - Review Request 18055: DRILL-356 : Support for Date, Time data types

Copy link to this message
Re: Review Request 18055: DRILL-356 : Support for Date, Time data types
Mehant Baid 2014-02-18, 00:30

This is an automatically generated e-mail. To reply, visit:

(Updated Feb. 18, 2014, 12:30 a.m.)
Review request for drill.

Added unit tests.
Added some more common functions like: current_date(), current_time(), current_timestamp()
Incorporated patch for DRILL-364 (uploaded it as a parent diff).
Repository: drill-git

Added value vectors, holders, cast and comparison functions for the following data types.


Internal representation of these data types:
For the data types Date, DateTime, TimeStamp we internally store milliseconds (from epoch) in UTC. We use a 'long' to store the milliseconds. Since TimeStamp has a timezone associated with it, we use another field (int) to store the index of the timezone into a static list of timezones we maintain (in DateUtility class)

For the Time data type we again store milliseconds, but since Time has a small range (00:00:00.000 - 23:59:59.999) we use an 'int' to store the value.

For Interval data type, which can be used to express an interval of time using one or more of the parameters: years, months, days, hours, minutes, seconds, milliseconds we internally store three fields. Months (int), days (int) and milliseconds (long). Using these three fields we can express all the above parameters with simple conversions. Eg: Years (parameter) - Months (internal) is a simple 1 Year = 12 months conversion. Similar conversion can be applied to hours, minutes and seconds to convert them to milliseconds.

Sorting, comparison functions for Date, DateTime, TimeStamp, Time are straight forward as internally they are stored as long or int.

In addition to these types we also have a way to represent each type as a literal. For eg: a Date literal can be expressed using the expression: "datetype(2008, 1, 20)", a TimeStamp literal can be expressed using the expression: "timestamptype(2008, 1, 27, 0, 0, 0, 0, 'UTC')" and so on for all the types.

For performing date functions the design is to use the internal representation to construct a Joda object (MutableDateTime) and use Joda for date arithmetic and other functions. As an example in this patch I've added a date_add() function that adds a Date and an Interval (DateTypeFunctions class)
Diffs (updated)

  common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprLexer.g be2a3f2
  common/src/main/antlr3/org/apache/drill/common/expression/parser/ExprParser.g b5cf292
  common/src/main/java/org/apache/drill/common/expression/FunctionRegistry.java 8ffc07a
  common/src/main/java/org/apache/drill/common/expression/OutputTypeDeterminer.java 66523c4
  common/src/main/java/org/apache/drill/common/expression/fn/CastFunctionDefs.java 6a98f94
  exec/java-exec/pom.xml 1c4dc32
  exec/java-exec/src/main/codegen/config.fmpp cd2b2cc
  exec/java-exec/src/main/codegen/data/Casts.tdd ceb9cde
  exec/java-exec/src/main/codegen/data/DateTypes.tdd PRE-CREATION
  exec/java-exec/src/main/codegen/data/ValueVectorTypes.tdd 88a5974
  exec/java-exec/src/main/codegen/templates/DateCastFunctions.java PRE-CREATION
  exec/java-exec/src/main/codegen/templates/DateFunctions.java PRE-CREATION
  exec/java-exec/src/main/codegen/templates/FixedValueVectors.java c357dd6
  exec/java-exec/src/main/codegen/templates/NullableValueVectors.java 051c62d
  exec/java-exec/src/main/codegen/templates/ValueHolders.java 41dc049
  exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/DateTypeFunctions.java PRE-CREATION
  exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/DateUtility.java PRE-CREATION
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java fd965a7
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/fragment/SimpleParallelizer.java ffb4c5b
  exec/java-exec/src/test/java/org/apache/drill/exec/record/vector/TestDateTypes.java PRE-CREATION
  exec/java-exec/src/test/resources/record/vector/test_all_date_literals.json PRE-CREATION
  exec/java-exec/src/test/resources/record/vector/test_date.json PRE-CREATION
  exec/java-exec/src/test/resources/record/vector/test_date_add.json PRE-CREATION
  exec/java-exec/src/test/resources/record/vector/test_datetime.json PRE-CREATION
  exec/java-exec/src/test/resources/record/vector/test_interval.json PRE-CREATION
  exec/java-exec/src/test/resources/test_simple_date.json PRE-CREATION
  exec/java-exec/src/test/resources/test_simple_interval.json PRE-CREATION
  protocol/src/main/java/org/apache/drill/common/types/TypeProtos.java e4fd94a
  protocol/src/main/java/org/apache/drill/exec/proto/BitControl.java 761f19f
  protocol/src/main/protobuf/BitControl.proto d96f7cf
  protocol/src/main/protobuf/Types.proto 3434110

Diff: https://reviews.apache.org/r/18055/diff/
Testing (updated)

Added unit tests and some manual testing.

Mehant Baid