Drill, mail # dev - Review Request 19782: DRILL-468 Support for FileSystem partitions


Review Request 19782: DRILL-468 Support for FileSystem partitions
Steven Phillips 2014-03-28, 09:02

This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19782/

Review request for drill.
Repository: drill-git
Description

For filesystem partitioning, we want to use the existing directory structure of the data. If a selection is a directory that contains subdirectories, the name of the subdirectory a given record's file was stored in can be included as a field in that record. For example, given this structure:
/data
  /a
    file.csv
  /b
    file.csv
select * from dfs.`/data`
will include a column named dir0, with possible values a and b. This can be extended to a hierarchy of partitions. For example,
/data
  /a
    /1
      file.csv
    /2
      file.csv
  /b
    file.csv
would have columns dir0 (with possible values a and b) and dir1 (with possible values 1, 2 and null).
The data type will always be VARCHAR for the partition columns.
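
As a rough illustration of the mapping described above, the following Java sketch derives the dirN values for one file from its path relative to the selection root. It is not the code from this patch: the class name PartitionColumnsSketch, the method partitionColumns, and the maxDepth parameter (the deepest directory level found in the selection) are all hypothetical, and paths are treated as plain '/'-separated strings rather than Hadoop Path objects.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and signatures are assumptions, not Drill's API.
class PartitionColumnsSketch {

    // Returns dir0..dir(maxDepth-1) for a file under selectionRoot.
    // Levels the file does not reach are mapped to null; all values are strings,
    // matching the VARCHAR type stated in the description.
    static Map<String, String> partitionColumns(String selectionRoot, String filePath, int maxDepth) {
        List<String> root = segments(selectionRoot);
        List<String> file = segments(filePath);
        if (file.size() <= root.size() || !file.subList(0, root.size()).equals(root)) {
            throw new IllegalArgumentException(filePath + " is not a file under " + selectionRoot);
        }
        // Directories strictly between the selection root and the file name.
        List<String> dirs = file.subList(root.size(), file.size() - 1);
        Map<String, String> columns = new LinkedHashMap<>();
        for (int i = 0; i < maxDepth; i++) {
            columns.put("dir" + i, i < dirs.size() ? dirs.get(i) : null);
        }
        return columns;
    }

    // Splits a '/'-separated path into its non-empty segments.
    static List<String> segments(String path) {
        List<String> out = new ArrayList<>();
        for (String s : path.split("/")) {
            if (!s.isEmpty()) {
                out.add(s);
            }
        }
        return out;
    }
}
```

With the second layout above and maxDepth set to 2, /data/a/1/file.csv would map to dir0 = a and dir1 = 1, while /data/b/file.csv would map to dir0 = b and dir1 = null.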
Diffs

  exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java bcc113f
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java 24ea9c4
  exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/DrillPathFilter.java PRE-CREATION
  exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSelection.java 5ab2c1a
  exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java c69edb7
  exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/easy/EasyFormatPlugin.java d7949c3
  exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/easy/EasyGroupScan.java a7f556e
  exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/easy/EasySubScan.java 72d1fe6
  exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/shim/fallback/FallbackFileSystem.java 5743ca1
  exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java d327b77
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetFormatPlugin.java bfaaa45
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetGroupScan.java f76e59a
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRowGroupScan.java 0e672d0
  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetScanBatchCreator.java 17e7da2
  exec/java-exec/src/main/resources/drill-module.conf a929a69
  exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java 7895897
  exec/java-exec/src/test/java/org/apache/drill/exec/store/text/TextRecordReaderTest.java PRE-CREATION
  exec/java-exec/src/test/resources/storage-engines.json 6e4d23e
  exec/java-exec/src/test/resources/store/text/data/d1/regions.csv PRE-CREATION
  exec/java-exec/src/test/resources/store/text/data/regions.csv PRE-CREATION
  exec/java-exec/src/test/resources/store/text/regions.csv PRE-CREATION

Diff: https://reviews.apache.org/r/19782/diff/
Testing

Added tests to TestExampleQueries and TextRecordReaderTest.
Thanks,

Steven Phillips