Hive, mail # dev - Review Request 14243: HIVE-5325: Implement statistics providing ORC writer and reader interfaces


Re: Review Request 14243: HIVE-5325: Implement statistics providing ORC writer and reader interfaces
Ashutosh Chauhan 2013-09-30, 16:16

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14243/#review26490
-----------------------------------------------------------

ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java
<https://reviews.apache.org/r/14243/#comment51676>

    Can you add a comment explaining when this if condition is true and when it is false? It isn't obvious.

ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java
<https://reviews.apache.org/r/14243/#comment51672>

    Log a message here saying the category is unknown.
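
    A minimal sketch of what such logging could look like. The Category enum, the handle method, and the use of java.util.logging are all assumptions made for illustration; they are not taken from WriterImpl.java.

```java
import java.util.logging.Logger;

class UnknownCategorySketch {
    private static final Logger LOG =
        Logger.getLogger(UnknownCategorySketch.class.getName());

    // Hypothetical stand-in for the type categories the writer switches over.
    enum Category { PRIMITIVE, LIST, MAP, STRUCT, OTHER }

    // Returns true when the category is handled; otherwise logs a warning
    // and returns false instead of falling through silently.
    static boolean handle(Category category) {
        switch (category) {
            case PRIMITIVE:
            case LIST:
            case MAP:
            case STRUCT:
                return true;
            default:
                LOG.warning("Unknown category: " + category);
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(Category.PRIMITIVE));
        System.out.println(handle(Category.OTHER));
    }
}
```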

ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java
<https://reviews.apache.org/r/14243/#comment51671>

    What is more useful is how much memory these objects will take when loaded back into memory (e.g., while doing map-joins). What you have is how much memory they take up while the ORC writer has them buffered. So what we want is numVals * sizeof(boolean).
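
    A rough sketch of that estimate, assuming a fixed per-value size in bytes. The class name, the method name, and the constants below are hypothetical and are not the values defined in Hive's JavaDataModel.

```java
class RawDataSizeSketch {
    // Assumed in-memory sizes in bytes per value; illustrative only.
    static final long BOOLEAN_BYTES = 1L;
    static final long LONG_BYTES = 8L;

    // Estimated size of a fixed-width column once loaded into memory:
    // number of values times the size of one value.
    static long rawDataSize(long numVals, long bytesPerValue) {
        if (numVals < 0 || bytesPerValue < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        // multiplyExact throws ArithmeticException on overflow instead of wrapping.
        return Math.multiplyExact(numVals, bytesPerValue);
    }

    public static void main(String[] args) {
        System.out.println(rawDataSize(1_000_000L, BOOLEAN_BYTES));
    }
}
```

    The point is that the estimate depends only on the value count and the type, not on how many bytes the writer happens to hold in its buffers.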

ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java
<https://reviews.apache.org/r/14243/#comment51673>

    Log a message here too.

ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
<https://reviews.apache.org/r/14243/#comment51674>

    Class JavaDataModel has almost identical info. Consider using that.

ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
<https://reviews.apache.org/r/14243/#comment51670>

    We already have a class called JavaDataModel; consider using that.

ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java
<https://reviews.apache.org/r/14243/#comment51675>

    You have removed this test. Is it covered by the new one?

- Ashutosh Chauhan

On Sept. 24, 2013, 10:18 p.m., Prasanth_J wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14243/
> -----------------------------------------------------------
>
> (Updated Sept. 24, 2013, 10:18 p.m.)
>
>
> Review request for hive, Ashutosh Chauhan and Owen O'Malley.
>
>
> Bugs: HIVE-5325
>     https://issues.apache.org/jira/browse/HIVE-5325
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> HIVE-5324 adds new interfaces that can be implemented by the ORC reader/writer to provide statistics. Writer-provided statistics are used to update table/partition-level statistics in the metastore. Reader-provided statistics can be used for reducer estimation, CBO, etc. in the absence of metastore statistics.
>
>
> Diffs
> -----
>
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/BinaryColumnStatistics.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/ColumnStatisticsImpl.java 6268617
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java c80fb02
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java c454f32
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/StringColumnStatistics.java 72e779a
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java 44961ce
>   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java PRE-CREATION
>   ql/src/protobuf/org/apache/hadoop/hive/ql/io/orc/orc_proto.proto edbf822
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 34b2305
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java e6569f4
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcNullOptimization.java b93db84
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcSerDeStats.java PRE-CREATION
>   ql/src/test/resources/orc-file-dump-dictionary-threshold.out 003c132
>   ql/src/test/resources/orc-file-dump.out fac5326
>
> Diff: https://reviews.apache.org/r/14243/diff/
>
>
> Testing
> -------
>
> ORC-related unit and qfile tests are passing.
>
>
> Thanks,
>
> Prasanth_J
>
>