Hive >> mail # dev >> Review Request 18459: FS based stats.


Ashutosh Chauhan 2014-02-26, 16:37
Re: Review Request 18459: FS based stats.

This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18459/#review35533
Updated the patch with Gunther's feedback.
trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
<https://reviews.apache.org/r/18459/#comment66124>

    I didn't update the template file since, as Lefty pointed out, it soon won't be required anymore.
    Updating the test hive-site may cause failures in existing test cases that are written for JDBC stats collection (for example, key hashing). Each of those tests would need to be examined and then updated, and given the rate at which the patch queue is moving, that would delay this patch indefinitely.

trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java
<https://reviews.apache.org/r/18459/#comment66125>

    Good point. Instead of the task attempt ID, I have switched to using the partition # in filenames. Since the partition # is tied to the task # and is independent of the attempt #, it's guaranteed to give the same filename for a task across different attempts. Further, I have changed create() to use the overwrite option, so if a task is running its second attempt and produces the same filename, the file is simply overwritten. See the changes in FSStatsPublisher.
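
    The idea can be illustrated with a minimal, self-contained sketch. This is not the code from the patch: the class name, the "stats-part-" filename prefix, and the helper methods are hypothetical, and java.nio stands in for the Hadoop FileSystem API (where the equivalent call would be FileSystem.create(Path, boolean overwrite) with overwrite set to true). It only shows that a filename derived from the partition number is stable across attempts, and that creating with overwrite leaves exactly one file holding the latest attempt's stats.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class StatsFileSketch {

    // Hypothetical naming scheme: the name depends only on the partition
    // number, never on the attempt number, so every attempt of the same
    // task maps to the same file.
    static String statsFileName(int partitionNumber) {
        return "stats-part-" + partitionNumber;
    }

    // Writes the stats for one task. CREATE + TRUNCATE_EXISTING mimics
    // "create with overwrite": a file left behind by an earlier attempt
    // is replaced rather than causing an error or a duplicate.
    static Path publish(Path statsDir, int partitionNumber, String stats) throws IOException {
        Path file = statsDir.resolve(statsFileName(partitionNumber));
        Files.write(file, stats.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE);
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("fs-stats-sketch");

        // First attempt of the task for partition 3 (assume it later fails).
        Path first = publish(dir, 3, "numRows=10");
        // Second attempt of the same task writes to the same file.
        Path second = publish(dir, 3, "numRows=42");

        System.out.println(first.equals(second)); // prints "true"
        System.out.println(new String(Files.readAllBytes(second), StandardCharsets.UTF_8)); // prints "numRows=42"
    }
}
```

    With attempt-based names, the aggregator could see one file per attempt and double count; with partition-based names plus overwrite, it sees one file per task.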

trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsPublisher.java
<https://reviews.apache.org/r/18459/#comment66126>

    Updated logging to a higher level in various places.
- Ashutosh Chauhan