Hive, mail # dev - Review Request 18459: FS based stats.


Re: Review Request 18459: FS based stats.
Ashutosh Chauhan 2014-02-26, 16:55

This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18459/#review35533
Updated the patch with Gunther's feedback.
trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
<https://reviews.apache.org/r/18459/#comment66124>

    I didn't update the template file since, as Lefty pointed out, it soon won't be required anymore.
    Updating the test hive-site may cause failures in existing test cases that were written for JDBC stats collection (for example, those involving key hashing). Each of those tests would need to be examined and then updated, and given the rate at which the patch queue is moving, that would delay this patch indefinitely.

trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java
<https://reviews.apache.org/r/18459/#comment66125>

    Good point. Instead of the task attempt id, I have switched to using the partition # in filenames. Since the partition # is tied to the task # and is independent of the attempt #, it's guaranteed to give the same filename for a task across different attempts. Further, I now call create() with the overwrite option, so if a task is running its second attempt and produces the same filename, the file is simply overwritten. See the changes in FSStatsPublisher.
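
    To illustrate the idea (this is not the code from the patch), here is a minimal sketch that uses java.nio on a local temp directory as a stand-in for the Hadoop FileSystem API. The class name, the "stats-part-" file name prefix, and the publish() helper are all hypothetical; the only points it demonstrates are that the file name depends on the partition number alone and that a retry overwrites the earlier file rather than adding a second one.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class StatsFileSketch {

    // Hypothetical naming scheme: the name is derived from the partition
    // number only, so every attempt of the same task maps to the same file.
    static String statsFileName(int partitionNum) {
        return "stats-part-" + partitionNum;
    }

    // Writes the stats for one partition. The attempt number is accepted
    // only to show that it plays no part in choosing the file name.
    static Path publish(Path dir, int partitionNum, int attempt, String stats)
            throws IOException {
        Path file = dir.resolve(statsFileName(partitionNum));
        // CREATE + TRUNCATE_EXISTING acts as "create with overwrite":
        // a later attempt replaces whatever an earlier attempt wrote.
        Files.write(file, stats.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE);
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("fs-stats-sketch");
        Path first = publish(dir, 3, 0, "numRows=10");   // first attempt
        Path second = publish(dir, 3, 1, "numRows=42");  // retry of the same task
        System.out.println("same file: " + first.equals(second));
        System.out.println("contents: "
                + new String(Files.readAllBytes(second), StandardCharsets.UTF_8));
    }
}
```

    With this scheme the aggregator sees at most one file per partition, holding the output of whichever attempt wrote last, so stats from a failed attempt are not counted twice.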

trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsPublisher.java
<https://reviews.apache.org/r/18459/#comment66126>

    Updated logging to a higher level in various places.
- Ashutosh Chauhan
On Feb. 26, 2014, 4:37 p.m., Ashutosh Chauhan wrote: