Log aggregation is great. However, if a YARN application runs a large number of tasks that generate large logs, it takes some time for all of the logs to be collected and written to HDFS.
Currently our client code runs the equivalent of the "yarn logs" command once all tasks have completed. This works fine provided log aggregation is complete.
But it fails in a variety of ways if aggregation is not complete. This includes one case where the "yarn logs" code encounters no exceptions and no non-zero return codes from methods, but returns an empty string.
So, is there a way to determine if log aggregation is complete?
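Until there is a direct way to ask whether aggregation has finished, one client-side workaround is to retry the log fetch with a backoff until it returns something or a deadline passes. The following is only a minimal sketch of that idea in Python, not an official YARN mechanism. The helper names (fetch_logs_with_retry, yarn_logs), the timeout and the delays are all assumptions made for illustration, and shelling out to "yarn logs -applicationId" is just one possible way to do the fetch. Note that non-empty output does not prove aggregation is complete; it only covers the empty-result and error cases described above.

```python
import subprocess
import time
from typing import Callable, Optional


def yarn_logs(app_id: str) -> str:
    """Hypothetical fetcher: run `yarn logs -applicationId <id>` and return stdout.

    A non-zero exit code is treated the same as "no logs yet".
    """
    result = subprocess.run(
        ["yarn", "logs", "-applicationId", app_id],
        capture_output=True,
        text=True,
    )
    return result.stdout if result.returncode == 0 else ""


def fetch_logs_with_retry(
    fetch_logs: Callable[[], str],
    timeout_s: float = 300.0,        # assumed overall deadline
    initial_delay_s: float = 2.0,    # assumed first wait between attempts
    max_delay_s: float = 30.0,       # assumed cap on the backoff
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Call fetch_logs until it returns non-blank text or the deadline passes.

    Returns the log text, or None if nothing usable arrived in time.
    """
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        try:
            output = fetch_logs()
        except Exception:
            # Treat any failure as "aggregation not finished yet" and retry.
            output = ""
        if output.strip():
            return output
        remaining = deadline - clock()
        if remaining <= 0:
            return None
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay_s)  # exponential backoff, capped
```

A caller could use it as fetch_logs_with_retry(lambda: yarn_logs(app_id)), where app_id is the application ID string, and treat a None result as "logs not available yet" rather than as an empty log.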
Thanks for letting us know this issue has been recognized.
On Mar 10, 2014, at 12:09 PM, Zhijie Shen <[EMAIL PROTECTED]> wrote: