> On Sep 8, 2017, at 9:25 AM, Jian He <[EMAIL PROTECTED]> wrote:

Somewhat. Greatly improved, but there’s still way too much “we’re working on this” and “here’s a link to a JIRA” and just general brokenness going on.

Here are some examples from concepts.  Concepts!  The document I’d expect to give me very basic “when we talk about X, we mean Y” definitions:

“A host of scheduling features are being developed to support long running services.”

Yeah, ok?  How is this a concept?


“[YARN-3998](https://issues.apache.org/jira/browse/YARN-3998) implements a retry-policy to let NM re-launch a service container when it fails.”
The patch itself went through nine revisions and a long discussion. Would an end user care about the details in that JIRA?  

If the answer to the last question is YES, then the documentation has failed.  The whole point of documentation is that users don’t have to go digging into the details of the implementation, the decision process that got us there, etc.  If they care enough about the details, they’ll run through the changelog and click on the JIRA link there.  If the summary line of the changelog isn’t obvious, well… then we need better summaries.

etc, etc.


The sleep example is nice.  Now, let’s see a non-toy example:  multiple instances of Apache httpd or MariaDB or something real and not from the Hadoop echo chamber (e.g., non-JVM-based).  If this is for “native” services, this shouldn’t be a problem, right?  Give a real example and users will buy what you’re selling.  I also think writing the docs and providing an example of doing something big and outside the team’s comfort zone will clarify where end users are going to need more help than what’s being provided.  Getting a MariaDB instance or three up will help tremendously here.

Which reminds me: something the documentation doesn’t cover is storage. What happens to it, where does it come from, etc, etc.  That’s an important detail that I didn’t see covered.  (I may have missed it.)  

Why are there directions to enable other, partially unrelated services in here?  Shouldn’t there be pointers to their specific documentation?  Is the expectation that, if the requirements for those other services change, contributors will need to update multiple documents?

“Start the DNS server”

Just… yikes.

a) yarn classname … This is not how we do user-facing things. The fact that a *daemon* can’t really be put in the YarnCommands.md doc should be a giant red flag that something isn’t going correctly here.
b) no jsvc support for something that it’s strongly hinted at wanting to run privileged = an instant -1 for failing basic security practices.  There’s zero reason for it to be running continually as root.
c) If this had been hooked into the shell scripts appropriately, logs, user switching, etc. would have come for free.
d) Where’s stop?  Right. Since it’s outside the scripts, there is no pid support so one has to do all of that manually….

“3. Supports reverse lookups (name based on IP). Note, this works only for Docker containers.”


“It should not be used as a fully-functional corporate DNS.”

Scratch corporate.  It’s not a fully functional DNS server if it can only do reverse lookups for Docker containers.  (Which, ironically, means it’s not suitable for use with Apache Hadoop, given it requires both fwd and rev DNS ...)
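
For anyone who wants to see what “both fwd and rev” means in practice, here is a rough sketch of the kind of check I have in mind. The function name and the injectable resolver parameters are my own invention for illustration; the only real calls are `socket.gethostbyname` and `socket.gethostbyaddr` from the Python standard library. It says nothing about how the registry DNS itself is implemented.

```python
import socket


def forward_and_reverse_match(hostname,
                              forward=socket.gethostbyname,
                              reverse=socket.gethostbyaddr):
    """Return True if hostname -> IP -> name round-trips to the same host.

    `forward` and `reverse` default to the standard resolver calls; they are
    parameters only so the check can be exercised without a live DNS server.
    """
    try:
        ip = forward(hostname)                 # forward lookup: name -> IP
        name, aliases, _addrs = reverse(ip)    # reverse lookup: IP -> name(s)
    except OSError:
        # socket.gaierror and socket.herror are both OSError subclasses,
        # so a failed lookup in either direction lands here.
        return False

    def normalize(n):
        # DNS names are case-insensitive; ignore a trailing root dot.
        return n.lower().rstrip(".")

    candidates = {normalize(name)} | {normalize(a) for a in aliases}
    return normalize(hostname) in candidates
```

Point it at a container hostname and at a non-Docker host: if the second one fails the round trip, that is exactly the gap I am complaining about.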
