Detailed Documentation Checklist

What should your project's documentation include?

All teams must document the behavior of their services when things start going wrong.

At minimum, that should include:

  • Failover
    • For each point of failure, describe the process to restore service
  • Environments and their usage (qa/staging/prod)
    • Where each one runs: in the cloud, on a third-party platform, or on a laptop
  • Service dependencies
    • For example: graphql-api, Discord for logins, etc.
  • At least basic runbooks for troubleshooting
  • How to find out who's on-call
  • How to notify on-call / page them
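
One way to keep the minimum list above from drifting is to check documentation for it automatically. The following is a minimal sketch, not an established tool: the keyword list and the function name `missing_sections` are assumptions made for illustration, and the check is a coarse substring match that only confirms a topic is mentioned, not that it is covered well.

```python
from typing import List

# Hypothetical keywords derived from the minimum checklist above.
# Adjust them to match the headings your team actually uses.
REQUIRED_TOPICS = [
    "failover",
    "environments",
    "dependencies",
    "runbook",
    "on-call",
]


def missing_sections(doc_text: str) -> List[str]:
    """Return the required topics that never appear in the document text."""
    lowered = doc_text.lower()
    return [topic for topic in REQUIRED_TOPICS if topic not in lowered]


if __name__ == "__main__":
    sample = "Failover\nEnvironments\nRunbook\n"
    # Lists the topics the sample document does not mention.
    print(missing_sections(sample))
```

A check like this could run in CI against each service's documentation file, failing the build when the returned list is non-empty.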

Beyond that, here are some topics to consider:

  • A detailed description of the service. Include enough detail to make the potential points of failure clear.
  • Where the code is
  • What components it has (and where to find them)
  • How a deploy works
    • Including the CI/CD pipeline
    • Including any autoscaling behavior
  • Where monitoring/metrics are
  • Where logs are and how to search them