
    Goal: Reduce time required for maintenance.

    Targets:

    • Infrastructure changes are rolled out automatically...
      • for VMs
      • for physical hosts
    • Monitoring makes sure OPS gets notified if a
      • host has problems (storage, memory, load) or is not available (SSH, ping)
      • service is misbehaving (port, metrics)
    • We have remote access to all important machines, including during boot.
      • Automatic HDD unlock with secure boot
      • Remote management option on important hosts
    • Commits are tested automatically to reduce risk of breaking things.
    • There is a disaster recovery path for our services that are
      • consumer-facing (cloud, vault)
      • OPS-critical (backups, VPN)
      • developer-facing (git, CI)
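
To illustrate the availability checks named in the monitoring target (SSH, port), here is a minimal sketch in Python. It assumes that a successful TCP connection is an acceptable signal that a host or service is up. The function names and the (name, host, port) target format are hypothetical and not part of this milestone, and notifying OPS is left out.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def failed_checks(targets, timeout: float = 3.0):
    """Return the (name, host, port) entries whose port is not reachable.

    The target format is an assumption made for this sketch.
    """
    return [t for t in targets if not port_is_open(t[1], t[2], timeout)]
```

A call such as failed_checks([("ssh", "host.example", 22)]) would return the entries OPS should be notified about. Checks for ping, storage, memory, load, and metrics would need separate probes.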