Incident: Several internal services became undiscoverable as part of a routine rollback operation. Many customer-facing services that depend on these were unable to serve traffic as a result.
Impact: For 19 minutes from 2023-05-10 01:52-02:11 UTC, almost all API calls timed out. New pages in the web app failed to load, but most existing pages continued to function.
Moving forward: We have fixed the bug in our rollback process which caused the outage.