Incident: A bug in our cleanup scripts caused some active container repositories to become unavailable. At approximately 13:33 UTC, a memory problem caused the servers that handle pageload (the initial load of our web application) to crash. Because of the bug in the cleanup script, they were unable to restart without operator intervention. Customers were unable to start new sessions for about 28 minutes, but existing sessions were unaffected.
Impact: Asana was largely unavailable to new sessions for 28 minutes on Asana Desktop. Existing sessions, mobile and API were unaffected.
Moving Forward: We will conduct a 5 Whys analysis to identify process changes that can prevent similar incidents in the future. We will also work on detecting this type of issue before it impacts customer traffic.
Our metric considers a weighted average of uptime experienced by users at each data center. The number of minutes of downtime shown reflects this weighted average.
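As a rough illustration of how such a weighted average could be computed, here is a minimal sketch. It assumes each data center is weighted by its share of users; the weighting basis, the function name, and the sample figures are all hypothetical and are not taken from our actual metric.

```python
def weighted_downtime_minutes(per_data_center):
    """Return the weighted average downtime in minutes.

    per_data_center: list of (weight, downtime_minutes) pairs, one per
    data center. The weight is assumed here to be the relative number of
    users served by that data center (an assumption for illustration).
    """
    total_weight = sum(weight for weight, _ in per_data_center)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weight * minutes for weight, minutes in per_data_center)
    return weighted_sum / total_weight


# Hypothetical figures: one data center carrying half of the users is down
# for 28 minutes, while two smaller data centers see no downtime.
example = [(2, 28.0), (1, 0.0), (1, 0.0)]
reported = weighted_downtime_minutes(example)
print(reported)  # 14.0
```

Under these made-up inputs the reported figure is lower than the outage experienced at the affected data center, which is why a reported downtime number can differ from the duration of the incident itself.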