Incident: Asana pushed a code change that misconfigured our database connection pools. During a push, the connection pools reached their limits, which saturated our databases and prevented requests from completing.
Impact: Asana was inaccessible to users in all regions for 29 minutes. Many users experienced delays in loading the app for an additional 7 minutes. No customer data was lost.
Moving forward: Having fixed the original misconfiguration, we are reviewing and updating other database connection limits. We will also improve dashboards and monitoring to reduce time to resolution for database connection issues.
Our metric considers a weighted average of uptime experienced by users at each data center. The number of minutes of downtime shown reflects this weighted average.
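As a rough illustration of how a weighted average like this can be computed, here is a minimal sketch. It assumes each data center is weighted by its share of users, which is our assumption and is not stated in the report. The function name, data center names, and numbers are hypothetical and are not figures from this incident.

```python
def weighted_downtime_minutes(downtime_by_dc, weight_by_dc):
    """Return the weighted average of downtime minutes across data centers.

    downtime_by_dc: minutes of downtime observed at each data center.
    weight_by_dc:   relative weight of each data center (assumed here to be
                    its share of users; the real weighting is not specified).
    """
    total_weight = sum(weight_by_dc.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(
        downtime_by_dc[dc] * weight for dc, weight in weight_by_dc.items()
    )
    return weighted_sum / total_weight


# Hypothetical example: two data centers with different downtime and weights.
example_downtime = {"dc-a": 30, "dc-b": 10}  # minutes, made up
example_weights = {"dc-a": 3, "dc-b": 1}     # relative user share, made up
print(weighted_downtime_minutes(example_downtime, example_weights))
```

Under this weighting, an outage that affects every data center for the same duration yields that duration unchanged, while an outage limited to a smaller data center contributes proportionally fewer minutes to the reported figure.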