Incident: Around 15:50 UTC, a caching system entered an overload state due to network connection-tracking saturation within our hosting environment. Connections from servers to this cache timed out, and the resulting request queuing created a cascading failure. Reducing load via application changes allowed recovery, and we then made configuration changes to provide additional capacity.
Impact: Asana was unavailable in all regions for up to 16 minutes, followed by a partial outage for an additional 6 minutes.
Moving forward: We've updated the configuration to prevent this overload, and we are adding monitoring to detect this type of saturation before it causes failures.
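A saturation check of this kind can be sketched as follows. This is a minimal illustration only, assuming conntrack-style counters (current entries vs. a table maximum); the threshold and counter names are hypothetical, not Asana's actual monitoring configuration.

```python
# Sketch of a connection-tracking saturation alert.
# current_entries / max_entries approximates how full the tracking table is;
# alerting below 100% gives time to act before new connections start failing.
# The 0.8 threshold is an illustrative choice, not a documented value.

def saturation_ratio(current_entries: int, max_entries: int) -> float:
    """Fraction of the connection-tracking table currently in use."""
    return current_entries / max_entries

def should_alert(current_entries: int, max_entries: int,
                 threshold: float = 0.8) -> bool:
    """Fire an alert once the table crosses the saturation threshold."""
    return saturation_ratio(current_entries, max_entries) >= threshold
```

In practice these counters would be scraped from the host (for example, Linux exposes them under `/proc/sys/net/netfilter/`) and fed to an alerting system.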
Our availability metric is a weighted average of the uptime experienced by users at each data center; the downtime minutes reported above reflect this weighted average.
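The weighted-average calculation can be illustrated with a small sketch. The data-center names, user counts, and outage durations below are hypothetical, chosen only to show how per-data-center downtime is weighted by user population.

```python
# Illustrative weighted-average downtime computation.
# Each data center's downtime (in minutes) is weighted by the number of
# users served there, so the reported figure reflects user experience
# rather than a simple per-site average.

def weighted_downtime(outage_minutes: dict[str, float],
                      users: dict[str, int]) -> float:
    """User-weighted average of downtime minutes across data centers."""
    total_users = sum(users.values())
    return sum(outage_minutes[dc] * users[dc] for dc in outage_minutes) / total_users

minutes = weighted_downtime(
    {"dc-a": 22.0, "dc-b": 16.0},  # hypothetical outage durations
    {"dc-a": 3, "dc-b": 1},        # hypothetical relative user counts
)
# With three times as many users in dc-a, the result is pulled toward 22.
```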