Incident Response Guide

1. Detection & reporting

Monitor alerts from observability tools
Accept reports from customer support, users, or team members
Document the initial report with timestamp and symptoms
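
As an illustration of documenting the initial report, here is a minimal sketch of a report record that captures a timestamp and symptoms. The class name IncidentReport, its fields, and the source labels are hypothetical, not part of any existing tool; adapt them to whatever tracker the team already uses.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class IncidentReport:
    """Initial report captured at detection time (hypothetical schema)."""

    source: str          # assumed labels: "alert", "customer_support", "user", "team_member"
    symptoms: list[str]  # observed symptoms, in the reporter's own words
    # Default to the current time in UTC so reports from different time zones line up.
    reported_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def summary(self) -> str:
        # One-line summary suitable for pasting into an incident channel.
        stamp = self.reported_at.strftime("%Y-%m-%d %H:%M UTC")
        return f"[{stamp}] via {self.source}: " + "; ".join(self.symptoms)
```

Storing the timestamp in UTC is a design choice made here so that later timeline reconstruction does not depend on the reporter's local time zone.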

2. Initial assessment

Verify the incident is real and not a false alarm
Determine severity level
Identify which systems and users are affected
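
The severity decision can be made repeatable by writing the criteria down as a small function. The sketch below is only an example: the level names (SEV1 to SEV3), the inputs, and the 50% and 5% thresholds are assumptions chosen for illustration, and this guide does not define them.

```python
from enum import Enum


class Severity(Enum):
    # Hypothetical level names; substitute the organization's own scale.
    SEV1 = "critical"
    SEV2 = "major"
    SEV3 = "minor"


def assess_severity(core_system_down: bool, affected_user_fraction: float) -> Severity:
    """Map impact to a severity level using illustrative thresholds."""
    if not 0.0 <= affected_user_fraction <= 1.0:
        raise ValueError("affected_user_fraction must be between 0 and 1")
    # Assumed rule: a core outage or at least half of users affected is critical.
    if core_system_down or affected_user_fraction >= 0.5:
        return Severity.SEV1
    # Assumed rule: at least 5% of users affected is major.
    if affected_user_fraction >= 0.05:
        return Severity.SEV2
    return Severity.SEV3
```

For example, assess_severity(False, 0.1) returns Severity.SEV2 under these assumed thresholds.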

3. Mobilize the team

Incident Commander: Coordinates response, makes decisions, communicates status

Technical Lead: Directs investigation and remediation efforts

Communications Lead: Handles internal and external communications

Support Lead: Interfaces with affected users and support team

4. Investigation & mitigation

Gather logs, metrics, and other diagnostic information
Identify root cause or implement temporary workaround
Test and deploy fix
Monitor for stability
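
One way to make "monitor for stability" concrete is to require a run of consecutive healthy samples before treating the fix as holding. The following is a minimal sketch under stated assumptions: the function name is_stable, the use of error rate as the health signal, and the default threshold and sample count are all hypothetical.

```python
from typing import Iterable


def is_stable(
    error_rates: Iterable[float],
    threshold: float = 0.01,    # assumed acceptable error rate (1%)
    required_healthy: int = 5,  # assumed number of consecutive healthy samples
) -> bool:
    """Return True once `required_healthy` consecutive samples are at or below `threshold`."""
    streak = 0
    for rate in error_rates:
        # Any unhealthy sample resets the streak, so a brief relapse restarts the count.
        streak = streak + 1 if rate <= threshold else 0
        if streak >= required_healthy:
            return True
    return False
```

Requiring consecutive samples rather than an average avoids declaring stability while the error rate is still intermittently spiking.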

5. Resolution & verification

Confirm all systems are operating normally