An incident is triggered within a service and is the event that sets off an alert to the on-call user.
Here are our best practice tips for responding to incidents:
Agree on when incidents should be acknowledged and resolved
To make your incident response metrics meaningful (for example, mean time to acknowledge and mean time to resolve), talk to your team and agree on when incidents should be acknowledged and when they should be resolved.
Generally, acknowledgements should happen when a user is taking ownership and investigating an incident. Resolutions should happen when the incident itself has been fully resolved.
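To see why a shared definition matters, here is a minimal sketch of how the two metrics could be computed from incident timestamps. The records and the function names (`mtta`, `mttr`) are hypothetical and for illustration only, and resolution time is measured from the trigger here, which is an assumption. The numbers are only comparable if everyone acknowledges and resolves at the same point in their workflow.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records: (triggered, acknowledged, resolved) timestamps.
incidents = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 4), datetime(2024, 1, 1, 9, 40)),
    (datetime(2024, 1, 2, 14, 0), datetime(2024, 1, 2, 14, 10), datetime(2024, 1, 2, 15, 20)),
]


def mean_minutes(durations):
    """Average a sequence of timedeltas and return the result in minutes."""
    return mean(d.total_seconds() for d in durations) / 60


def mtta(records):
    """Mean time to acknowledge: trigger -> acknowledgement."""
    return mean_minutes(ack - trig for trig, ack, _ in records)


def mttr(records):
    """Mean time to resolve: trigger -> resolution (assumed definition)."""
    return mean_minutes(res - trig for trig, _, res in records)


print(f"MTTA: {mtta(incidents):.1f} min")  # MTTA: 7.0 min
print(f"MTTR: {mttr(incidents):.1f} min")  # MTTR: 60.0 min
```

If one responder acknowledges immediately on receiving the alert and another only once they start investigating, the same incidents would produce very different MTTA values.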
Stop incidents from bothering you and your teammates again by snoozing them
If you've acknowledged an incident and know that it will take a while to resolve, snooze it.
Snoozing extends an incident's acknowledgement period to a maximum of 24 hours. Use it when you know your initial acknowledgement will time out too soon, or when you don't want yourself or your teammates to be re-notified about an incident you are already working on.
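As a rough illustration of the 24-hour limit, the sketch below computes when a snoozed acknowledgement would time out. The function name `snooze_deadline` and its validation rules are assumptions made for this example, not the product's actual implementation or API.

```python
from datetime import datetime, timedelta

MAX_SNOOZE = timedelta(hours=24)  # upper bound stated in the text above


def snooze_deadline(now, duration):
    """Return when the acknowledgement would time out if snoozed at `now`.

    Hypothetical helper: rejects non-positive durations and anything
    longer than the 24-hour maximum.
    """
    if duration <= timedelta(0):
        raise ValueError("snooze duration must be positive")
    if duration > MAX_SNOOZE:
        raise ValueError("snooze duration cannot exceed 24 hours")
    return now + duration


# Snoozing for 4 hours at 09:00 pushes the timeout to 13:00 the same day.
print(snooze_deadline(datetime(2024, 1, 1, 9, 0), timedelta(hours=4)))
```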
Know when to reassign an incident
If you're assigned an incident that needs the attention of someone else because, for example:
- you're already working on a separate issue and aren't able to immediately look into the incident that you are assigned
- you need the expertise and help of someone else to investigate the incident
then reassign the incident to another level of your escalation policy, to a single person from your (or another) team, or to a completely different escalation policy/team.
You can escalate an incident to the next on-call level directly from a phone call or SMS alert. From the web and mobile apps, you can both reassign and escalate an incident.