Escalation policies determine who is notified when an incident is triggered and what should happen if nobody responds to that incident.
We recommend keeping the following items in mind to make your escalation policies more meaningful:
Check for schedule gaps
If the schedules you add to an escalation policy have gaps, make sure that somebody is on-call at every time you need an incident to trigger.
Incidents are not created when nobody is on-call in an escalation policy; an incident can only be created if somebody is on-call on at least one level of that policy.
If you have an escalation policy with 3 levels, each containing a schedule where nobody is on-call between 6pm and 9am, then an event that comes in at 10pm will not trigger an incident and nobody will be notified.
If you have an escalation policy with 3 levels, where the first 2 levels contain schedules with nobody on-call between 6pm and 9am and the last level has somebody on-call all the time, then an event that comes in at 10pm will trigger an incident that is assigned immediately to the level 3 on-call.
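The two scenarios above can be sketched in a few lines of Python. This is only an illustrative model of the rule described here, not PagerDuty's implementation: the on-call windows, the `first_covered_level` helper, and the hour-based representation are all assumptions made for the example.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical model: each level is a list of (start_hour, end_hour) on-call
# windows in 24-hour time, with the end hour exclusive.
Window = Tuple[int, int]


def is_covered(windows: Sequence[Window], hour: int) -> bool:
    """Return True if somebody is on-call for this level at the given hour."""
    return any(start <= hour < end for start, end in windows)


def first_covered_level(levels: Sequence[Sequence[Window]], hour: int) -> Optional[int]:
    """Return the 1-based number of the first level with somebody on-call.

    Returns None when every level has a gap, which models the case where
    no incident is created.
    """
    for number, windows in enumerate(levels, start=1):
        if is_covered(windows, hour):
            return number
    return None


business_hours = [(9, 18)]  # nobody on-call between 6pm and 9am
always_on = [(0, 24)]       # somebody on-call all the time

all_gaps = [business_hours, business_hours, business_hours]
with_backstop = [business_hours, business_hours, always_on]

print(first_covered_level(all_gaps, 22))       # None: no incident at 10pm
print(first_covered_level(with_backstop, 22))  # 3: assigned to level 3
```

Running a check like this against your own schedules before adding them to a policy is one way to spot hours where no level has coverage.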
Add a repeat rule
A "Repeat" rule acts as a safety net when an incident has gone through all levels of an escalation policy. If an incident reaches the end of an escalation policy and no action has been taken on it, you can have the escalation policy repeat itself so that no incident falls through the cracks. To set up a repeat, edit the escalation policy and check the "If no one acknowledges..." box. You can repeat an escalation policy up to 9 times.
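To get a feel for how long an unacknowledged incident keeps notifying people, the following Python sketch lays out a timeline for a policy with a repeat rule. It assumes that each repeat restarts at level 1 once the last level's timeout has elapsed; the `escalation_timeline` function and its parameters are hypothetical names for illustration only.

```python
from typing import List, Sequence, Tuple

MAX_REPEATS = 9  # the article states a policy can repeat up to 9 times


def escalation_timeline(timeouts_minutes: Sequence[int], repeats: int) -> List[Tuple[int, int]]:
    """Return (minutes_elapsed, level) pairs for an unacknowledged incident.

    timeouts_minutes[i] is the assumed timeout of level i + 1.
    The policy runs once, then `repeats` more times.
    """
    if not 0 <= repeats <= MAX_REPEATS:
        raise ValueError(f"repeats must be between 0 and {MAX_REPEATS}")

    timeline = []
    elapsed = 0
    for _ in range(repeats + 1):
        for level, timeout in enumerate(timeouts_minutes, start=1):
            timeline.append((elapsed, level))
            elapsed += timeout
    return timeline


# Three levels with 5-minute timeouts, repeated once.
for minute, level in escalation_timeline([5, 5, 5], repeats=1):
    print(f"t+{minute:>2} min: notify level {level}")
```

With these example values, level 1 is notified again 15 minutes after the incident triggers, and the last notification in the timeline goes to level 3 at 25 minutes.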
Add multiple escalation levels
We recommend setting up multiple escalation levels so that incidents escalate to a backup or tier 2 support team when, for example, the primary on-call does not respond within a specified time period or needs to hand the incident to another level of support within that escalation policy.
Some teams will also add their manager to their escalation process.
Notify multiple people at once
Notifying multiple people at once is useful in an 'all hands on deck' situation, or when a critical incident needs to notify all members of a team if the primary person does not respond immediately.
Not sure if notifying multiple people at once fits your use case? Check out this article here.
Create separate escalation policies based on the criticality of your incidents
If your incidents will have different levels of criticality, make sure your escalation timeout periods reflect this.
For example, some customers will have a 5 minute escalation timeout period between escalation levels for extremely critical incidents.
Incidents that aren't so critical might have a 10, 15, or 20 minute escalation.
You will need to create separate escalation policies with these different escalation timeout periods and associate these escalation policies with separate services. These services should capture only those incidents that should follow the respective escalation policies that you created.
Note: Low urgency incidents do not escalate unless they become high urgency.
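As a rough illustration of separate policies per criticality, the Python sketch below models two escalation policies with different timeouts and maps each service to one of them. The policy names, service names, and the `minutes_until_level` helper are hypothetical; the 5 and 15 minute timeouts are taken from the examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationPolicy:
    name: str
    timeout_minutes: int  # assumed to be the same between every pair of levels


CRITICAL = EscalationPolicy(name="critical", timeout_minutes=5)
STANDARD = EscalationPolicy(name="standard", timeout_minutes=15)

# Hypothetical services, each associated with exactly one policy so that it
# only captures incidents that should follow that policy.
SERVICE_POLICIES = {
    "checkout-api": CRITICAL,
    "internal-reports": STANDARD,
}


def minutes_until_level(policy: EscalationPolicy, level: int) -> int:
    """Minutes after triggering before an unacknowledged incident reaches `level`."""
    if level < 1:
        raise ValueError("level numbers start at 1")
    return policy.timeout_minutes * (level - 1)


for service, policy in SERVICE_POLICIES.items():
    print(f"{service}: reaches level 3 after {minutes_until_level(policy, 3)} minutes")
```

In this example an unacknowledged incident on the critical service reaches level 3 after 10 minutes, compared with 30 minutes on the standard one.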