Escalation policies allow you to connect services to on-call schedules, and they ensure that the right people are notified at the right time.
To connect a schedule to a service via escalation policy:
- Create an escalation policy with a schedule as a target (skip to step 3 after creating), or add your schedule to an existing escalation policy by navigating to Configuration → Escalation Policies.
- Find the escalation policy you want, click the gear icon, and select Edit. Locate the escalation level where you would like to add your schedule, then search for the schedule by name or select it from the dropdown in the Notify the following users or schedules field. Click Save.
- Now that you have your schedule configured with an escalation policy, navigate to Configuration and select Services.
- If you are creating a new service, click +New Service. If you are adding this escalation policy to an existing service, find that service, click the gear icon and select Edit Service.
- In the Incident Settings section, select the Escalation Policy created or identified in step 1 (above).
- Click Save changes.
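The relationship these steps build (a service points at an escalation policy, whose levels point at schedules) can be pictured with a small data model. This is only an illustrative sketch: the class names, the `schedules_for_service` helper, and the example names "Web On-Call", "Web Team Policy", and "Checkout API" are hypothetical and are not part of PagerDuty.

```python
from dataclasses import dataclass, field


@dataclass
class Schedule:
    name: str


@dataclass
class EscalationLevel:
    # Schedules notified together at this level.
    targets: list[Schedule] = field(default_factory=list)


@dataclass
class EscalationPolicy:
    name: str
    levels: list[EscalationLevel] = field(default_factory=list)


@dataclass
class Service:
    name: str
    escalation_policy: EscalationPolicy


def schedules_for_service(service: Service) -> list[str]:
    """Return the schedule names a service reaches through its policy."""
    return [
        target.name
        for level in service.escalation_policy.levels
        for target in level.targets
    ]


# Steps 1-2: the schedule is a target on a level of the escalation policy.
policy = EscalationPolicy(
    "Web Team Policy", [EscalationLevel([Schedule("Web On-Call")])]
)
# Steps 3-6: the service is saved with that escalation policy selected.
service = Service("Checkout API", policy)
linked = schedules_for_service(service)
print(linked)
```

The service never references a schedule directly; the escalation policy is always the link between the two.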
Escalation policies and schedules can be configured in many ways to fit your team’s needs. Below are examples of common, more complex methods of configuring these features together to achieve particular goals.
Configuring schedules with different escalation policy levels allows you to satisfy two different use cases:
- Notifying multiple users at once based on a specific span of time. For instance, if you want more than one person to be notified only during the weekend or after business hours.
- Notifying a Secondary on-call user when the Primary on-call does not respond.
In many cases, you will have one group of people on-call during business hours and a different group on-call outside business hours, or different groups on-call at various times of the day. By default, an escalation policy notifies one person at a time, but a common objective is to notify multiple people or teams at the same time.
To notify multiple users at the same time during a specific span of time:
- Create an individual on-call schedule for each person or team that reflects their on-call rotation.
- Add each of these schedules to the appropriate layer of an escalation policy.
To demonstrate this, we'll use the following example:
- George, Emily, and Jennifer are on-call during business hours 0800-1700 and should all be notified at the same time.
- Naomi, Liam, and Max are on-call outside of business hours 1700-0800 and should all be notified at the same time.
- On the weekends, both George and Naomi are on-call and should be notified at the same time.
The first step is creating an individual on-call schedule for each user that reflects their specific on-call rotation and times.
For example, below are George and Max's schedules. Notice that we restrict the on-call times to certain times of the week for each schedule. Max is only on-call between 1700-0800 on weekdays:
George is only on-call between 0800-1700 on weekdays and all day on weekends:
Once we have created each user's on-call schedule, we can then add each schedule to an escalation policy.
In the screenshot below, we have added each user's schedule (6 in total) to the first layer of an escalation policy. This means that if an incident is triggered between 0800 and 1700 on a weekday, George, Emily, and Jennifer will be assigned and notified at the same time, since they are the only users on-call within that escalation layer.
When an incident is triggered between 1700 and 0800 on a weekday, Naomi, Liam, and Max will be assigned and notified at the same time, since they are the only users on-call within that escalation layer.
On the weekends, only George and Naomi will be assigned and notified at the same time, since they are the only ones on-call over the weekend in that escalation layer.
Notice that we have set up our escalation policy so that the same group of users is re-notified after 10 minutes if they haven’t acknowledged or resolved the incident. If the group does not respond 10 minutes after that, then the incident is escalated to their manager, Tony Wagner.
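To make the example concrete, the first escalation layer can be modeled as six per-person schedules, each answering "is this person on-call right now?". This is a simplified sketch, not PagerDuty's implementation. The names `SCHEDULES` and `notified_at_level_one` are hypothetical, and the model assumes that the weekend means all of Saturday and Sunday and that weekday off-hours coverage stops at the weekend boundary.

```python
from datetime import datetime

BUSINESS_START, BUSINESS_END = 8, 17  # 0800-1700


def is_weekend(ts: datetime) -> bool:
    return ts.weekday() >= 5  # Saturday is 5, Sunday is 6


def in_business_hours(ts: datetime) -> bool:
    return not is_weekend(ts) and BUSINESS_START <= ts.hour < BUSINESS_END


def off_hours(ts: datetime) -> bool:
    return not is_weekend(ts) and not in_business_hours(ts)


# One schedule per person; each value says when that person is on-call.
SCHEDULES = {
    "George": lambda ts: in_business_hours(ts) or is_weekend(ts),
    "Emily": in_business_hours,
    "Jennifer": in_business_hours,
    "Naomi": lambda ts: off_hours(ts) or is_weekend(ts),
    "Liam": off_hours,
    "Max": off_hours,
}


def notified_at_level_one(ts: datetime) -> list[str]:
    """Everyone whose schedule has them on-call at ts is notified together."""
    return sorted(name for name, on_call in SCHEDULES.items() if on_call(ts))


print(notified_at_level_one(datetime(2024, 1, 2, 10, 0)))  # a Tuesday morning
print(notified_at_level_one(datetime(2024, 1, 2, 22, 0)))  # a Tuesday night
print(notified_at_level_one(datetime(2024, 1, 6, 10, 0)))  # a Saturday
```

Because all six schedules sit on the same layer, the set of people notified changes with the time of the incident without any change to the escalation policy itself.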
To create Primary and Secondary on-call users:
- Create individual schedules for both the Primary and Secondary on-call levels.
- Add each schedule to the appropriate escalation level.
The first step is creating individual on-call schedules to represent Primary and Secondary (etc.) levels that reflect their specific on-call rotations and times.
In the example below, we have two on-call schedules: a Primary schedule and a Secondary schedule. Both schedules have the same rotation (i.e. weekly at 09:00) but with different people on-call. The person from the Primary schedule is our first responder and the person from the Secondary schedule is our back-up in case the first responder does not take action on an incident.
Why create two separate on-call schedules? If I have two people on-call at the same time, wouldn't it make more sense to create one schedule with two schedule layers?
No. See the section below explaining why schedules alone are ineffective for multi-user notifications.
Once we have created individual on-call schedules to represent Primary and Secondary levels, we can then add each schedule to an escalation policy.
In the screenshot below, the Primary schedule is the first escalation rule and the Secondary schedule is the second escalation rule. Expected behavior for this escalation policy is as follows:
- When an incident is triggered, it will be immediately assigned to whoever is on-call in the Primary schedule.
- The Primary schedule user has 30 minutes to take action on the incident (i.e. acknowledge, resolve, re-assign).
- If the user on the Primary schedule does not take action on the incident within 30 minutes, the incident will be escalated and assigned to the user on-call on the Secondary schedule.
- If the person on the Secondary schedule does not take action on the incident in 30 minutes, then the incident is reassigned and escalated to the person on the Primary schedule. This is accomplished by clicking the box that says "If no one acknowledges, repeat this policy" and setting the number of policy repeats.
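The escalation timeline in the list above can be sketched as a function of the minutes elapsed since the incident was triggered with no one taking action. This is a simplified model under stated assumptions, not PagerDuty's implementation: `assigned_level` and its parameters are hypothetical names, every level uses the same 30-minute timeout, and once the repeats are used up the incident is assumed to stay with the last level.

```python
def assigned_level(minutes_elapsed, levels, delay_minutes=30, repeats=1):
    """Return which level holds an unacknowledged incident.

    levels:  escalation levels in order, e.g. ["Primary", "Secondary"]
    repeats: how many times the whole policy repeats after the first pass
    """
    if minutes_elapsed < 0:
        raise ValueError("minutes_elapsed must be non-negative")
    total_steps = len(levels) * (1 + repeats)
    # Each timeout moves the incident one step; cap at the final step.
    step = min(minutes_elapsed // delay_minutes, total_steps - 1)
    return levels[step % len(levels)]


levels = ["Primary", "Secondary"]
for minute in (0, 29, 30, 60, 90):
    print(minute, assigned_level(minute, levels))
```

With one repeat, the incident returns to the Primary schedule 60 minutes after it was triggered; with `repeats=0` it would remain with the Secondary schedule instead.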
Alternatively, if you would like both of your Primary and Secondary on-call staff to be notified immediately after an incident is triggered, you can utilize multi-user alerting by adding both schedules to the same escalation level.
You must have a minimum of 5 minutes between escalation rules if you have more than one person/schedule in an escalation rule.
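The 5-minute constraint can be expressed as a simple check. The sketch below is illustrative only: `validate_rules` and the tuple layout are hypothetical, and it interprets the rule as "any escalation rule with more than one target needs a timeout of at least 5 minutes before the next rule".

```python
MIN_DELAY_MULTI_TARGET = 5  # minutes


def validate_rules(rules):
    """Check a list of (targets, delay_minutes) escalation rules.

    Returns a list of human-readable problems; an empty list means
    the rules satisfy the constraint.
    """
    problems = []
    for number, (targets, delay_minutes) in enumerate(rules, start=1):
        if len(targets) > 1 and delay_minutes < MIN_DELAY_MULTI_TARGET:
            problems.append(
                f"Rule {number} has {len(targets)} targets but escalates "
                f"after {delay_minutes} minutes; minimum is "
                f"{MIN_DELAY_MULTI_TARGET}."
            )
    return problems


rules = [
    (["Primary", "Secondary"], 2),  # two schedules, delay too short
    (["Manager"], 2),               # a single target is not restricted here
]
for problem in validate_rules(rules):
    print(problem)
```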
An important concept to grasp is that only one user can be on-call per schedule, and using a schedule on its own (without an escalation policy) will not notify multiple users at once. A natural first instinct is to add multiple layers to a schedule. However, because of the way PagerDuty's system works, the lowest (highest-numbered) layer on a schedule takes precedence over any others.
The following screenshot demonstrates what will happen if multiple layers are stacked on top of each other (Layer 3 takes precedence over Layers 1 and 2):
Thus, adding schedules and/or users to escalation policies is the only way to notify multiple users at once.
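The difference between stacking layers and using an escalation level can be shown with a toy model. This is a sketch of the precedence rule as described above, not PagerDuty's implementation; the function names and the users Ana, Ben, and Cara are hypothetical.

```python
def on_call_for_schedule(layers):
    """Resolve a schedule's layers to a single on-call user.

    layers is ordered from Layer 1 to the lowest layer. Each entry is the
    user that layer puts on-call at this moment, or None if the layer has
    no coverage. The lowest layer with coverage wins.
    """
    for user in reversed(layers):
        if user is not None:
            return user
    return None


def notified_for_level(schedules):
    """An escalation level notifies the on-call user of every schedule."""
    users = (on_call_for_schedule(layers) for layers in schedules)
    return [user for user in users if user is not None]


# Three people stacked as layers of ONE schedule: only Layer 3 is on-call.
print(on_call_for_schedule(["Ana", "Ben", "Cara"]))

# The same three people as THREE schedules on one escalation level.
print(notified_for_level([["Ana"], ["Ben"], ["Cara"]]))
```

The first call yields a single person, while the second yields all three, which is why multi-user notification has to be configured on the escalation policy.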
If you would like to pause or deactivate notifications from a service connected to a particular schedule, you will need to change the escalation policy used by that service.
To change the escalation policy for a service:
- Navigate to Configuration and select Services.
- Find the service that is notifying the schedule you would like to pause or deactivate, click the gear icon and select Edit Service.
- In the Incident Settings section, select an Escalation Policy that does not use the schedule you wish to pause or deactivate. Click Save Changes.
- Repeat this process for any additional services this escalation policy is linked to.
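When many services are involved, it helps to list every service that reaches the schedule before editing them one by one. The sketch below works on plain dictionaries that you would fill in by hand or from an export; the function names, dictionary layout, and example names are hypothetical and this does not call PagerDuty.

```python
def services_using_schedule(services, policies, schedule):
    """Return the services whose escalation policy includes the schedule.

    services: dict of service name -> escalation policy name
    policies: dict of escalation policy name -> list of schedule names
    """
    return sorted(
        name for name, policy in services.items()
        if schedule in policies[policy]
    )


def reassign(services, policies, schedule, replacement_policy):
    """Return a copy of services with affected ones moved to a new policy."""
    if schedule in policies[replacement_policy]:
        raise ValueError("replacement policy still uses the schedule")
    updated = dict(services)
    for name in services_using_schedule(services, policies, schedule):
        updated[name] = replacement_policy
    return updated


policies = {
    "Web Policy": ["Web On-Call"],
    "Fallback Policy": ["Ops On-Call"],
}
services = {"Checkout API": "Web Policy", "Billing": "Fallback Policy"}

print(services_using_schedule(services, policies, "Web On-Call"))
print(reassign(services, policies, "Web On-Call", "Fallback Policy"))
```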