You can run a response play whenever an incident is created on a service. Choose the service and click on the Edit Settings button. Once on the edit screen, you can choose or clear the response play that will be used for new incidents on the chosen service.
Urgencies let you customize how your team is notified based on the criticality of an incident: incidents can be either high-urgency (requires immediate attention) or low-urgency (can wait). What does this mean for responders? As an incident responder, you can set up notification rules so that you won't be woken up for low-urgency incidents that can be handled in the morning, or you can set a service to notify you with only high-urgency or low-urgency notification methods at specific times of day.
- Step 1: Configure Service
- Step 2: Configure User Profiles
- Use Case 1: Critical and Non-Critical Incidents
- Use Case 2: Support Hours
Urgencies are defined at the Service level, and all services will initially default to High. This can be adjusted within the settings of a service; to access the settings, navigate to the service you wish to adjust and click Edit Service.
Depending on your account's plan, you can apply one of either two or four options to all incidents originating from a particular service:
- Notify responders until someone responds, escalate as needed (use high-urgency notification rules)
- Notify responders, do not escalate (use low-urgency notification rules)
- Use alert severity to determine how responders are notified for each incident
- Use defined support hours to determine how responders are notified
Low-urgency incidents will not escalate. Only high-urgency incidents will escalate according to the rules defined in an escalation policy.
Additionally, accounts with access to Support Hours can configure a setting to raise the urgency of all open incidents once on-call hours begin.
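As a rough illustration of how the four service options above translate into an urgency for each incident, here is a minimal Python sketch. It is a local model, not PagerDuty's implementation or API: the names `ServiceSetting`, `resolve_urgency`, and `escalates` are hypothetical, the severity-to-urgency mapping is an assumption, and the support-hours branch assumes high urgency during support hours and low urgency outside them (other combinations are configurable).

```python
from enum import Enum


class Urgency(Enum):
    HIGH = "high"
    LOW = "low"


class ServiceSetting(Enum):
    # Hypothetical names for the four options listed above.
    ALWAYS_HIGH = "notify_until_someone_responds"
    ALWAYS_LOW = "notify_do_not_escalate"
    SEVERITY_BASED = "use_alert_severity"
    SUPPORT_HOURS = "use_support_hours"


# Assumption: these alert severities are treated as high urgency.
HIGH_SEVERITIES = {"critical", "error"}


def resolve_urgency(setting, severity="critical", in_support_hours=True):
    """Return the urgency a new incident would get under this model."""
    if setting is ServiceSetting.ALWAYS_HIGH:
        return Urgency.HIGH
    if setting is ServiceSetting.ALWAYS_LOW:
        return Urgency.LOW
    if setting is ServiceSetting.SEVERITY_BASED:
        return Urgency.HIGH if severity in HIGH_SEVERITIES else Urgency.LOW
    # SUPPORT_HOURS: assumed to be high during support hours, low outside.
    return Urgency.HIGH if in_support_hours else Urgency.LOW


def escalates(urgency):
    """Only high-urgency incidents follow the escalation policy."""
    return urgency is Urgency.HIGH


if __name__ == "__main__":
    urgency = resolve_urgency(ServiceSetting.SEVERITY_BASED, severity="warning")
    print(urgency.value, escalates(urgency))
```

The model keeps escalation as a pure function of urgency, which mirrors the rule that low-urgency incidents never escalate.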
Responders will next need to navigate to their User Profile and specify how they'd like to be notified based on the urgency of an incident.
It's a best practice to create several notification rules for high-urgency incidents and tier them so that the on-call responder will be notified numerous times until they acknowledge the incident.
With non-urgent incidents, however, the on-call responder may only want to receive a push notification, email, or no notification at all, for instance.
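The tiering idea can be sketched as a small Python model. The rule values below (push immediately, SMS after 2 minutes, phone call after 5 minutes, email only for low urgency) are illustrative assumptions, and `NotificationRule` and `notifications_sent` are hypothetical names, not part of any PagerDuty library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NotificationRule:
    urgency: str        # "high" or "low"
    delay_minutes: int  # minutes after the incident is assigned
    method: str         # e.g. "push", "sms", "phone", "email"


# Illustrative profile: tiered rules for high urgency, a single quiet
# rule for low urgency.
RULES = [
    NotificationRule("high", 0, "push"),
    NotificationRule("high", 2, "sms"),
    NotificationRule("high", 5, "phone"),
    NotificationRule("low", 0, "email"),
]


def notifications_sent(rules, urgency, acknowledged_after_minutes=None):
    """List the methods that fire before the incident is acknowledged.

    If acknowledged_after_minutes is None, the incident is never
    acknowledged and every matching rule fires.
    """
    matching = sorted(
        (rule for rule in rules if rule.urgency == urgency),
        key=lambda rule: rule.delay_minutes,
    )
    if acknowledged_after_minutes is None:
        return [rule.method for rule in matching]
    return [
        rule.method
        for rule in matching
        if rule.delay_minutes < acknowledged_after_minutes
    ]


if __name__ == "__main__":
    # Acknowledged 3 minutes in: the phone call at minute 5 never fires.
    print(notifications_sent(RULES, "high", acknowledged_after_minutes=3))
```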
The level of urgency you choose for a service depends on your use case. For example, suppose we have two Nagios services. One is for CRITICAL Nagios events that require immediate attention; the other is for NON-CRITICAL Nagios events that don't require immediate attention.
We would want the on-call engineer to respond to incidents originating from the Nagios Critical service immediately. Incidents on this service will trigger as high-urgency.
On the other hand, there may be less critical incidents that need to be addressed, but that do not require our immediate attention. We'll configure the Nagios Non-Critical service to trigger low-urgency incidents.
Let's take a support team as another example. Since a support team responds to customer inquiries during business hours, we want incidents to be treated as high-urgency during business hours and low-urgency after hours. This way, the team is fresh for work the next day!
You have a few options for configuring your support hours. Your options are:
- Notify until someone responds (escalates) -- maps to high urgency notification rules
- Notify but do not escalate -- maps to low urgency notification rules
- Notify based on alert severity -- Dynamic Notifications
All three of these options are available both during support hours and outside of support hours.
Beneath your support hours, you can select the option Raise urgency of all triggered incidents for this service to High when service support hours begin. When support hours start, all open incidents on the service become high urgency, and responders are notified using their high urgency notification rules until someone responds.
This setting is useful for teams that rely on support hours to give responders a rest during off-hours, but want to make sure that any incidents that come in during that time are handled with some urgency once support hours begin.
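To make the support-hours behavior concrete, the following Python sketch models a service that is high urgency during support hours and low urgency otherwise, plus the optional raise at the start of support hours. The 09:00 to 17:00, Monday to Friday window and all function names are assumptions chosen for illustration. Time zones are ignored for brevity.

```python
from datetime import datetime, time

# Assumed support window: Monday to Friday, 09:00 to 17:00 local time.
SUPPORT_START = time(9, 0)
SUPPORT_END = time(17, 0)
SUPPORT_DAYS = {0, 1, 2, 3, 4}  # datetime.weekday(): Monday is 0


def in_support_hours(moment):
    """True if the moment falls inside the assumed support window."""
    return (
        moment.weekday() in SUPPORT_DAYS
        and SUPPORT_START <= moment.time() < SUPPORT_END
    )


def urgency_at_trigger(moment):
    """Urgency for an incident triggered at the given moment."""
    return "high" if in_support_hours(moment) else "low"


def raise_on_support_start(open_incidents):
    """Model the optional setting: when support hours begin, every open
    incident on the service becomes high urgency."""
    return [{**incident, "urgency": "high"} for incident in open_incidents]


if __name__ == "__main__":
    overnight = datetime(2024, 1, 8, 22, 0)  # a Monday, 22:00
    incident = {"number": 1, "urgency": urgency_at_trigger(overnight)}
    print(incident)                            # low urgency overnight
    print(raise_on_support_start([incident]))  # raised the next morning
```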
Urgencies can be configured to notify you based on your notification rules. When both high and low urgency incidents have been triggered, PagerDuty makes it easy to pinpoint which incidents are high urgency and which are low urgency by color coding them and prioritizing them according to their urgency level.
In the PagerDuty Dashboard, incident urgency is indicated in the Urgency column. High urgency incidents will be displayed at the top, followed by low urgency incidents. High urgency incidents will also have a lighter background than low urgency incidents. This helps you quickly identify incidents that are most critical.
For example, in the screenshot below, incident #142 was triggered most recently, but because it’s low urgency, it’s displayed below the high urgency incidents.
Incidents will maintain this urgency-based order, even after acknowledgment. In the example below, incidents #140 and #142 are acknowledged. The high urgency incident remains at the top of the table, followed by the low urgency incident.
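The ordering described above can be approximated with a two-part sort key: urgency first, then recency. This Python sketch uses made-up incident data (incident #141 is hypothetical) and assumes that within the same urgency the most recent incident is listed first. Acknowledgement status is deliberately not part of the key.

```python
# Lower rank sorts first, so high urgency always precedes low urgency.
URGENCY_RANK = {"high": 0, "low": 1}

# Illustrative data; higher numbers were triggered more recently.
incidents = [
    {"number": 140, "urgency": "high", "status": "acknowledged"},
    {"number": 141, "urgency": "high", "status": "triggered"},
    {"number": 142, "urgency": "low", "status": "acknowledged"},
]


def dashboard_order(items):
    """Sort by urgency, then newest first. Status does not affect order."""
    return sorted(
        items,
        key=lambda item: (URGENCY_RANK[item["urgency"]], -item["number"]),
    )


if __name__ == "__main__":
    for incident in dashboard_order(incidents):
        print(incident["number"], incident["urgency"], incident["status"])
```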
If you have high and low urgency incidents open, high urgency incidents will be displayed at the top of the queue and low urgency incidents at the bottom.
After an incident has been acknowledged, the bar to the left side of the incident will change from red to yellow.
High urgency incidents are still displayed at the top of queue, even after low urgency incidents have been acknowledged.
If you want an incident to re-trigger after a given period of time, you can set up an acknowledgement timeout period for your service. This option ensures that an incident isn't forgotten after being acknowledged.
With this feature enabled, an incident will re-trigger and re-notify the user it is currently assigned to in the escalation policy. This means that once the acknowledgement timeout expires, the acknowledged incident will again become a triggered incident.
An incident re-triggered by the acknowledgement timeout will re-notify the user who acknowledged it and, if on-call responsibilities have changed since the acknowledgement, the user who is currently on call.
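The timeout behavior can be sketched in a few lines of Python. This is a simplified local model with hypothetical function names. It assumes the timeout is measured from the moment of acknowledgement and that a disabled timeout is represented as `None`.

```python
from datetime import datetime, timedelta


def incident_status(acknowledged_at, now, ack_timeout):
    """Return "triggered" once the acknowledgement timeout has expired.

    ack_timeout is a timedelta, or None when the timeout is disabled.
    """
    if ack_timeout is None:
        return "acknowledged"
    if now - acknowledged_at >= ack_timeout:
        return "triggered"
    return "acknowledged"


def users_to_renotify(acknowledger, current_on_call):
    """Users notified when the incident re-triggers: the acknowledger
    and the current on-call user, without duplicates."""
    return list(dict.fromkeys([acknowledger, current_on_call]))


if __name__ == "__main__":
    acked = datetime(2024, 1, 8, 9, 0)
    timeout = timedelta(minutes=30)
    print(incident_status(acked, acked + timedelta(minutes=10), timeout))
    print(incident_status(acked, acked + timedelta(minutes=45), timeout))
    print(users_to_renotify("alice", "bob"))
```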
You can configure the length of the timeout or turn it off completely in the settings for a service.
Go to the Configuration menu and select Services.
Click on the service you want to edit.
Click Edit Service in the top right corner of the page.
Under Incident Settings, select a time from the drop-down list. Or, to disable the timeout, clear the Acknowledgement timeout checkbox.
If a user wants to quiet an individual incident for a different length of time than the service's acknowledgement timeout, they can snooze the incident.