Incidents
Trigger, acknowledge and resolve incidents created by service integrations
An incident represents a problem or an issue that needs to be addressed and resolved. Incidents trigger on services, and a service’s escalation policy prompts notifications to go out to on-call responders to remediate the issue.
Incident Statuses
- Triggered: An active service (i.e., someone is on call and the service is not disabled or in maintenance mode) will trigger an incident when it receives an event. The incident will escalate according to the service's escalation policy. By default, PagerDuty sends notifications when an incident is triggered, but not when it is acknowledged or resolved. Users can create their own notification rules, or use webhooks, to receive notifications when an incident is acknowledged or resolved.
- Acknowledged: An acknowledged incident is being worked on, but is not yet resolved. The user who acknowledges an incident claims ownership of the issue and halts the escalation process. While an incident is acknowledged, notifications are not sent until the acknowledgement timeout is reached. If the acknowledgement timeout is reached, the incident returns from the acknowledged status to the triggered status, and the escalation process resumes.
- Resolved: A resolved incident has been fixed. Once an incident is resolved, no additional notifications are sent and the incident cannot be triggered again.
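The three statuses above can be read as a small state machine. The following is a minimal sketch of that reading, not PagerDuty's implementation; the `Status` enum, the action names, and `next_status` are hypothetical names chosen for illustration.

```python
from enum import Enum


class Status(Enum):
    TRIGGERED = "triggered"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"


# Transitions taken from the status descriptions above.
# "timeout" models the acknowledgement timeout being reached.
# RESOLVED has no outgoing transitions: a resolved incident
# cannot be triggered again.
TRANSITIONS = {
    (Status.TRIGGERED, "acknowledge"): Status.ACKNOWLEDGED,
    (Status.TRIGGERED, "resolve"): Status.RESOLVED,
    (Status.ACKNOWLEDGED, "timeout"): Status.TRIGGERED,
    (Status.ACKNOWLEDGED, "resolve"): Status.RESOLVED,
}


def next_status(current: Status, action: str) -> Status:
    """Return the status after `action`, or raise if the move is not allowed."""
    try:
        return TRANSITIONS[(current, action)]
    except KeyError:
        raise ValueError(
            f"cannot {action} an incident that is {current.value}"
        ) from None
```

For example, `next_status(Status.ACKNOWLEDGED, "timeout")` returns `Status.TRIGGERED`, while any action on a resolved incident raises `ValueError`.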
Priority, Urgency and Severity
- Priority is tied to incidents and it specifies the order in which incidents should be addressed (P1, Sev-1, etc.). Please see our article on Incident Priority for more information.
- Urgency is tied to incidents and it determines how you should be notified if an incident is assigned to you (high or low). Please see our article on Notification Urgency for more information.
- Severity is tied to alerts and it describes the impact on a specific service or piece of infrastructure (critical, warning, error, etc.). Please see our article on Event/Alert Severity Levels for more information.
Incident Lifecycle
1. Received through Services
PagerDuty receives events from monitoring systems via integrations. An event creates an alert and an associated incident in PagerDuty.
Note
Suppression can be used to collect data without triggering an incident or notifying responders.
2. Assignment via Escalation Policies and Schedules
Unlike an alert or a suppressed event, an incident must be assigned to a user. The escalation policy determines whom an incident is assigned to. An escalation policy has one or more levels, and each level can accept either a schedule or a user as a target. An incident will escalate through the levels of an escalation policy until it finds someone who is on-call. This user will be notified and the incident will be assigned to them. If the user fails to acknowledge the incident before the escalation timeout, the incident escalates to the next escalation level.
Incidents are only created when an escalation policy has an on-call user. In other words, if there is nobody to assign an incident to when an event is sent to PagerDuty (due to a coverage gap on a schedule, for example), then an incident will not be created.
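The assignment rule described above can be sketched as a walk over escalation levels. This is a simplified illustration, not PagerDuty's actual logic: `assign` is a hypothetical helper, and `levels` is an assumed, already-resolved view of a policy in which each level is the list of users currently on call (empty when a schedule has a coverage gap).

```python
from typing import Optional, Sequence, Tuple


def assign(
    levels: Sequence[Sequence[str]], start: int = 0
) -> Optional[Tuple[int, str]]:
    """Return (level_index, user) for the first level at or after `start`
    that has someone on call, or None if nobody is on call."""
    for index in range(start, len(levels)):
        on_call_users = levels[index]
        if on_call_users:
            return index, on_call_users[0]
    # Nobody to assign to: per the text above, no incident is created.
    return None


# Level 0 has a coverage gap, so the incident lands on level 1.
policy = [[], ["alice"], ["bob"]]
first = assign(policy)

# If the assignee does not acknowledge before the escalation timeout,
# look for the next level after the current one.
escalated = assign(policy, start=first[0] + 1)
```

Here `first` is `(1, "alice")` and `escalated` is `(2, "bob")`.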
3. Notifications via Push, Phone, Slack, Email or SMS
Each user configures notification rules in their user profile. PagerDuty contacts users according to their notification rules until the incident is acknowledged, resolved, or escalated, either manually or due to escalation timeout.
4. Acknowledging and Resolving
Notifications provide a way for responders to acknowledge that they're working on an incident or that it's been resolved. Depending on a user's permissions, it's also possible for users who are not currently assigned to an incident to acknowledge or resolve an incident on the Incidents dashboard in the web app.
Resolving an incident closes the incident, while acknowledging only halts escalation. If the incident is not resolved before the end of the service's acknowledgement timeout, it re-triggers and continues to escalate.
It is important to note that alerts cannot be acknowledged, only triggered or resolved. If all alerts in an incident are resolved, the incident will be resolved. Similarly, when an incident is resolved, all alerts under that incident are also resolved.
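The relationship between alert and incident resolution can be sketched with two small classes. This is only a model of the rule stated above; `Alert`, `Incident`, and their methods are hypothetical names, not PagerDuty objects.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alert:
    # Alerts are only ever triggered or resolved, never acknowledged.
    resolved: bool = False


@dataclass
class Incident:
    alerts: List[Alert] = field(default_factory=list)
    resolved: bool = False

    def resolve_alert(self, alert: Alert) -> None:
        """Resolve one alert; the incident resolves once all alerts are resolved."""
        alert.resolved = True
        if all(a.resolved for a in self.alerts):
            self.resolved = True

    def resolve(self) -> None:
        """Resolve the incident and every alert under it."""
        self.resolved = True
        for alert in self.alerts:
            alert.resolved = True
```

Resolving the last open alert flips the incident to resolved, and resolving the incident directly cascades to all of its alerts.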
Incident Timeline
Each incident has a Timeline tab in the incident details page, showing timestamps of each incident status along with all other actions taken and notifications sent from the incident.
Trigger an Incident
There are multiple ways to trigger PagerDuty incidents depending on your use case:
- Trigger an Incident via Integration
- Trigger an Incident via Web App
- Trigger an Incident via Mobile App
- Trigger an Incident via API
- Trigger an Incident via Email Integration
- Trigger an Incident via Slack
Incident Trigger Limitations
- Incidents triggered via email or the Events API have a trigger limit of 100. After receiving 100 triggers, the Alerts Log stops showing more events. If you would like to send more events, you must first resolve the incident.
- In order for an incident to trigger, someone must be on-call per the service's escalation policy. If no one is on-call an incident will not trigger.
High Open Incident Volume
For services with over 100K open incidents, we automatically enable the auto-resolve feature and require it to remain enabled. When this feature is enabled, all new incidents for that particular service will be auto-resolved after they have been open for 24 hours, and no further notifications will be sent for those incidents.
It will not be possible to disable this feature for the service in question unless the service's open incident count is reduced to under 100K. To reduce the open incident count, we recommend using the update an incident API to bulk resolve incidents. Additionally, you can use this script for an automated way to bulk resolve incidents.
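As a rough sketch of bulk resolution, the helper below builds batched `PUT` requests without sending them. It assumes the publicly documented REST API v2 conventions (the `https://api.pagerduty.com/incidents` endpoint, `Token token=` authorization, and a `From` header carrying a user's email); `bulk_resolve_requests` and the default `batch_size` are hypothetical, so check the API Reference for the actual per-request limit and payload shape before relying on it.

```python
import json
import urllib.request
from typing import Iterator, List

# Assumed REST API v2 endpoint for updating multiple incidents.
API_URL = "https://api.pagerduty.com/incidents"


def bulk_resolve_requests(
    incident_ids: List[str],
    api_key: str,
    from_email: str,
    batch_size: int = 100,  # assumption; confirm the real cap in the API Reference
) -> Iterator[urllib.request.Request]:
    """Yield one unsent PUT request per batch of incident IDs."""
    for start in range(0, len(incident_ids), batch_size):
        batch = incident_ids[start:start + batch_size]
        body = {
            "incidents": [
                {"id": incident_id, "type": "incident_reference", "status": "resolved"}
                for incident_id in batch
            ]
        }
        yield urllib.request.Request(
            API_URL,
            data=json.dumps(body).encode("utf-8"),
            method="PUT",
            headers={
                "Authorization": f"Token token={api_key}",
                "Accept": "application/vnd.pagerduty+json;version=2",
                "Content-Type": "application/json",
                "From": from_email,
            },
        )
```

Each yielded request could then be sent with `urllib.request.urlopen(request)`, ideally with a pause between batches to stay within rate limits.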
Trigger an Incident via Integration
It is a common workflow to integrate with a third-party platform (e.g., a monitoring tool), and to configure the integration to trigger an incident in PagerDuty when specific criteria are met. Visit our Integrations library for more information about integrating the products in your tool chain with PagerDuty.
Event Orchestration
Event Orchestration is a feature that allows you to centralize your integrations by sending all your events to a single endpoint, and then routing alerts to the appropriate service based on the logic that you provide.
Trigger an Incident via Web App
You can manually open an incident on the Incidents page or a service's details page. Manually opening an incident will trigger an incident and notify the on-call responder(s). A common use case is to test notification rules, or to contact the on-call person to let them know about an issue on a particular service.
Required User Permissions
All users, except Limited Stakeholders and Full Stakeholders, can manually trigger incidents.
Restricted Access and Observer users can only trigger incidents for Teams they are associated with.
Availability
Depending on your account's pricing plan, some features below may not be available. Please contact our Sales Team to upgrade your pricing plan.
- On the Incidents page or a service's details page, click New Incident in the top-right.
- In the Create New Incident dialog, enter the following:
Field | Description |
---|---|
Title (Required) | Enter a title for the incident. |
Incident Type (Required) | Select Base Incident. |
Impacted Service (Required) | Select the impacted service for the incident. |
Description | Optionally enter an incident description. |
Urgency | Optionally assign an urgency level to the incident: High or Low. If you have configured support hours on the service, leave this option blank to use the default urgency setting, or make a selection to override it. You may only select urgency on services where it has been enabled. Please see our section on notification urgencies for more information. |
Priority | Optionally assign an incident priority. You may only select priority on services where it has been enabled. Please see our article on incident priority for more information. |
Assignee (Required) | Assign the incident by selecting an escalation policy or a user from their respective tabs. This selection will override the service's escalation policy, and the incident will notify the escalation policy or user you've selected. |
Advanced Options | |
Add additional responders to help | Optionally add responders to the incident by selecting escalation policies and/or users from their respective tabs. |
Add a conference bridge | Optionally add a conference bridge to the incident by selecting a preconfigured Conference Bridge from the dropdown. |
Dial-in number | Optionally enter a dial-in number. |
Meeting URL | Optionally enter a meeting URL. |
- Click Create Incident.
Self-Assigned Incidents
If you assign a manually triggered incident to yourself, PagerDuty will not notify you. The incident will be in an Acknowledged state, since it is understood that you are aware of the incident and working to resolve it.
Trigger an Incident via Mobile App
Please read our article about triggering incidents in the mobile app for more information.
Trigger an Incident via API
Events API
If a service has an API integration, you can trigger an incident via Events API by sending a properly-formatted POST request with your integration key. Integration keys can be found by navigating to Services > Service Directory, selecting the service where the integration is configured, opening the Integrations tab, and clicking the icon to the right of the integration.
Developer Documentation
More info about the Events API can be found here. Please see this article for code samples in Ruby, Python, and PHP.
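For illustration, a trigger event might be built as follows. This sketch assumes the Events API v2 format (the `https://events.pagerduty.com/v2/enqueue` endpoint, with the integration key sent as `routing_key`); `trigger_request` is a hypothetical helper, and the request is constructed but not sent.

```python
import json
import urllib.request
from typing import Optional

# Assumed Events API v2 endpoint; confirm against the developer documentation.
EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"


def trigger_request(
    integration_key: str,
    summary: str,
    source: str,
    severity: str = "critical",
    dedup_key: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request that triggers an incident."""
    if severity not in {"critical", "error", "warning", "info"}:
        raise ValueError(f"unsupported severity: {severity}")
    event = {
        "routing_key": integration_key,
        "event_action": "trigger",
        "payload": {"summary": summary, "source": source, "severity": severity},
    }
    if dedup_key is not None:
        # Events sharing a dedup_key are grouped into the same alert.
        event["dedup_key"] = dedup_key
    return urllib.request.Request(
        EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

A call such as `trigger_request("YOUR_INTEGRATION_KEY", "Disk usage above 95%", "db-01")` returns a request that can be sent with `urllib.request.urlopen`.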
REST API
You may also trigger incidents using the REST API.
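A REST API trigger could look roughly like the sketch below. It assumes the REST API v2 "create an incident" conventions (`POST https://api.pagerduty.com/incidents`, a `From` header with the requesting user's email, and a `service_reference` for the impacted service); `create_incident_request` is a hypothetical helper that only builds the request.

```python
import json
import urllib.request
from typing import Optional

# Assumed REST API v2 endpoint for creating incidents.
REST_URL = "https://api.pagerduty.com/incidents"


def create_incident_request(
    api_key: str,
    from_email: str,
    service_id: str,
    title: str,
    urgency: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates an incident."""
    incident = {
        "type": "incident",
        "title": title,
        "service": {"id": service_id, "type": "service_reference"},
    }
    if urgency is not None:
        if urgency not in {"high", "low"}:
            raise ValueError(f"unsupported urgency: {urgency}")
        incident["urgency"] = urgency
    return urllib.request.Request(
        REST_URL,
        data=json.dumps({"incident": incident}).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Token token={api_key}",
            "Accept": "application/vnd.pagerduty+json;version=2",
            "Content-Type": "application/json",
            "From": from_email,
        },
    )
```

The title and impacted service mirror the required fields of the web app dialog; urgency is left out unless explicitly set, so the service default applies.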
Trigger an Incident via Email Integration
If a service has an email integration, you can trigger an incident by sending an email to the integration's email address. To view an email integration's address, go to Services > Service Directory, select the service, open the service's Integrations tab, and click the icon to the right of the integration.
When you send an email to the integration email address, an incident will trigger on that service. The incident will appear in the Incidents tab.
Trigger an Incident via Slack
If your account has the Slack integration configured, you may also trigger an incident using Slack slash commands.
Acknowledge an Incident
There are multiple ways to acknowledge PagerDuty incidents depending on your use case:
- Acknowledge an Incident via Web App
- Acknowledge an Incident via Mobile
- Acknowledge an Incident via API
- Acknowledge an Incident via Slack
Acknowledge an Incident via Web App
There are two ways to acknowledge an incident in the web app:
Acknowledge an Incident on the Open Incidents Page
- Navigate to Incidents to view the Open Incidents page.
- Select the checkbox to the left of the triggered incident you wish to acknowledge.
- Click Acknowledge in the top left actions menu.
Acknowledge an Incident on the Incident Details Page
- Click an open incident's Title to enter the incident details page.
- Click Acknowledge in the top left actions menu.
Acknowledge an Incident via Mobile
Please read our article about acknowledging incidents in the mobile app for more information.
Acknowledge an Incident via API
Please see our API Reference for more information.
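As a hedged sketch, acknowledging a single incident through the REST API might look like this. It assumes the v2 "update an incident" form (`PUT https://api.pagerduty.com/incidents/{id}` with a `From` header); `acknowledge_request` is a hypothetical helper that builds the request without sending it, and the exact payload should be confirmed in the API Reference.

```python
import json
import urllib.request


def acknowledge_request(
    api_key: str, from_email: str, incident_id: str
) -> urllib.request.Request:
    """Build (but do not send) a PUT request that acknowledges one incident."""
    body = {"incident": {"type": "incident_reference", "status": "acknowledged"}}
    return urllib.request.Request(
        # Assumed endpoint shape: the incident ID is part of the path.
        f"https://api.pagerduty.com/incidents/{incident_id}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Token token={api_key}",
            "Accept": "application/vnd.pagerduty+json;version=2",
            "Content-Type": "application/json",
            # The acknowledging user, who takes ownership of the incident.
            "From": from_email,
        },
    )
```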
Acknowledge an Incident via Slack
Please see our Slack Integration User Guide for more information.
Unacknowledge an Incident
If you accidentally acknowledge an incident, you can undo this by clicking the More... button in the incident, and then Unacknowledge Incident.
Unacknowledging an incident brings the incident back to a Triggered state, and causes notifications to be sent out again. The escalation process also resumes.
Resolve an Incident
There are multiple ways to resolve PagerDuty incidents depending on your use case:
- Resolve an Incident via Web App
- Resolve an Incident via Mobile
- Resolve an Incident via API
- Resolve an Incident via Slack
Resolve Warning
Once you resolve an incident, it cannot be reopened.
Resolve an Incident via Web App
There are two ways to resolve an incident in the web app:
Resolve an Incident on the Open Incidents Page
- Navigate to Incidents to view the Open Incidents page.
- Select the checkbox to the left of the incident you wish to resolve.
- Click Resolve in the top left actions menu.
Resolve an Incident on the Incident Details Page
- Click an open incident's Title to enter the incident details page.
- Click Resolve in the top left actions menu.
Resolve an Incident via Mobile
Please read our article about resolving incidents in the mobile app for more information.
Resolve an Incident via API
Please see our API Reference for more information.
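For incidents that were triggered through the Events API, one possible approach is to send a matching resolve event. The sketch below assumes the Events API v2 format and that the original trigger used a known `dedup_key`; `resolve_event` is a hypothetical helper that only builds the event body.

```python
from typing import Dict


def resolve_event(integration_key: str, dedup_key: str) -> Dict[str, str]:
    """Return an Events API v2 body that resolves the alert with `dedup_key`."""
    if not dedup_key:
        # Without the original dedup_key there is nothing to match against.
        raise ValueError("dedup_key is required to resolve an existing alert")
    return {
        "routing_key": integration_key,
        "event_action": "resolve",
        "dedup_key": dedup_key,
    }
```

The returned dictionary would be JSON-encoded and sent as a POST to the Events API endpoint. Because resolving every alert in an incident resolves the incident, this closes the incident when it has no other open alerts.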
Resolve an Incident via Slack
Please see our Slack Integration User Guide for more information.
Redact an Incident
Required User Permissions
This action is only available to the Account Owner. Redaction cannot be undone, not even by PagerDuty Support.
In the event that an incident contains sensitive information, the Account Owner can permanently delete the incident's details by selecting More... and clicking the Redact Incident button.
After you confirm that you would like to redact an incident's name and details, the incident will be updated to show who redacted the data and when.
Incident Redaction and Analytics
Redacting deletes the incident description and incident key, but does not affect Analytics metrics associated with the incident.
Where is incident number ___?
In the past, we made sure that incidents started at #1 and never skipped a number. There can be cases, however, where we're unable to create incidents fast enough. As a result, you might notice "missing" incident numbers. We don't delete your incident numbers, so if you see a skipped number, this means it was skipped when the incident was created. You should not see this often, and it does not indicate a problem.