Incident Lifecycle

Every PagerDuty incident goes through a basic lifecycle:

1. Received through Services

PagerDuty receives events from monitoring systems through integrations. An event from an integrated monitoring tool creates an alert and a corresponding incident in PagerDuty. Integration types differ in the inputs they accept. They can de-duplicate some repeated events and, when an event opens a new incident, they contact an escalation policy.

Not all incoming events need to result in a notification or be assigned to a user. With alerts, incidents, and suppression, data can be sent to your PagerDuty account and collected without being assigned to a user or notifying anyone.
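The behavior described in this step can be pictured with a small toy model. It is only a sketch, not PagerDuty's API: the `Service` class, the `receive` method, the `suppress` flag, and the de-duplication keys are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """Toy stand-in for a service receiving events (hypothetical, not PagerDuty's API)."""
    open_incidents: dict = field(default_factory=dict)  # dedup key -> event summaries
    suppressed: list = field(default_factory=list)      # collected, nobody notified
    notified: list = field(default_factory=list)        # keys handed to the escalation policy

    def receive(self, dedup_key, summary, suppress=False):
        if suppress:
            # Suppressed data is stored without opening an incident.
            self.suppressed.append(summary)
            return "suppressed"
        if dedup_key in self.open_incidents:
            # A repeat of an open incident is folded in, not re-notified.
            self.open_incidents[dedup_key].append(summary)
            return "deduplicated"
        # A new incident is opened and handed to the escalation policy.
        self.open_incidents[dedup_key] = [summary]
        self.notified.append(dedup_key)
        return "new incident"


svc = Service()
results = [
    svc.receive("db-01/disk", "Disk 91% full"),
    svc.receive("db-01/disk", "Disk 93% full"),
    svc.receive("db-01/cpu", "CPU spike", suppress=True),
]
print(results)
```

In this sketch only the first disk event reaches the escalation policy. The second is attached to the open incident, and the suppressed event is kept without creating one.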

2. Assignment via Escalation Policies and Schedules

Unlike an alert or a suppressed event, an incident must be assigned to a user. Assignment is done via an escalation policy associated with the integration. An escalation policy is an ordered set of levels, each made up of schedules and/or users, that an incident escalates through if it isn't acknowledged or resolved in time. Schedules are customizable calendars of who is on-call, and when. An incident escalates through the levels of an escalation policy until it finds someone who is on-call. That user is notified and the incident is assigned to them. If the user fails to acknowledge the incident before the time limit set on the escalation policy, the incident continues to escalate.
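As a rough illustration of how escalation moves through a policy, the sketch below walks a list of levels. The `Level` class and `assignee_at` function are hypothetical, and the model is simplified: it assigns only the first on-call user at each level and returns `None` once the last level times out, which may not match what a real policy does at that point.

```python
from dataclasses import dataclass


@dataclass
class Level:
    on_call: list          # users on call at this level right now (may be empty)
    timeout_minutes: int   # how long to wait for an acknowledgement


def assignee_at(levels, minutes_unacknowledged):
    """Return who holds the incident after it has gone unacknowledged this long."""
    elapsed = 0
    for level in levels:
        if not level.on_call:
            continue  # nobody on call here: move straight to the next level
        elapsed += level.timeout_minutes
        if minutes_unacknowledged < elapsed:
            return level.on_call[0]
    return None  # escalated past the last level in this simplified model


policy = [
    Level(on_call=[], timeout_minutes=15),
    Level(on_call=["alice"], timeout_minutes=30),
    Level(on_call=["bob"], timeout_minutes=30),
]
print(assignee_at(policy, 0), assignee_at(policy, 45), assignee_at(policy, 90))
```

Here the empty first level is skipped, so the incident starts with `alice` and moves to `bob` after her 30 minutes pass without an acknowledgement.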

3. Notifications via Phone, SMS, Email, or Push Notification

Each user configures how they would like to be notified in their user profile. PagerDuty contacts the user according to those notification rules until the incident is acknowledged, resolved, or escalated, either manually or due to an escalation timeout.
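One way to think about notification rules is as a list of delays and contact methods. The sketch below assumes that representation; the `(delay_minutes, channel)` pairs and the `notifications_due` function are illustrative and do not reflect how PagerDuty stores or evaluates rules.

```python
def notifications_due(rules, minutes_since_assignment, still_pending=True):
    """Return the channels that have fired so far for one assigned user.

    rules: list of (delay_minutes, channel) pairs (hypothetical format).
    still_pending: False once the incident is acknowledged, resolved,
    or escalated away from this user, at which point contact stops.
    """
    if not still_pending:
        return []
    return [
        channel
        for delay, channel in sorted(rules)
        if delay <= minutes_since_assignment
    ]


rules = [(0, "push"), (2, "sms"), (5, "phone")]
print(notifications_due(rules, 3))
print(notifications_due(rules, 3, still_pending=False))
```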

4. Acknowledging and Resolving

Notifications give the user a way to acknowledge (ack) an incident, indicating that it is being addressed, or to resolve it. Depending on their user role permissions, users who are not currently assigned to an incident may also acknowledge or resolve it via the incidents dashboard in the web application.

If a service is using alerts triage, alerts cannot be acknowledged, only triggered or resolved. If all alerts in an incident are resolved, the incident will be resolved. Conversely, when the incident is resolved, all alerts under that incident are also resolved.
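The two-way relationship between alerts and their incident can be sketched as follows. The `Incident` class and its method names are hypothetical and only mirror the rules stated above.

```python
class Incident:
    """Toy incident that groups alerts (hypothetical, for illustration only)."""

    def __init__(self, alert_ids):
        # With alerts triage, alerts are either triggered or resolved.
        self.alerts = {alert_id: "triggered" for alert_id in alert_ids}
        self.status = "triggered"

    def resolve_alert(self, alert_id):
        self.alerts[alert_id] = "resolved"
        # Once every alert is resolved, the incident resolves too.
        if all(state == "resolved" for state in self.alerts.values()):
            self.status = "resolved"

    def resolve(self):
        # Resolving the incident resolves every alert under it.
        self.status = "resolved"
        for alert_id in self.alerts:
            self.alerts[alert_id] = "resolved"


first = Incident(["a1", "a2"])
first.resolve_alert("a1")
status_after_one = first.status
first.resolve_alert("a2")

second = Incident(["b1", "b2"])
second.resolve()
print(status_after_one, first.status, second.alerts)
```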

Resolving an incident closes the incident, whereas acknowledging only halts the escalation process. If the incident is not resolved by the service's acknowledgement timeout period, it re-enters the escalation chain.
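The difference between acknowledging and resolving can be shown with a small timeline function. This is a sketch under simplifying assumptions: `status_at` and its parameters are hypothetical, time is measured in minutes from when the incident was triggered, and only a single acknowledgement is modeled.

```python
def status_at(ack_minute, ack_timeout_minutes, now_minute, resolved_minute=None):
    """Return the incident status at now_minute in this simplified model."""
    if resolved_minute is not None and now_minute >= resolved_minute:
        return "resolved"      # resolving closes the incident for good
    if now_minute < ack_minute:
        return "triggered"     # not yet acknowledged, still escalating
    if now_minute < ack_minute + ack_timeout_minutes:
        return "acknowledged"  # escalation is halted for now
    return "triggered"         # timeout elapsed, back into the escalation chain


print(status_at(5, 30, 10))
print(status_at(5, 30, 40))
print(status_at(5, 30, 40, resolved_minute=20))
```

The second call shows the case described above: the incident was acknowledged at minute 5 but not resolved within the 30-minute timeout, so it counts as triggered again.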

Services can also be configured to automatically resolve incidents through the Auto-resolution option.
