Sample Workflows

PagerDuty can be integrated into different kinds of workflows. When deciding how to integrate PagerDuty into your workflow, consider two things:

  1. How PagerDuty incidents will be triggered
  2. How users should respond to PagerDuty incidents

Here we document a few sample workflows for different types of teams:

Workflow for a DevOps team

DevOps teams directly own the services that they build and deploy, so they are also responsible, and need to be on-call, for any issues that impact those services. For this reason, it is a best practice for DevOps teams to configure monitoring on their services and to send their monitoring alerts to PagerDuty, so that they can track and be notified about any problems with their services.

  • PagerDuty incidents are immediately assigned to the team who owns the impacted service
  • PagerDuty incidents are escalated to back-up on-calls and managers
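
The escalation step above can be pictured as an ordered list of levels, each with a time limit for acknowledgement. The following is only a rough sketch of that idea, not PagerDuty's implementation: the `EscalationLevel` class, the `current_level` function, the level names, and the timeout values are all hypothetical, and the sketch simply stays on the last level once every timeout has passed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationLevel:
    target: str                  # who is notified at this level (hypothetical names)
    escalate_after_minutes: int  # minutes without acknowledgement before escalating


def current_level(levels, minutes_unacknowledged):
    """Return the target that holds the incident after the given unacknowledged time."""
    elapsed = 0
    for level in levels:
        elapsed += level.escalate_after_minutes
        if minutes_unacknowledged < elapsed:
            return level.target
    # Simplification: once every level has timed out, stay on the last one.
    return levels[-1].target


# Hypothetical policy: primary on-call, then back-up on-call, then manager.
policy = [
    EscalationLevel("primary on-call", 15),
    EscalationLevel("back-up on-call", 15),
    EscalationLevel("manager", 30),
]

for minutes in (0, 20, 45):
    print(minutes, "->", current_level(policy, minutes))
```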

How PagerDuty incidents are triggered

For a DevOps team, PagerDuty incidents are mostly triggered automatically, either via email or via the API.
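
As an illustration of the API path, here is a minimal sketch that builds and sends a "trigger" event, assuming the Events API v2 is the API in use. The function names, the example summary and source, and the `YOUR_INTEGRATION_KEY` placeholder are illustrative assumptions; in practice your monitoring tool's PagerDuty integration usually does this for you.

```python
import json
import urllib.request

# Assumed endpoint: the Events API v2 accepts events as JSON via POST.
EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"
ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def build_trigger_event(routing_key, summary, source, severity="critical", dedup_key=None):
    """Return the JSON-serializable body of a "trigger" event."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unsupported severity: {severity!r}")
    event = {
        "routing_key": routing_key,  # integration key of the target service
        "event_action": "trigger",
        "payload": {"summary": summary, "source": source, "severity": severity},
    }
    if dedup_key is not None:
        # Events that share a dedup_key are grouped into the same alert.
        event["dedup_key"] = dedup_key
    return event


def send_event(event, timeout=10):
    """POST the event and return the HTTP status code (performs a network call)."""
    request = urllib.request.Request(
        EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Build (but do not send) an example event; the key below is a placeholder.
example = build_trigger_event(
    "YOUR_INTEGRATION_KEY", "Disk usage above 95% on db-01", "db-01", dedup_key="db-01/disk"
)
print(json.dumps(example, indent=2))
```

Calling `send_event(example)` would submit the event; it is left out of the sketch so that running it does not open a real incident.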

How users respond to PagerDuty incidents

Generally, users will acknowledge the PagerDuty notification and begin investigating the problem. Once the problem is resolved, the monitoring tool will send a "resolve" event back to PagerDuty and automatically resolve the incident. 
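
To make the "resolve" event concrete, the sketch below builds the follow-up message a monitoring tool might send once the problem clears, again assuming the Events API v2. The function name and the placeholder values are hypothetical; the key point is that the follow-up reuses the `dedup_key` of the original trigger so that it is matched to the same incident.

```python
import json

FOLLOW_UP_ACTIONS = {"acknowledge", "resolve"}


def build_follow_up_event(routing_key, dedup_key, action="resolve"):
    """Return the body of an "acknowledge" or "resolve" event for an open alert."""
    if action not in FOLLOW_UP_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    if not dedup_key:
        # Without the original dedup_key the event cannot be matched to an alert.
        raise ValueError("dedup_key is required for follow-up events")
    return {
        "routing_key": routing_key,
        "event_action": action,
        "dedup_key": dedup_key,
    }


# Placeholder key and dedup_key, for illustration only.
print(json.dumps(build_follow_up_event("YOUR_INTEGRATION_KEY", "db-01/disk"), indent=2))
```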

Depending on the complexity of the incident, a user may also start a conference bridge, add responders, or add stakeholders to the incident.

Workflow for a Central Ops organizational structure

In a Central Operations (Central Ops) organization, the Operations team owns uptime and reliability and is the first level on-call for all incidents. They are the first line of defense charged with fixing issues and triaging incidents to the engineering team if their remediation efforts require escalations.

  • PagerDuty incidents are immediately assigned to the Operations group, who will begin investigating and fixing the issue
  • The Operations group has their own internal escalation process but also escalates incidents by reassigning them outside of their group to the appropriate engineering team as needed

How PagerDuty incidents are triggered

PagerDuty incidents are triggered by a monitoring tool, event manager tool, or ticketing tool.

How users respond to PagerDuty incidents

The Operations group will acknowledge the PagerDuty incident and begin remediation efforts. They will either resolve the PagerDuty incident manually or let the monitoring/ticketing tool automatically resolve the PagerDuty incident for them. 

If they are not able to resolve the incident, they will reassign the PagerDuty incident to the engineering team who should own the incident OR they will add the engineering teams as responders to the incident to begin collaborating on the remediation efforts.

Workflow for a NOC

The NOC is responsible for monitoring, investigating, fixing, and triaging all events. If the NOC needs to notify a team on-call to help fix or investigate an issue, they will use PagerDuty to automate the notification and escalation process and to also identify who is currently working on the issue.

  • NOC detects an issue that needs to be escalated
  • NOC opens an incident in PagerDuty which is then assigned to the appropriate team

How PagerDuty incidents are triggered

PagerDuty incidents are generally triggered manually by the NOC via email, the website, the API, or chat:

  • Email
    Send an email to a PagerDuty email integration address to trigger an incident.
  • Website
    Click on a button in the PagerDuty web app to manually open an incident. The NOC can either open an incident directly for a team OR open an incident. 
  • API
    Configure a button on your wiki page that, when clicked, triggers a PagerDuty incident.
  • Chat
    Type in a command in Slack or HipChat to trigger an incident for the on-call team.
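
Of the options above, the email path is the easiest to script. The sketch below composes such a message with Python's standard library. The integration address, sender, subject, and body are hypothetical placeholders, and the function name is illustrative; the real address comes from the email integration configured on your PagerDuty service.

```python
from email.message import EmailMessage


def build_incident_email(integration_address, sender, subject, body):
    """Compose an email that, once delivered, would open an incident."""
    message = EmailMessage()
    message["To"] = integration_address
    message["From"] = sender
    message["Subject"] = subject
    message.set_content(body)
    return message


# Hypothetical addresses and text, for illustration only.
email = build_incident_email(
    "noc-escalations@example.pagerduty.com",
    "noc@example.com",
    "Packet loss on core-router-2",
    "NOC has confirmed packet loss. Network on-call, please investigate.",
)
print(email)
```

To deliver the message, pass it to `smtplib.SMTP(...).send_message(email)` using your own mail server.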

Read more about these NOC workflows

How users respond to PagerDuty incidents

Since the NOC is usually triggering incidents manually, it is up to the teams being assigned to these incidents to either acknowledge the incident (if they are directly assigned to the incident) or join/decline the incident (if the teams are being added as responders).

If the team is directly assigned to the incident, then they will manually resolve it in PagerDuty. If the team is added as a responder to the incident, then the NOC will manually resolve the incident. 

Workflow for a support/helpdesk/MSP team

These teams use PagerDuty to receive notifications for urgent tickets submitted by internal or external customers. Tickets submitted by external customers might come from high-value customers, from customers that have an agreed SLA, or from any customer that identifies an issue as high priority.

 

How PagerDuty incidents are triggered

PagerDuty incidents are triggered manually or automatically:

  • Manually
    A customer sends an email to a designated email address, which triggers a PagerDuty incident.

  • Automatically
    Ticketing systems like JIRA, Zendesk, ServiceNow, etc. can be configured to automatically trigger a PagerDuty incident when certain conditions of the ticket are met (e.g., Priority level = P1).
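
The automatic condition can be thought of as a small rule evaluated against each ticket. The sketch below illustrates that logic only; the ticket fields, the rule format, and the `should_trigger` function are hypothetical and do not reflect how any particular ticketing integration is configured.

```python
def should_trigger(ticket, rules):
    """Return True when the ticket matches every field/value pair in the rules."""
    return all(ticket.get(field) == expected for field, expected in rules.items())


# Hypothetical rule: only P1 tickets open a PagerDuty incident.
P1_RULE = {"priority": "P1"}

tickets = [
    {"id": "T-1", "priority": "P1", "summary": "Checkout is down"},
    {"id": "T-2", "priority": "P3", "summary": "Typo on the pricing page"},
]

for ticket in tickets:
    action = "trigger incident" if should_trigger(ticket, P1_RULE) else "no incident"
    print(ticket["id"], "->", action)
```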

How users respond to PagerDuty incidents

If an incident is triggered manually, then the on-call user will acknowledge the incident to stop the escalation process and begin investigating the problem. The on-call team will then manually resolve the incident to indicate that they have resolved the customer's problem.

If an incident is triggered automatically, then the team assigned to the incident will acknowledge it, work on the issue in their ticketing system, and then let the PagerDuty incident automatically resolve itself through the integration. For example, if integrating with JIRA, the PagerDuty incident will automatically resolve once the JIRA ticket is set to "Done".
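
The integration handles this status mapping for you, but the idea can be sketched as a lookup from ticket status to event action. Everything in the sketch is an illustrative assumption: the function name, the use of the ticket key as the `dedup_key`, and the Events API v2 style event body. Only the "Done" to resolve mapping comes from the example above.

```python
# Only the "Done" -> "resolve" mapping is taken from the example above;
# other statuses could be added to the table in the same way.
STATUS_TO_ACTION = {"Done": "resolve"}


def event_for_status_change(routing_key, ticket_key, new_status):
    """Return the event to send for a ticket status change, or None if none applies."""
    action = STATUS_TO_ACTION.get(new_status)
    if action is None:
        return None
    return {
        "routing_key": routing_key,
        "event_action": action,
        # Assumption: the ticket key was used as dedup_key when the incident was triggered.
        "dedup_key": ticket_key,
    }


# Placeholder key and ticket key, for illustration only.
print(event_for_status_change("YOUR_INTEGRATION_KEY", "SUP-42", "Done"))
print(event_for_status_change("YOUR_INTEGRATION_KEY", "SUP-42", "In Review"))
```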
