Global Alert Grouping
Combine similar alerts into a single incident to reduce notification noise and provide more context across multiple services
Global Alert Grouping reduces noise by using Content-Based Alert Grouping to group alerts across multiple technical services. Content-Based Alert Grouping identifies alerts that share an exact match on a set of chosen fields and groups them together. Grouped alerts mean fewer incidents and interruptions for responders, richer incident context, and lower resolution times.
Availability
Global Alert Grouping is available with our PagerDuty AIOps add-on. The feature is also available for the duration of an AIOps Trial. Please contact our Sales team to upgrade to a pricing plan with this feature.
Required User Permissions
Users with the following roles can edit a service’s Alert Grouping settings:
- Account Owner
- Admin and Global Admin
- User
- Manager base role and Team roles
- Manager Team roles can only configure the services for which they are assigned a Manager role.
Enable Global Alert Grouping
Global Alert Grouping leverages Content-Based Alert Grouping to enable alert grouping across technical services. Please read Content-Based Alert Grouping for a deeper understanding of that feature.
Data Format Requirement
Content-Based Alert Grouping does not support email integrations at this time. Data should be formatted in PagerDuty's Common Event Format (PD-CEF).
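As an illustration of the format requirement, here is a minimal sketch of building a trigger event whose payload uses PD-CEF field names (summary, source, severity, timestamp, component, group, class, custom_details). The helper name `build_pd_cef_event`, the placeholder routing key, and the sample values are assumptions made for this example and do not come from this article.

```python
import json
from datetime import datetime, timezone


def build_pd_cef_event(routing_key, summary, source, severity, **optional):
    """Build a trigger event whose payload uses PD-CEF field names."""
    allowed_severities = {"critical", "error", "warning", "info"}
    if severity not in allowed_severities:
        raise ValueError(f"severity must be one of {sorted(allowed_severities)}")
    payload = {
        "summary": summary,
        "source": source,
        "severity": severity,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # Optional PD-CEF fields are included only when the caller supplies them.
    for key in ("component", "group", "class", "custom_details"):
        if key in optional:
            payload[key] = optional[key]
    return {"routing_key": routing_key, "event_action": "trigger", "payload": payload}


event = build_pd_cef_event(
    routing_key="YOUR_INTEGRATION_KEY",  # placeholder, not a real key
    summary="High CPU on web-01",
    source="web-01.example.com",
    severity="critical",
    component="nginx",
    group="prod-web",
    **{"class": "cpu"},  # "class" is a Python keyword, so it is passed via a dict
)
print(json.dumps(event, indent=2))
```

Fields such as class, component, and group are typical candidates for the matching criteria, since they tend to carry the same value across related alerts from different services.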
To enable Global Alert Grouping:
- In the web app, navigate to Services > Service Directory and select the name of your desired service.
- Select the Settings tab and click Edit next to Reduce Noise.
- In the dropdown Services to reduce noise across, select two or more services whose alerts you'd like to group together when an incident triggers.
- Note: A service cannot be part of more than one multiservice group.
- With Alert Content selected, make a selection in the dropdown to determine whether to match on Any or All fields.
- Make a selection in the Select Fields dropdown to determine which field you'd like to match on. Optionally, click Add Field to add more matching criteria.
- You can click See Recent Alerts to open a pane of recent alerts from selected services. Click an alert's title to open its payload, and click a key/value pair to automatically update the matching criteria.
- Make a selection in the dropdown to determine how long the rolling grouping window should be. See the section Flexible Time Window below for more information.
- Click Save Settings.
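The Any/All choice in the steps above can be pictured with a small sketch. This is not PagerDuty's implementation. The function names, the dotted-path lookup, and the rule that a field missing from either alert never matches are all assumptions made for illustration.

```python
def get_field(alert, path):
    """Return the value at a dotted path such as 'custom_details.cluster', or None."""
    value = alert
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def alerts_match(new_alert, grouped_alert, fields, mode="all"):
    """Exact-match comparison on the chosen fields; mode is 'any' or 'all'."""
    if mode not in ("any", "all"):
        raise ValueError("mode must be 'any' or 'all'")
    if not fields:
        return False  # nothing selected, so nothing to match on
    results = []
    for path in fields:
        a = get_field(new_alert, path)
        b = get_field(grouped_alert, path)
        # Assumption: a field missing from either alert never counts as a match.
        results.append(a is not None and a == b)
    return any(results) if mode == "any" else all(results)


alert_a = {"source": "db-01", "class": "disk", "custom_details": {"cluster": "eu-1"}}
alert_b = {"source": "db-02", "class": "disk", "custom_details": {"cluster": "eu-1"}}

print(alerts_match(alert_a, alert_b, ["source", "class"], mode="all"))  # False
print(alerts_match(alert_a, alert_b, ["source", "class"], mode="any"))  # True
print(alerts_match(alert_a, alert_b, ["class", "custom_details.cluster"], mode="all"))  # True
```

Matching on All fields is the stricter setting. Any groups more aggressively, which reduces noise further but raises the chance of grouping unrelated alerts.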
You can review whether a service is taking part in Global Alert Grouping by going to its details page in the Service Directory. In the Reduce Noise section, you will see Global Alert Grouping is active, along with any other services that are taking part.
Flexible Time Window
You can configure the grouping time window as part of the Global Alert Grouping settings. The window can be set to between 5 and 60 minutes. It is a rolling window, counted from the most recently grouped alert, so it extends each time an alert is grouped. Grouping can continue this way for up to 24 hours, or until the incident is resolved. An alert that arrives after 24 hours triggers a new incident.
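The rolling window can be sketched as a simple check. This is an illustrative model only. The function name `can_group`, the assumption that the 24-hour cap is measured from the incident's first alert, and the inclusive boundaries are all assumptions that this article does not specify.

```python
from datetime import datetime, timedelta

MAX_GROUPING_SPAN = timedelta(hours=24)


def can_group(alert_time, first_alert_time, last_grouped_time, window_minutes, resolved=False):
    """Decide whether an incoming alert still falls inside the rolling window."""
    if not 5 <= window_minutes <= 60:
        raise ValueError("window_minutes must be between 5 and 60")
    if resolved:
        return False  # grouping stops once the incident is resolved
    # The window is counted from the most recently grouped alert.
    within_window = alert_time - last_grouped_time <= timedelta(minutes=window_minutes)
    # Assumption: the 24-hour cap is measured from the incident's first alert.
    within_cap = alert_time - first_alert_time <= MAX_GROUPING_SPAN
    return within_window and within_cap


start = datetime(2024, 1, 1, 12, 0)
minutes = lambda n: start + timedelta(minutes=n)

print(can_group(minutes(10), start, start, 15))        # True: 10 min after the first alert
print(can_group(minutes(24), start, minutes(10), 15))  # True: 14 min after the last grouped alert
print(can_group(minutes(30), start, minutes(10), 15))  # False: 20 min gap exceeds the window
print(can_group(minutes(1445), start, minutes(1435), 15))  # False: past the 24-hour cap
```

The second call shows the rolling behavior: the alert arrives 24 minutes after the incident opened, but only 14 minutes after the last grouped alert, so it is still grouped.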
View Global Alert Grouping On an Incident
The primary indicator that an incident has alerts grouped from other services is the Multiservice Group pill, along with the description of how many alerts have been grouped, at the top of the incident's details page.
You can also view Global Alert Grouping on an incident's details page in the following places:
- The Impacted Services field will show the incident’s source service, along with the name(s) of any other service(s) whose alerts have been grouped into the incident.
- With the Alerts tab selected, users can also see which service an alert originated on.
- An incident's Timeline will show an entry when Global Alert Grouping adds an alert from another service.
Best Practices
To get the most out of Global Alert Grouping, we recommend the following:
- Make sure that responders on all associated Teams and escalation policies know that their services are part of a Global Alert Grouping configuration.
- Ensure that on-call responders evaluate all alerts, including alerts that originated on other services.
- If an on-call responder is unsure whether an alert belongs in an incident on their assigned service, they should either add a responder from the originating service to review the alert, or move the alert to create a new incident on the originating service. Creating a new incident on the originating service notifies that service's on-call responder.
- Give users a base role of Responder or higher for services that are within a multiservice group. Additionally, refrain from using Advanced Permissions to restrict users' access to services that are part of a multiservice group (e.g., restricting users from viewing incidents on a specific service).
FAQ
What service does an incident get assigned to when grouping across services?
Global Alert Grouping will group subsequent alerts into the incident on the service that received the initial alert. Global Alert Grouping will continue grouping matched alerts based on the configured criteria and rolling time window, or until the incident is resolved.
Which escalation policy will be assigned to the incident?
PagerDuty will use the escalation policy from the service where the first alert triggers.
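The two answers above can be modeled in a few lines. This is a simplified sketch that assumes the alerts have already been matched by the grouping criteria. The `Incident` class, the `open_or_group` helper, and the service and escalation policy names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    service: str
    escalation_policy: str
    alerts: list = field(default_factory=list)

    @property
    def impacted_services(self):
        # Source service first, then other originating services in arrival order.
        seen = [self.service]
        for alert in self.alerts:
            if alert["service"] not in seen:
                seen.append(alert["service"])
        return seen


def open_or_group(incident, alert, policies):
    """Open an incident on the first alert's service, or group into the open one."""
    if incident is None:
        # The first alert decides both the service and the escalation policy.
        incident = Incident(alert["service"], policies[alert["service"]])
    incident.alerts.append(alert)
    return incident


policies = {"checkout": "Payments EP", "database": "DBA EP"}  # hypothetical names
incident = None
for alert in [
    {"service": "checkout", "summary": "5xx spike"},
    {"service": "database", "summary": "connection pool exhausted"},
]:
    incident = open_or_group(incident, alert, policies)

print(incident.service)            # checkout
print(incident.escalation_policy)  # Payments EP
print(incident.impacted_services)  # ['checkout', 'database']
```

Each alert keeps its originating service even though the incident, and therefore the notification path, belongs to the service that received the first alert.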
How do I bring in other responders?
You have multiple options:
- The on-call responder can re-assign an incident to other users or escalation policies.
- The on-call responder for that incident can add responders.
- The on-call responder can execute an Incident Workflow that adds responders.
Where can I check if an alert from my service has been grouped into an incident on a different service?
Grouped alerts maintain their originating service, so your team’s alerts will be visible in the Alerts Table when the My Teams filter is selected. They will not appear on the Incidents page when My Teams is selected.
In the Service Directory, the multiservice incident appears only under the source service (i.e., the service that received the first alert). This means that a single technical service will show as impacted on your status page. If you have Business Services configured, and there is a priority assigned to the incident, then the business service will also display as impacted.
We are currently evaluating an area on the service directory page to show multiservice incidents when a service has an alert assigned to a multiservice incident.
How do I move an alert that was grouped into an incident to another service?
Read Move Alerts to Another Incident for more information about merging alerts, moving alerts to a new service, and using a grouped alert to trigger a new incident.
What is the maximum number of services that can be in a multiservice group?
You can include up to 250 services in a multiservice group.
Can a service have more than one type of Alert Grouping enabled at a time?
We do not allow a service to have more than one type of Alert Grouping enabled at a time, or to be part of more than one multiservice group, due to the potential conflict between grouping types. If two alerts matched more than one grouping type, PagerDuty would not know which setting should take precedence.
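If you keep a planned grouping layout in code or configuration before applying it, the two constraints described in this FAQ (at most 250 services per multiservice group, and no service in more than one group) can be checked up front. The sketch below is a hypothetical pre-flight check, not a PagerDuty API. The function name, group names, and service IDs are made up.

```python
MAX_SERVICES_PER_GROUP = 250


def validate_groups(groups):
    """groups maps a group name to a list of service IDs; returns a list of problems."""
    problems = []
    owner = {}  # service ID -> name of the first group that claimed it
    for name, services in groups.items():
        unique = sorted(set(services))
        if len(unique) > MAX_SERVICES_PER_GROUP:
            problems.append(
                f"{name}: {len(unique)} services exceeds the limit of {MAX_SERVICES_PER_GROUP}"
            )
        for service in unique:
            if service in owner:
                problems.append(f"{service}: already in '{owner[service]}', cannot join '{name}'")
            else:
                owner[service] = name
    return problems


planned = {
    "web-tier": ["PWEB001", "PAPI001"],
    "data-tier": ["PDB0001", "PAPI001"],  # PAPI001 appears twice across groups
}
for problem in validate_groups(planned):
    print(problem)
```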
What is the difference between “grouping” and “merging” an alert?
Grouping is an automated process: our alert grouping features combine alerts into a single incident based on criteria set by a user.
Merging is a manual process: a user merges an incident or alert into a new or existing incident.
How does Global Alert Grouping interact with Event Orchestration?
Event Orchestration is a powerful tool that can manipulate an event's payload before routing it to a service. As their names imply, Event Orchestration acts at the event level (i.e., upstream from Global Alert Grouping), while Global Alert Grouping acts on alerts (i.e., downstream from Event Orchestration). This means that Global Alert Grouping will evaluate an alert as it's received from Event Orchestration, after any transformations in Global Orchestrations or Service Orchestrations have taken place.
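The ordering matters in practice: a transformation applied upstream can make two alerts match that would otherwise differ. The following sketch uses a stand-in normalization function in place of an orchestration rule. Both functions and the sample data are hypothetical and do not represent either feature's actual behavior.

```python
def normalize_event(event):
    """Stand-in for an orchestration rule: copy the event and tidy its 'class' field."""
    transformed = dict(event)
    if "class" in transformed:
        transformed["class"] = transformed["class"].strip().lower()
    return transformed


def same_group(alert_a, alert_b, fields):
    """Exact match on every chosen field, evaluated on whatever the alert contains."""
    return all(f in alert_a and alert_a.get(f) == alert_b.get(f) for f in fields)


raw_a = {"source": "api-01", "class": "Latency "}
raw_b = {"source": "api-02", "class": "latency"}

# Without the upstream transformation, the exact match fails.
print(same_group(raw_a, raw_b, ["class"]))  # False

# Grouping evaluates the alerts as they arrive after transformation.
print(same_group(normalize_event(raw_a), normalize_event(raw_b), ["class"]))  # True
```

Because grouping relies on exact matches, normalizing field values upstream is one way to make matching criteria behave consistently across services that format their payloads differently.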
Can I use the REST API to configure Global Alert Grouping?
Yes. The endpoints for configuring Global Alert Grouping are documented in our Developer Docs.