Global Alert Grouping reduces noise by using Content-Based Alert Grouping to group alerts across multiple technical services. Content-Based Alert Grouping identifies alerts that share an exact match on a set of chosen fields and groups them together. Grouped alerts mean fewer incidents and interruptions for responders, richer incident context, and lower resolution times.
Global Alert Grouping is available with our PagerDuty AIOps add-on. The feature is also available for the duration of an AIOps trial. Please contact our Sales team to upgrade to a pricing plan with this feature.
Required User Permissions
Users with the following roles can edit a service’s Alert Grouping settings:
- Account Owner
- Admin and Global Admin
- Manager base role and Team roles
- Manager Team roles can only configure the services for which they are assigned a Manager role.
Global Alert Grouping leverages Content-Based Alert Grouping to enable grouping across services. Please read Content-Based Alert Grouping for a deeper understanding of that feature.
Data Format Requirement
Content-Based Alert Grouping does not support email integrations at this time. Data should be formatted in PagerDuty's Common Event Format (PD-CEF).
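As an illustration of what PD-CEF-shaped data can look like, here is a minimal sketch that builds a trigger event body. It assumes events are sent through the Events API v2, which this article does not cover. The helper name `build_pd_cef_event`, the placeholder integration key, and the sample field values are hypothetical. The sketch only builds and prints the body; it does not send anything.

```python
import json

# PD-CEF severities and optional payload fields accepted by the Events API v2.
VALID_SEVERITIES = {"critical", "error", "warning", "info"}
OPTIONAL_FIELDS = {"timestamp", "component", "group", "class", "custom_details"}


def build_pd_cef_event(routing_key, summary, source, severity, optional=None):
    """Build a 'trigger' event body whose payload uses PD-CEF field names.

    Hypothetical helper for illustration; it performs no network calls.
    """
    if severity not in VALID_SEVERITIES:
        raise ValueError(f"severity must be one of {sorted(VALID_SEVERITIES)}")
    optional = optional or {}
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown PD-CEF fields: {sorted(unknown)}")

    # summary, source, and severity are the required payload fields.
    payload = {"summary": summary, "source": source, "severity": severity}
    payload.update(optional)
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": payload,
    }


event = build_pd_cef_event(
    "YOUR_INTEGRATION_KEY",  # placeholder, not a real key
    "Disk usage above 90% on db-01",
    "db-01.example.com",
    "critical",
    {"component": "postgres", "group": "prod-datastore", "class": "disk"},
)
print(json.dumps(event, indent=2))
```

Fields such as `component`, `group`, and `class` are the kind of values you might later choose as matching criteria.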
To enable Global Alert Grouping:
- In the web app, navigate to Services > Service Directory and select the name of your desired service.
- Select the Settings tab and click Edit next to Reduce Noise.
- In the dropdown Services to reduce noise across, select two or more services whose alerts you'd like to group together when an incident triggers.
- Note: A service cannot be part of more than one multiservice group.
- Enable the toggle Group similar alerts on this service so that only one notification is sent.
- Make a selection in the dropdown to determine whether to match on Any or All fields.
- Under Match alerts based on, make a selection in the dropdown to determine which field you'd like to match on. Optionally, click Add Field to add more matching criteria.
- You can also click See Recent Alerts to open a pane of recent alerts from selected services. Click an alert's title to open its payload, and click a key/value pair to automatically update the matching criteria.
- Make a selection in the dropdown to determine how long the rolling grouping window should be. See the section Flexible Time Window below for more information.
- Click Save Settings.
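The Any/All choice in the steps above can be pictured with a small sketch. This is not PagerDuty's implementation. It assumes alerts are plain dictionaries, that fields are addressed with dotted paths, and that a field missing from either alert never counts as a match. The names `get_field` and `alerts_match` are hypothetical.

```python
def get_field(alert, path):
    """Look up a dotted path such as 'custom_details.cluster' in a nested dict."""
    value = alert
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def alerts_match(alert_a, alert_b, fields, mode="all"):
    """Return True if two alerts share exact values on the chosen fields.

    mode="all": every field must match exactly.
    mode="any": at least one field must match exactly.
    Assumption: a field missing from either alert is treated as a non-match.
    """
    if not fields:
        raise ValueError("at least one field is required")
    results = []
    for field in fields:
        a_val = get_field(alert_a, field)
        b_val = get_field(alert_b, field)
        results.append(a_val is not None and a_val == b_val)
    if mode == "all":
        return all(results)
    if mode == "any":
        return any(results)
    raise ValueError("mode must be 'all' or 'any'")


a = {"source": "db-01", "component": "postgres"}
b = {"source": "db-02", "component": "postgres"}
print(alerts_match(a, b, ["source", "component"], mode="all"))  # False
print(alerts_match(a, b, ["source", "component"], mode="any"))  # True
```

Matching on All fields is stricter and produces smaller groups. Matching on Any field groups more aggressively, so it suits fields that reliably identify the same underlying problem.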
You can review whether a service is taking part in Global Alert Grouping by going to its details page in the Service Directory. In the Reduce Noise section, you will see Global Alert Grouping is active, along with any other services that are taking part.
You can configure the grouping time window as part of the Global Alert Grouping setting. The time window can be between five minutes and one hour. The time window is a rolling window and counted from the most recently grouped alert. The window extends each time an alert is grouped, up to 24 hours, or until the incident is resolved. If an alert comes in after 24 hours, it will trigger a new incident.
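To make the rolling window concrete, the sketch below assigns a sorted series of alert timestamps to incidents. It is an illustration under assumptions, not the product's logic: every alert is assumed to match the grouping criteria, the incident is never resolved, an alert arriving exactly at the edge of the window still groups, and the 24-hour limit is measured from the incident's first alert. The function name `assign_incidents` is hypothetical.

```python
from datetime import datetime, timedelta

MAX_GROUPING_SPAN = timedelta(hours=24)  # upper limit stated in this article


def assign_incidents(alert_times, window):
    """Map each alert time (sorted ascending) to an incident index.

    Assumptions: all alerts match the criteria and no incident is resolved.
    """
    if not timedelta(minutes=5) <= window <= timedelta(hours=1):
        raise ValueError("window must be between five minutes and one hour")

    incidents = []
    first = last = None
    current = -1
    for t in alert_times:
        starts_new = (
            first is None
            or t - last > window  # rolling window since the last grouped alert expired
            or t - first > MAX_GROUPING_SPAN  # 24-hour cap reached
        )
        if starts_new:
            current += 1
            first = t
        last = t  # the window is counted from the most recently grouped alert
        incidents.append(current)
    return incidents


start = datetime(2024, 1, 1, 9, 0)
times = [start + timedelta(minutes=m) for m in (0, 10, 24, 60)]
print(assign_incidents(times, timedelta(minutes=15)))  # [0, 0, 0, 1]
```

In this example the third alert arrives 24 minutes after the first but only 14 minutes after the second, so it still groups. The fourth alert arrives 36 minutes after the third and starts a new incident.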
A primary indicator that an incident has alerts grouped from other services is the Multiservice Group pill, along with the description of how many alerts have been grouped, at the top of the incident's details page.
You can also view Global Alert Grouping on an incident's details page in the following places:
- The Impacted Services field will show the incident’s source service, along with the name(s) of any other service(s) whose alerts have been grouped into the incident.
- With the Alerts tab selected, users can also see which service an alert originated on.
- An incident's Timeline will show an entry when Global Alert Grouping adds an alert from another service.
To get the most out of Global Alert Grouping, we recommend the following:
- Make sure that responders on all associated Teams and escalation policies know that their services are part of a Global Alert Grouping configuration.
- Ensure that on-call responders evaluate all alerts, including alerts that originated on other services.
- If an on-call responder is unsure whether an alert belongs to an incident on their assigned service, they should either add a responder from the originating service to review the alert, or move the alert to create a new incident on the originating service. Creating a new incident on the originating service notifies that service's on-call responder.
- We recommend that users have a base role of Responder or higher for services that are within a multiservice group. Additionally, refrain from using Advanced Permissions to restrict users' access to services that are included in a multiservice group (for example, restricting users from viewing incidents on a specific service via Advanced Permissions).
Global Alert Grouping will group subsequent alerts into the incident on the service that received the initial alert. Global Alert Grouping will continue grouping matched alerts based on the configured criteria and rolling time window, or until the incident is resolved.
PagerDuty will use the escalation policy from the service where the first alert triggers.
You have multiple options:
- The on-call responder can re-assign an incident to other users or escalation policies.
- The on-call responder for that incident can add responders.
- The on-call responder can execute an Incident Workflow.
Where can I check if an alert from my service has been grouped into an incident on a different service?
Alerts will maintain their origin service, so your team’s alerts will be visible on the Alerts page when the My Teams filter is selected. They will not appear on the Incidents page when My Teams is selected.
On the Service Directory page, the multi-service incident will only be visible on the source service (i.e., the service that received the first alert). This means that a single technical service will show as impacted on your status page. If you have Business Services configured, and there is a priority assigned to the incident, then the business service will also display as impacted.
We are currently evaluating an area on the Service Directory page to show multi-service incidents when a service has an alert assigned to a multi-service incident.
During Early Access, you can include up to 250 services in a multiservice group.
We do not allow a service to have more than one type of Alert Grouping enabled at a time, or to be part of more than one multiservice group, due to the potential conflict between grouping types. If two alerts matched more than one grouping type, PagerDuty would not know which setting should take precedence.
Grouping is an automated process: our alert grouping features combine alerts into a single incident based on criteria set by a user.
Merging is a manual process, performed by a user, in which an [incident](/docs/edit-incidents#merge-incidents) or alert is merged into a new or existing incident.
Event Orchestration is a powerful tool that can manipulate an event's payload before routing it to a service. As their names imply, Event Orchestration acts at the event level (i.e., upstream from Global Alert Grouping), while Global Alert Grouping acts on alerts (i.e., downstream from Event Orchestration). This means that Global Alert Grouping will evaluate an alert as it's received from Event Orchestration, after any transformations in Global Orchestrations or Service Orchestrations have taken place.
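The ordering described above can be sketched as a two-stage pipeline. This is a simplified illustration, not how either feature is implemented. The `orchestrate` function stands in for a hypothetical orchestration rule that normalizes the `component` field, and `same_group` stands in for an exact-match grouping check on one field.

```python
def orchestrate(event):
    """Hypothetical upstream rule: trim and lowercase the 'component' field."""
    transformed = dict(event)  # copy so the raw event is left untouched
    transformed["component"] = event.get("component", "").strip().lower()
    return transformed


def same_group(alert_a, alert_b, field):
    """Stand-in for a downstream exact-match grouping check on one field."""
    return alert_a.get(field) == alert_b.get(field)


raw_a = {"summary": "Replication lag", "component": "Postgres "}
raw_b = {"summary": "Connection errors", "component": "postgres"}

# Compared as sent, the raw values differ, so an exact match fails.
print(same_group(raw_a, raw_b, "component"))  # False

# Grouping sees the transformed values, which now match exactly.
print(same_group(orchestrate(raw_a), orchestrate(raw_b), "component"))  # True
```

Because matching is exact, normalizing inconsistent field values upstream is one way to make alerts from different sources eligible for the same group.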