Global Alert Grouping

Global Alert Grouping reduces noise by grouping alerts across more than one technical service. It lets you select the scope of services to evaluate for alert grouping, and supports the following methods:

  • Intelligent
  • Alert Content
  • Intelligent + Alert Content

📘

AIOps Feature

This feature is included with the PagerDuty AIOps add-on. If you would like to sign up for a trial of PagerDuty AIOps features, please read PagerDuty AIOps Trials.

Your service configuration must have AIOps enabled in order to use this feature. AIOps Service Configuration is in Limited General Availability. Please refer to Configurable Service Settings for more information and enablement steps.

🚧

Required Permissions

If you have one of the following roles, you can edit a service’s alert grouping settings:

  • Account Owner
  • Admin and Global Admin
  • User
  • Manager base role and Team roles
    • Manager Team roles can only configure the services for which they are assigned a Manager role.

Enable Global Alert Grouping

Global Alert Grouping groups alerts across multiple technical services.

  1. In the PagerDuty web app, navigate to Services > Service Directory and click the name of your desired service.
  2. Select the Settings tab and click Edit next to Reduce Noise.
  3. Select one of the following options and configure your preferences:
    • Intelligent
    • Alert Content
    • Intelligent + Alert Content
  4. Click Save.

Intelligent

Selecting Intelligent enables Intelligent Alert Grouping on your service. There are differences in functionality depending on whether you select more than one service. The information in this section is relevant when you select more than one service. Read Intelligent Alert Grouping if you select a single service.

Advanced Options

Advanced Options allow you to customize which alert fields the grouping model analyzes when determining textual similarities and reset the grouping model's learner cache.

Configure Grouping Fields

By default, Intelligent Alert Grouping uses the summary field of alerts to determine textual similarity. Depending on your data, however, you can use alternate fields to determine similarity.
In the PagerDuty web app:

  1. While configuring Intelligent Alert Grouping, click the Advanced Options dropdown.
  2. In the Select Fields section, use the dropdown to select a field that you want the Intelligent Alert Grouping model to consider for textual similarity.
    1. If you select Custom Details, enter a Custom field name.
  3. Optional: Click Add Field and select another field. Note: You can repeat this step as needed to select up to five fields.
  4. At the bottom of the page, click Save.
📘

Considerations

Summary is the only field that PagerDuty requires, so it is the only field guaranteed to contain data. If you do not select Summary as one of the fields, every selected field can be blank. In that case, alert grouping does not occur because there is nothing to compare.

When a field selected as part of the Advanced Options configuration is blank, the system does not analyze or consider that field for Intelligent Alert Grouping. The grouping model only considers the fields with available data.

The model analyzes the selected fields for textual similarity; it does not require an exact match between fields. To specify exact matching criteria, use the Intelligent + Alert Content option.

The maximum number of characters for all selected fields is 1000.
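
The considerations above can be sketched in a few lines of Python. This is only an illustration, assuming alerts are plain dictionaries; `usable_fields` and the sample field names are hypothetical and are not part of any PagerDuty API.

```python
MAX_SELECTED_FIELDS = 5  # the article allows up to five selected fields


def usable_fields(alert: dict, selected: list) -> dict:
    """Return only the selected fields that hold text in this alert.

    Hypothetical helper: blank or missing fields are skipped, mirroring the
    statement that the model only considers fields with available data.
    """
    if len(selected) > MAX_SELECTED_FIELDS:
        raise ValueError("select at most five fields")
    result = {}
    for name in selected:
        value = alert.get(name)
        if isinstance(value, str) and value.strip():
            result[name] = value
    return result


alert = {"summary": "CPU high on web-01", "source": "", "component": "nginx"}

# The blank "source" field is dropped; the other two remain.
print(usable_fields(alert, ["summary", "source", "component"]))

# Summary was not selected and every selected field is blank or missing,
# so nothing is left to compare and no grouping could occur.
print(usable_fields(alert, ["source", "class"]))
```

An empty result corresponds to the case where Summary is not selected and all chosen fields are blank.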

Reset Learner Cache

This option allows you to reset the grouping model's learner cache. This is useful if the structure of a service's alert data changes significantly, or after a period of testing.
In the PagerDuty web app:

  1. While configuring Intelligent Alert Grouping, click the Advanced Options dropdown.
  2. Select the Reset learner cache checkbox.
  3. At the bottom of the page, click Save.

Algorithm Behavior

The Intelligent Alert Grouping algorithm observes real-time alert data and incident history, and adapts as new alerts trigger on a service. After you have enabled Intelligent Alert Grouping on a service, no explicit configuration is required, though you can optionally configure the Flexible Time Window.

Intelligent Alert Grouping groups an alert into an existing incident when the model detects the following criteria. The reason for grouping each alert appears in the incident timeline.

  • Textual similarity: The Intelligent Alert Grouping algorithm deems the alerts similar based on alert title similarity.
  • Past co-occurrences: The Intelligent Alert Grouping model detects a high rate of co-occurrence between alerts.
  • Learning from prior incident merges: You have merged prior similar alerts and the algorithm learns from this behavior.

Alerts that do not meet these criteria are not grouped; each one triggers a new incident. When a Global Alert Grouping setting is enabled, Intelligent Alert Grouping evaluates alerts from all services included in that setting, and groups alerts from one or more services into the same incident when the criteria above are met. New alerts group under the first created incident and notifications go to that incident's responder.

The algorithm also reacts to feedback from you and your team. The best way for the algorithm to learn and adapt to new grouping behaviors is to manually merge related incidents, and to manually move alerts to a different incident when they are not related. For more information about moving alerts from one incident to another, see Merge Incidents. You can also automatically update alert titles using Event Orchestration, which influences the algorithm.

📘

Tip

Merging and unmerging alerts through the REST API does not affect the Intelligent Alert Grouping algorithm. Only manual merges and unmerges in the PagerDuty web app influence the algorithm.

Alert Content

Selecting Alert Content enables Content-Based Alert Grouping on your service. Content-Based Alert Grouping enables customized alert grouping on services with predictable, homogeneous alert data, without the need to train an algorithm. With Content-Based Alert Grouping, alerts that share an exact match on a set of chosen fields group together into the most recent open incident. Grouped alerts mean fewer incidents and interruptions for responders, richer context on the incidents that trigger, and lower resolution times.

To learn more about the Alert Content option, read Content-Based Alert Grouping.
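
As a rough illustration of the exact-match behavior described above, the following Python sketch attaches each alert to the most recent open incident whose chosen field values are identical. The `Incident` class, `group_by_content` function, and field names are hypothetical, and the time window is left out to keep the example short; this is not PagerDuty's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    id: int
    key: tuple                      # values of the chosen fields
    is_open: bool = True
    alerts: list = field(default_factory=list)


def group_by_content(alert: dict, chosen: list, incidents: list) -> Incident:
    """Group the alert into the most recent open incident with the same key."""
    key = tuple(alert.get(name) for name in chosen)
    # Walk from newest to oldest so the most recent open match wins.
    for incident in reversed(incidents):
        if incident.is_open and incident.key == key:
            incident.alerts.append(alert)
            return incident
    # No exact match on the chosen fields: the alert triggers a new incident.
    new_incident = Incident(id=len(incidents) + 1, key=key, alerts=[alert])
    incidents.append(new_incident)
    return new_incident


incidents = []
chosen = ["source", "component"]
a = group_by_content({"source": "db-1", "component": "mysql", "summary": "slow query"}, chosen, incidents)
b = group_by_content({"source": "db-1", "component": "mysql", "summary": "replication lag"}, chosen, incidents)
c = group_by_content({"source": "db-2", "component": "mysql", "summary": "slow query"}, chosen, incidents)
print(a.id, b.id, c.id)  # 1 1 2
```

The second alert has a different summary but groups anyway, because only the chosen fields are compared.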

Intelligent + Alert Content

Selecting Intelligent + Alert Content enables Unified Alert Grouping on your service. Unified Alert Grouping combines Content-Based Alert Grouping and Intelligent Alert Grouping with a flexible time window for increased precision and correlation control. Unified Alert Grouping groups alerts when alert content matches and Intelligent Alert Grouping determines alerts are similar. Alerts group only when both conditions are satisfied.

To learn more about the Intelligent + Alert Content option, read Unified Alert Grouping.
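
The "both conditions" rule can be sketched as a logical AND of two checks. In this Python illustration, `difflib.SequenceMatcher` from the standard library stands in for the similarity model, which is an assumption made purely for demonstration; the real Intelligent Alert Grouping model is not described in this article. The function names and the 0.6 threshold are likewise hypothetical.

```python
from difflib import SequenceMatcher


def content_matches(a: dict, b: dict, fields: list) -> bool:
    """Exact match on every chosen field (the Alert Content condition)."""
    return all(a.get(f) == b.get(f) for f in fields)


def looks_similar(a: dict, b: dict, threshold: float = 0.6) -> bool:
    """Stand-in similarity check on the summary; NOT the real model."""
    ratio = SequenceMatcher(None, a.get("summary", ""), b.get("summary", "")).ratio()
    return ratio >= threshold


def should_group(a: dict, b: dict, fields: list) -> bool:
    # Alerts group only when both conditions are satisfied.
    return content_matches(a, b, fields) and looks_similar(a, b)


base = {"source": "db-1", "summary": "Disk usage above 90% on db-1"}
similar = {"source": "db-1", "summary": "Disk usage above 95% on db-1"}
unrelated = {"source": "db-1", "summary": "TLS certificate expires soon"}
other_source = {"source": "db-2", "summary": "Disk usage above 90% on db-2"}

print(should_group(base, similar, ["source"]))       # True: content and similarity agree
print(should_group(base, unrelated, ["source"]))     # False: content matches, text differs
print(should_group(base, other_source, ["source"]))  # False: text is close, content differs
```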

Flexible Time Window

You can configure the grouping time window as part of the Global Alert Grouping setting. The time window can be between five and 60 minutes. The time window is a rolling window and counts from the most recently grouped alert. The window extends each time an alert groups, up to 24 hours, or until the incident resolves. If an alert arrives after 24 hours, it triggers a new incident.
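
The rolling window can be simulated with a short Python sketch. It assumes that every alert already satisfies the grouping criteria, that the incident stays open, and that the 24-hour limit is measured from the first alert of the incident; that last point is an interpretation of the text above, and `assign_incidents` is a hypothetical helper.

```python
from datetime import datetime, timedelta

MAX_GROUPING_SPAN = timedelta(hours=24)  # assumed to count from the first alert


def assign_incidents(alert_times: list, window_minutes: int) -> list:
    """Return an incident number for each alert time (times sorted ascending)."""
    if not 5 <= window_minutes <= 60:
        raise ValueError("window must be between 5 and 60 minutes")
    window = timedelta(minutes=window_minutes)
    incident_ids = []
    current = 0
    first = last = None
    for t in alert_times:
        within_window = last is not None and t - last <= window
        within_cap = first is not None and t - first <= MAX_GROUPING_SPAN
        if within_window and within_cap:
            last = t  # the window rolls forward from the newest grouped alert
        else:
            current += 1  # window expired or cap reached: new incident
            first = last = t
        incident_ids.append(current)
    return incident_ids


start = datetime(2024, 1, 1, 9, 0)
times = [start + timedelta(minutes=m) for m in (0, 8, 16, 40)]
print(assign_incidents(times, 10))  # [1, 1, 1, 2]
```

With a 10-minute window, the alert at minute 16 still groups because it arrives 8 minutes after the previous grouped alert, while the alert at minute 40 arrives 24 minutes after the last one and starts a new incident.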

View Global Alert Grouping On an Incident

A primary indicator that an incident has alerts grouped from other services is the Multiservice Group pill, along with the description of how many alerts have grouped, at the top of the incident details page.

[Screenshot: Multiservice Group indicator]

You can also view Global Alert Grouping on an incident details page in the following places:

  • The Impacted Services field shows the incident’s source service and the names of any other services whose alerts have grouped into the incident.
    [Screenshot: Impacted services]
  • With the Alerts tab selected, you can also see the originating service for an alert.
    [Screenshot: Alerts tab selected]
  • An incident's Timeline shows an entry when Global Alert Grouping adds an alert from another service.
    [Screenshot: Incident Timeline]

Best Practices

To get the most out of Global Alert Grouping, follow these practices:

  1. Make sure that responders on all associated Teams and escalation policies know that their services are part of a Global Alert Grouping configuration.
  2. Ensure that on-call responders evaluate all alerts, including alerts that originated on other services.
  3. If an on-call responder is unsure whether an alert belongs to an incident on their assigned service, instruct the on-call responder to add a responder from the originating service to review the alert, or move the alert to create an incident on the originating service. Creating a new incident on the originating service notifies the appropriate on-call responder.
  4. Ensure you have a base role of Responder or higher for services within a multiservice group. Do not use Advanced Permissions to restrict access to services that are part of a multiservice group (for example, restricting access to view incidents on a specific service).

FAQ

What service does an incident get assigned to when grouping across services?

Global Alert Grouping groups subsequent alerts into the incident on the service that received the initial alert. Global Alert Grouping continues grouping matched alerts based on the configured criteria and rolling time window, or until the incident resolves.

Which escalation policy will be assigned to the incident?

PagerDuty uses the escalation policy from the service where the first alert triggers.

How do I bring in other responders?

You have multiple options:

Where can I check if an alert from my service has been grouped into an incident on a different service?

Grouped alerts maintain their originating service, so the alerts for your team are visible in the Alerts Table when the My Teams filter is selected. They do not appear on the Incidents page when My Teams is selected.

In the Service Directory, the multiservice incident is visible only on the source service (the service that received the first alert). This means that a single technical service shows as impacted on your status page. If you configure Business Services and a priority is assigned to the incident, then the business service also displays as impacted.

We are currently evaluating an area on the service directory page to show multiservice incidents when a service has an alert assigned to a multiservice incident.

How do I move an alert that was grouped into an incident to another service?

Read Move Alerts to Another Incident for more information about merging alerts, moving alerts to a new service, and using a grouped alert to trigger a new incident.

What is the maximum number of services that can be in a multiservice group?

You can include up to 250 services in a multiservice group.

Can a service have more than one type of Alert Grouping enabled at a time?

We do not allow a service to have more than one type of alert grouping enabled at a time, or to be part of more than one multiservice group, due to the potential conflict between grouping types. If two alerts matched more than one grouping type, PagerDuty would not know which setting should take precedence.
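
The two limits from the answers above (at most 250 services per multiservice group, and each service in at most one group) could be checked before applying a configuration. The following Python sketch is illustrative only; `validate_groups`, the group names, and the service IDs are hypothetical.

```python
MAX_SERVICES_PER_GROUP = 250


def validate_groups(groups: dict) -> list:
    """Return a list of problems for a mapping of group name -> service IDs."""
    problems = []
    owner = {}  # service ID -> first group that claimed it
    for name, services in groups.items():
        unique = sorted(set(services))
        if len(unique) > MAX_SERVICES_PER_GROUP:
            problems.append(f"{name}: more than {MAX_SERVICES_PER_GROUP} services")
        for service in unique:
            if service in owner:
                problems.append(f"{service}: in both {owner[service]} and {name}")
            else:
                owner[service] = name
    return problems


groups = {
    "checkout": ["PSVC1", "PSVC2"],
    "payments": ["PSVC2", "PSVC3"],
}
print(validate_groups(groups))  # ['PSVC2: in both checkout and payments']
```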

What is the difference between “grouping” and “merging” an alert?

Grouping is an automated process in which the alert grouping feature set combines alerts into a single incident based on criteria set by a user.

Merging is a manual process in which a user merges an incident or alert into a new or existing incident.

How does Global Alert Grouping interact with Event Orchestration?

Event Orchestration is a powerful tool that can manipulate an event's payload before routing it to a service. As their names imply, Event Orchestration acts at the event level (upstream from Global Alert Grouping), while Global Alert Grouping acts on alerts (downstream from Event Orchestration). This means that Global Alert Grouping evaluates an alert as it is received from Event Orchestration, after any transformations in Global Orchestrations or Service Orchestrations have taken place.
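
The ordering described above can be pictured as a two-stage pipeline. In this Python sketch, `orchestrate` is a hypothetical stand-in for an orchestration rule that rewrites the summary, and `grouping_key` is a deliberately simplified stand-in for whatever the grouping step compares; neither reflects a real PagerDuty interface.

```python
def orchestrate(event: dict) -> dict:
    """Hypothetical upstream rule: drop the host suffix from the summary."""
    transformed = dict(event)
    transformed["summary"] = event["summary"].split(" on ")[0]
    return transformed


def grouping_key(alert: dict) -> str:
    """Simplified downstream comparison; it only sees the transformed alert."""
    return alert["summary"].lower()


events = [
    {"summary": "High latency on web-01"},
    {"summary": "High latency on web-02"},
]

# Stage 1: transformations happen at the event level.
alerts = [orchestrate(e) for e in events]

# Stage 2: grouping evaluates the alerts as received from stage 1.
keys = {grouping_key(a) for a in alerts}
print(keys)  # {'high latency'}
```

Because grouping runs after the transformation, the two events end up with the same summary and become candidates for the same incident, even though their original summaries differed.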

Can I use the REST API to configure Global Alert Grouping?

Yes, the following endpoints are documented in our Developer Docs to help you configure Global Alert Grouping: