Intelligent Alert Grouping

Intelligent Alert Grouping automatically consolidates related alerts into a single, actionable incident, using machine learning to reduce alert noise for DevOps teams and SREs. Instead of managing dozens of separate notifications, incident responders can focus on problem-solving while the system combines related alerts in real time. The technology continuously learns from your team's response patterns and service behavior, becoming more accurate over time and leading to progressively faster incident resolution.

📘

AIOps Feature

This feature is included with the PagerDuty AIOps add-on. If you would like to sign up for a trial of PagerDuty AIOps features, please read PagerDuty AIOps Trials.

Your service configuration must have AIOps enabled in order to use this feature. AIOps Service Configuration is in Limited General Availability. Please refer to Configurable Service Settings for more information and enablement steps.

📘

Legacy Availability

Intelligent Alert Grouping is also available with Legacy Event Intelligence.

Newer features are only available to PagerDuty AIOps customers and are not available on Legacy Event Intelligence plans. This includes:

Contact our Sales Team to upgrade your account's pricing plan.

🚧

Required User Permissions

The following roles can edit a service's Alert Grouping settings:

  • Account Owner
  • Admin and Global Admin
  • User
  • Manager base role and team roles
    • Manager team roles can only manage services associated with their team.

Enable Intelligent Alert Grouping

  1. Navigate to Services > Service Directory and select your desired service.
  2. Select the Settings tab and click New Grouping under Reduce Noise.
  3. Select Intelligent.
    • Optional: To group alerts across multiple services, select additional services in the Select Services to group the alerts dropdown. See Global Alert Grouping for more information.
  4. Select the desired grouping time window for alerts on the service. The Recommended time window in the dropdown uses historical service data to calculate the average time between alerts.
  5. Click Save.
Intelligent Alert Grouping on a service

📘

Scope

By default, Intelligent Alert Grouping only looks at alerts on a single service and does not group alerts from other services. It does, however, consider alerts sent to other integrations on the same service, if any exist.

See Global Alert Grouping if you are interested in grouping alerts across multiple services.

👍

How PagerDuty Uses Intelligent Alert Grouping

PagerDuty's DataOps team uses Intelligent Alert Grouping to manage high incident volumes across multiple monitors. By implementing alert suppression and intelligent grouping, they achieved a 37% reduction in incidents and a two- to threefold increase in efficiency, allowing 10 FTEs to support a workload that would typically require 20–30 FTEs. This automation eliminated the productivity loss from manually grouping duplicate incidents and helped resolve issues faster. Read the full story →

View Intelligent Alert Grouping on an Incident

You can see Intelligent Alert Grouping actively grouping alerts on an incident's detail page under the Alerts tab. The Grouping Now label indicates that an incident is using alert grouping. You can also see how many alerts are grouped into the incident and their status. In the example below, two alerts have been grouped: one is triggered and the other is resolved.

View Intelligent Alert Grouping

Select Alert grouping details to see which Alert Grouping method is in effect, when grouping started, and the conditions under which grouping will stop.

Alert grouping details

Disable Intelligent Alert Grouping

To select a different grouping method or disable Alert Grouping entirely:

  1. Navigate to Services > Service Directory and select your desired service.
  2. Select the Settings tab and click Edit next to Reduce Noise.
  3. Click Delete in the bottom-left corner.
  4. Click Delete in the confirmation modal.
🚧

Non-Reversible Action

This action cannot be undone.

Advanced Options

📘

Advanced Options for AIOps Customers

Advanced Options — including configuring grouping fields and resetting the learner cache — are only available to PagerDuty AIOps customers and are not available on Legacy Event Intelligence plans.

Advanced Options allow you to customize which alert fields the grouping model analyzes when determining textual similarities, and to reset the grouping model's learner cache.

A screenshot of the PagerDuty UI indicating where to configure Intelligent Alert Grouping's advanced options

Advanced Options

Configure Grouping Fields

By default, Intelligent Alert Grouping uses alerts' Summary field to determine textual similarity. Depending on your event data, you may want to use alternate fields to determine similarity.

  1. While configuring Intelligent Alert Grouping, click the Advanced Options dropdown.
  2. Under Select Fields, select a field from the dropdown for the Intelligent Alert Grouping model to consider for textual similarity.
    • If you select Custom Details, enter a Custom field name.
  3. Optional: Click Add Field and select another field. You can repeat this step to configure up to five fields.
  4. Click Save.
📘

Considerations

  • Summary is the only field that every alert is required to contain. If you do not select Summary, all of your selected fields may be blank on a given alert, in which case that alert is not grouped.
  • When any selected field is blank, that field is not analyzed or considered for grouping. The model only considers fields with available data.
  • Intelligent Alert Grouping analyzes selected fields for textual similarity and does not evaluate based on exact matches. To specify exact matching criteria, use the Intelligent + Alert Content option.
  • The maximum number of characters across all selected fields is 1,000.

Reset Learner Cache

This option resets the grouping model's learner cache. This can be useful if a service's data structure changes significantly or after a period of testing.

  1. While configuring Intelligent Alert Grouping, click the Advanced Options dropdown.
  2. Enable the Reset learner cache checkbox.
  3. Click Save.

Algorithm Behavior

The Intelligent Alert Grouping algorithm observes real-time alert data and incident history and adapts as new alerts trigger on a service. After you have enabled Intelligent Alert Grouping on a service, no explicit configuration is required, though you may optionally configure the Flexible Time Window.

📘

Single-Service Alert Grouping

The following information applies when you have selected a single service during configuration. The algorithm behaves slightly differently when you select more than one service. See Global Alert Grouping for more information.

Intelligent Alert Grouping groups an alert into an existing incident when all of the following criteria are met:

  1. The incoming alert arrives within the specified grouping time window of the incident's most recent alert. The window is rolling: the timestamp on the incoming alert is compared to that of the most recently grouped alert.
  2. The incident is less than 24 hours old.
  3. The Intelligent Alert Grouping algorithm determines that the alerts are similar.

Alerts that do not meet these criteria trigger a new incident.
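
The three criteria above can be sketched as a single check. This is only an illustration of the documented rules, not PagerDuty's implementation: the `OpenIncident` class and `should_group` function are hypothetical names, the similarity decision is passed in as a plain boolean because the model itself is not exposed, and treating the window boundary as inclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_INCIDENT_AGE = timedelta(hours=24)  # criterion 2 in the list above


@dataclass
class OpenIncident:
    """Hypothetical stand-in for an incident that is still grouping alerts."""
    created_at: datetime
    last_grouped_alert_at: datetime  # timestamp of the most recently grouped alert


def should_group(incident, alert_created_at, time_window, is_similar):
    """Return True when an incoming alert meets all three documented criteria."""
    # Criterion 1: rolling window, measured from the most recently grouped alert.
    # Assumption: an alert arriving exactly at the boundary still groups.
    within_window = alert_created_at - incident.last_grouped_alert_at <= time_window
    # Criterion 2: the incident is less than 24 hours old.
    young_enough = alert_created_at - incident.created_at < MAX_INCIDENT_AGE
    # Criterion 3: the similarity verdict comes from the model (opaque here).
    return within_window and young_enough and is_similar


start = datetime(2024, 1, 1, 12, 0)
incident = OpenIncident(created_at=start,
                        last_grouped_alert_at=start + timedelta(minutes=4))
window = timedelta(minutes=5)

# 4 minutes after the last grouped alert: inside the 5-minute window.
print(should_group(incident, start + timedelta(minutes=8), window, True))   # True
# 6 minutes after the last grouped alert: outside the window, new incident.
print(should_group(incident, start + timedelta(minutes=10), window, True))  # False
```

Note that the first alert groups even though it arrives eight minutes after the incident was created, because the window is measured from the most recently grouped alert.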

The algorithm also reacts to feedback from you and your team. The best way for the algorithm to learn and adapt is to manually merge related incidents and to manually move alerts to a different incident when they are not related. See Merge Incidents for more information. Alert titles can also be updated automatically using Event Orchestration, which influences the algorithm.

📘

Merge Incidents API

Merging and unmerging alerts through the API does not factor into the Intelligent Alert Grouping algorithm. Only manual merges and unmerges in the PagerDuty web app influence the algorithm.

We strongly recommend against sending test data to try to influence the algorithm, as this can result in unpredictable behavior.

Flexible Time Window

You can configure the grouping time window on each service. The Recommended time window is calculated from the average time between alerts using historical service data. The larger the grouping time window, the higher the chance of overgrouping — where an unrelated alert is grouped into an incident. After increasing the time window, monitor alert grouping on the service for a period of time and adjust as needed based on accuracy.
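
To make "average time between alerts" concrete, here is a minimal sketch that computes the mean gap between consecutive alert timestamps. The exact calculation PagerDuty uses for the Recommended time window is not documented here, so treat `average_gap` as a hypothetical helper for reasoning about your own alert history, not as the product's formula.

```python
from datetime import datetime, timedelta


def average_gap(alert_times):
    """Mean time between consecutive alerts, or None with fewer than two alerts."""
    ordered = sorted(alert_times)
    if len(ordered) < 2:
        return None
    # Pair each alert with the one that follows it and measure the gap.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps, timedelta()) / len(gaps)


base = datetime(2024, 1, 1, 9, 0)
history = [base + timedelta(minutes=m) for m in (0, 2, 6, 12)]

# Gaps are 2, 4, and 6 minutes, so the mean is 4 minutes.
print(average_gap(history))  # 0:04:00
```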

👍

Best Practice

For critical services, use the standard five-minute grouping window unless you are not seeing a satisfactory reduction in noise or the service owner deems a larger window appropriate. If you increase the time window, do so gradually and monitor closely to confirm that only the desired alerts are grouped.

FAQ

Can you enable the grouping time window via API?

Yes. Set the time window with the Update a service API endpoint, specifying a value in seconds up to a maximum of 3,600. To use the recommended time window, set the value to zero: "time_window": 0.
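
As a rough sketch, a request body for this setting might be built as follows. Only `time_window`, the 3,600-second ceiling, and the `alert_grouping_parameters.config` path come from this page; the outer `service` wrapper and the `"type": "intelligent"` value are assumptions about the request shape, and `build_time_window_payload` is a hypothetical helper. Check the Update a service API reference before relying on it.

```python
import json

MAX_TIME_WINDOW_SECONDS = 3600  # upper bound stated in the answer above
RECOMMENDED_TIME_WINDOW = 0     # zero selects the recommended time window


def build_time_window_payload(time_window):
    """Build a request body that sets the grouping time window, in seconds."""
    if not 0 <= time_window <= MAX_TIME_WINDOW_SECONDS:
        raise ValueError(
            f"time_window must be between 0 and {MAX_TIME_WINDOW_SECONDS} seconds"
        )
    return {
        "service": {  # assumed wrapper key
            "alert_grouping_parameters": {
                "type": "intelligent",  # assumed value for Intelligent Alert Grouping
                "config": {"time_window": time_window},
            }
        }
    }


# Body that asks for the recommended time window.
print(json.dumps(build_time_window_payload(RECOMMENDED_TIME_WINDOW), indent=2))
```

Validating the value locally avoids sending a request that exceeds the documented maximum.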

Can you retrieve the grouping time window via API?

Yes. Call the Get a service API endpoint and read the alert_grouping_parameters.config.recommended_time_window field in the response.
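
A minimal sketch of reading that field from an already-parsed response follows. The field path is the one named above; the `service` wrapper, the sample values, and the `recommended_time_window` helper are illustrative assumptions, not real API output.

```python
def recommended_time_window(service_response):
    """Return alert_grouping_parameters.config.recommended_time_window, or None."""
    # Accept either a wrapped {"service": {...}} body or the bare service object.
    service = service_response.get("service", service_response)
    # Any level may be missing or null when grouping is not configured.
    params = service.get("alert_grouping_parameters") or {}
    config = params.get("config") or {}
    return config.get("recommended_time_window")


# Illustrative response fragment; the values are made up for this example.
sample = {
    "service": {
        "alert_grouping_parameters": {
            "type": "intelligent",
            "config": {"time_window": 0, "recommended_time_window": 600},
        }
    }
}

print(recommended_time_window(sample))                                    # 600
print(recommended_time_window({"service": {"alert_grouping_parameters": None}}))  # None
```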

Can we expose the machine learning-based model via the API?

No, not at this time.

Can we plug our own machine learning code into PagerDuty?

No, not at this time.

Does this take into account rules or correlations configured outside of PagerDuty?

No. This model is entirely based on actions taken within PagerDuty.

Does it affect the machine learning capabilities if I rename the service?

No, it does not.

Can Intelligent Alert Grouping group alerts together from multiple services?

By default, Intelligent Alert Grouping only looks at alerts from a single service. To group alerts from different services, see Global Alert Grouping.

Why didn't my alerts get grouped together?

There are three main reasons the algorithm may not have grouped alerts on the same service:

  1. The alerts did not arrive within the time window specified for that service.
  2. The incoming alert data was not similar enough to the desired alerts, or was more similar to the alerts it was grouped with.
  3. Human response behavior — such as merging or moving alerts out of incidents — overrode the desired behavior.

The algorithm takes several factors into consideration, which makes tracing specific grouping decisions difficult. If you believe there has been enough history for an alert to be grouped but are still seeing unexpected grouping behavior, contact our Support team with links to specific incidents and alert groupings and a summary of why the behavior is unexpected.

Why don't I see any alert grouping options?

Your current pricing plan may not support Alert Grouping. If you are interested in trying Alert Grouping, contact our Sales team to start a free PagerDuty AIOps trial.

Is there a limit to how many alerts can group into a single incident?

Yes. Incidents are limited to 1,000 alerts each. After this limit is reached, a new incident is created and subsequent alerts are grouped into the new incident.
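
The rollover can be illustrated with simple arithmetic. This sketch assumes every alert in the burst would otherwise meet the grouping criteria (time window, incident age, and similarity), and `split_into_incidents` is a hypothetical helper used only to show the effect of the limit.

```python
MAX_ALERTS_PER_INCIDENT = 1000  # limit stated above


def split_into_incidents(alert_count):
    """Return the number of alerts each incident holds for one burst of alerts."""
    full, remainder = divmod(alert_count, MAX_ALERTS_PER_INCIDENT)
    # Full incidents first, then one more incident for any leftover alerts.
    return [MAX_ALERTS_PER_INCIDENT] * full + ([remainder] if remainder else [])


print(split_into_incidents(2500))  # [1000, 1000, 500]
```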

Are incidents resolved only when all alerts within that incident are resolved?

Yes. An incident resolves when all of its associated alerts are resolved. Similarly, if you resolve an incident, PagerDuty automatically resolves any associated triggered alerts. See Resolve Alerts for more information.