Event Orchestration Examples
Event Orchestration is a highly customizable suite of features and capabilities. The examples on this page are taken from real-world customer use cases, which you're welcome to replicate in your own PagerDuty account.
We have included the following examples to assist you with your own orchestrations:
- If Events Match Certain Conditions
- On a Recurring Weekly Schedule
- Advanced Configuration with the and Operator
- Capture Unrouted Trigger Events
- Standardized Triage
- Notification Management for Specific Incidents
- Route Alerts with Dynamic Service Routes
- Automated Remediation for Specific Incidents
If Events Match Certain Conditions
In this example, we will create a routing rule that routes events to a service when matching conditions are met.
- Create a routing rule and, under When should events be routed here?, select If events match certain conditions.
- Indicate the event conditions that you would like the orchestration to match using one of the following methods:
Method | Description |
---|---|
Base conditions on incoming JSON | Depending on your account's activity, you may have recent events appear on the left side of the screen if events have been sent to the incoming event source. View these events to determine which values to use. |
Events sent through the API | You will use the JSON field names directly (e.g., summary). For nested fields, separate names with a dot (.) (e.g., payload.taskid). If you are sending data through additional fields, enter them exactly as they are sent to PagerDuty. For example, if your events have a tags field, enter that field name in your rule condition as tags. |
Events sent through email | Note: The following functionality will only work when you send emails to a global email integration address. It will not work for emails sent directly to a service-level email integration. Rules may be based on the content of an email by entering the appropriate email field as custom details in the event field. The email subject will be used as a default deduplication key. In other words, emails with the same subject line will automatically be deduplicated. To change this behavior, add a custom deduplication key with an orchestration rule action. The most common email fields are: event.custom_details.from[0] (the from address)*, event.custom_details.subject (the subject line)**, and event.custom_details.plain_body (the email body). *The [0] refers to the first position in a list of emails. If you would like to search through a list of emails (either in the to field or the from field), enter event.custom_details.from or event.custom_details.to. **For email-generated events, the field event.summary is populated with the same value as event.custom_details.subject (the email subject) by default. However, these are separate fields and their values may differ based on customer-specified configurations. |
Configure routing rule
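As a rough illustration of how dotted field names map onto an incoming JSON event, the sketch below walks a path such as payload.taskid through a nested dictionary. The resolve_field helper and the sample event are hypothetical and only model the naming convention described in the table above; they are not PagerDuty's implementation.

```python
from typing import Any, Optional


def resolve_field(event: dict, path: str) -> Optional[Any]:
    """Follow a dot-separated path through nested dictionaries.

    Returns None when any segment of the path is missing.
    """
    current: Any = event
    for segment in path.split("."):
        if not isinstance(current, dict) or segment not in current:
            return None
        current = current[segment]
    return current


# Hypothetical event, shaped like a JSON body sent through the API.
sample_event = {
    "summary": "Disk usage above 90% on db-01",
    "tags": ["database", "prod"],
    "payload": {"taskid": "task-1234"},
}

print(resolve_field(sample_event, "summary"))
print(resolve_field(sample_event, "payload.taskid"))
print(resolve_field(sample_event, "payload.missing"))
```

A top-level name such as tags is used as-is, while payload.taskid reaches one level down.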
- In the middle dropdown, select how the event should be filtered.
Filter Options | Description |
---|---|
- matches part - does not match part | The field contains/does not contain a value. |
- matches - does not match | The field equals/does not equal a value (this operation requires the field to be passed in as a string). |
- exists - does not exist | The field exists/does not exist. |
- matches regex - does not match regex | The field matches/does not match a regular expression. Regular expressions must use RE2 syntax. |
Negative Operations
Rules with negative operations, such as does not contain or does not equal, will match events that do not contain your specified value and events that do not contain the field at all. As an example:
- severity field does not equal critical
- This will match events where the severity field does not equal critical and events that do not contain a severity field at all.
If you'd like to avoid this, you must add an additional condition that matches only when the field exists. For example:
When all conditions are true:
- severity field exists
- severity field does not equal critical
Note: You must select the all conditions must be true option for the rule to match.
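The negative-operation behavior described above can be modeled with a few lines of Python. This is a simplified sketch, not PagerDuty's evaluator: the helper names are hypothetical, events are plain dictionaries, and the comparison is exact rather than case-insensitive.

```python
def does_not_equal(event: dict, field: str, value: str) -> bool:
    """Negative operation: true when the field differs from value OR is absent."""
    return event.get(field) != value


def exists(event: dict, field: str) -> bool:
    """True only when the field is present on the event."""
    return field in event


def guarded_rule(event: dict) -> bool:
    """Both conditions must be true: the field exists AND it is not critical."""
    return exists(event, "severity") and does_not_equal(event, "severity", "critical")


events = [
    {"severity": "warning"},
    {"severity": "critical"},
    {"summary": "event without a severity field"},
]

negative_only = [does_not_equal(e, "severity", "critical") for e in events]
with_guard = [guarded_rule(e) for e in events]

print(negative_only)  # [True, False, True]
print(with_guard)     # [True, False, False]
```

The third event has no severity field, so it matches the negative condition on its own but not the guarded rule.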
- In the second value field, input the value that should be met from the payload. This can be a string or regular expression.
Case Sensitivity
Condition values in Event Orchestrations are case-insensitive.
For example, if a condition is set with Summary matches part DOWN, this will match if the Summary contains Down, down and other variations of the word.
- When additional conditions should be added, use the following options:
- + And: Additional conditions should be met.
- + New Condition: Create another set of conditions that should be met. A new condition block is joined to the existing blocks with an OR operator, so it is evaluated alongside the other blocks.
- Click Save to save your configuration.
Condition Limits
You can create up to 25 condition blocks within a rule, and you can have up to 64 operators (i.e., AND, OR), or a maximum of 2048 bytes (e.g., if you're using PCL) in a single condition block.
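If you generate conditions programmatically, a pre-flight check against these limits can catch oversized rules before you save them. The sketch below is an assumption-laden illustration: check_rule is a hypothetical helper, each condition block is represented as a PCL string, and operators are counted naively by looking for the words and/or (which would also count those words inside quoted values).

```python
import re
from typing import List

MAX_BLOCKS = 25          # condition blocks per rule
MAX_OPERATORS = 64       # AND/OR operators per condition block
MAX_BLOCK_BYTES = 2048   # size of a single condition block


def check_rule(condition_blocks: List[str]) -> List[str]:
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if len(condition_blocks) > MAX_BLOCKS:
        problems.append(
            f"{len(condition_blocks)} condition blocks exceeds {MAX_BLOCKS}"
        )
    for index, block in enumerate(condition_blocks, start=1):
        # Naive operator count: whole words "and"/"or", any letter case.
        operators = len(re.findall(r"\b(?:and|or)\b", block, flags=re.IGNORECASE))
        size = len(block.encode("utf-8"))
        if operators > MAX_OPERATORS:
            problems.append(f"block {index}: {operators} operators exceeds {MAX_OPERATORS}")
        if size > MAX_BLOCK_BYTES:
            problems.append(f"block {index}: {size} bytes exceeds {MAX_BLOCK_BYTES}")
    return problems


print(check_rule(["event.source matches 'mainframe' and event.severity matches 'critical'"]))
```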
On a Recurring Weekly Schedule
Availability
Scheduled conditions are available with Advanced Event Orchestration.
In this example, we will create a routing rule that routes events to a service on a weekly recurring schedule.
- Create a routing rule and, under When should events be routed here?, select On a recurring weekly schedule.
- Enter appropriate times for Start and End, select your preferred Days of the week, and pick your Timezone.
- Click Save.
Advanced Configuration with the and Operator
Availability
Scheduled and threshold conditions are available with Advanced Event Orchestration.
This example details how to edit PCL to create an advanced condition that the UI does not natively allow for.
If you select one of the following options for When should events be routed here?, the UI does not offer a way to set an and condition:
- On a recurring weekly schedule
- During a scheduled date range
- Depends on event frequency
You can work around this, however, by directly editing the PagerDuty Condition Language (PCL).
Say you'd like to route events to a service when the following conditions are met:
- The event's source matches mainframe.
- It is Wednesday or Sunday between 11:00 a.m. and 12:00 p.m. Eastern.
The following PCL statement will meet these conditions:
event.source matches 'mainframe' and now in Wed,Sun 11:00:00 to 12:00:00 America/New_York
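If you manage orchestrations as code rather than through the UI, the same PCL statement can be carried in a rule definition. The sketch below only builds a rule object locally and makes no API call. The field names (label, conditions, expression, actions, route_to) reflect an assumed shape for router rules in the PagerDuty REST API, and the service ID PXXXXXX is a placeholder; check both against the API reference before relying on them.

```python
import json

# The PCL statement from the example above, split across lines for readability.
PCL_CONDITION = (
    "event.source matches 'mainframe' "
    "and now in Wed,Sun 11:00:00 to 12:00:00 America/New_York"
)


def build_router_rule(service_id: str, expression: str, label: str) -> dict:
    """Build one routing rule (assumed shape; verify against the API reference)."""
    return {
        "label": label,
        "conditions": [{"expression": expression}],
        "actions": {"route_to": service_id},
    }


# "PXXXXXX" is a placeholder, not a real service ID.
rule = build_router_rule("PXXXXXX", PCL_CONDITION, "Mainframe events, Wed/Sun window")
print(json.dumps(rule, indent=2))
```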
Capture Unrouted Trigger Events
Orchestration service routes include a catch all rule for events that do not match the configured rules. The default behavior creates a suppressed alert from unrouted trigger events; however, you have the ability to route these events to a specific service instead. It is important to note that, in order to support event deduplication across all services, the default behavior also includes the ability to resolve any open alert previously created by the orchestration when a matching dedup_key exists, regardless of what service the open alert belongs to.
In this example, we will create a routing rule that uses these trigger events to create incidents on a catch all service.
- Create a routing rule and, under What service should events route to?, select the service you’d like to use as a catch all for these events.
- Under When should events be routed here?, select If events match certain conditions. Configure the following condition:
- If event.event_action matches part (contains) trigger
- Click Save.
- If necessary, reorder the rule so that it is directly above the default catch all rule.
Routing rule for trigger events
This configuration can raise the visibility for trigger events that do not match the existing routing rules while also preserving the default resolve behavior described above.
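The effect of rule order can be pictured with a small model in which rules are checked top to bottom and the first match wins. This is a sketch under that assumption, with hypothetical service names and a hypothetical route helper; it is not how PagerDuty evaluates routes internally.

```python
from typing import Callable, List, Tuple

Rule = Tuple[str, Callable[[dict], bool]]

DEFAULT_CATCH_ALL = "default catch all rule"


def route(event: dict, rules: List[Rule]) -> str:
    """Return the target of the first matching rule, else the default catch all."""
    for target, matches in rules:
        if matches(event):
            return target
    return DEFAULT_CATCH_ALL


rules: List[Rule] = [
    # An ordinary routing rule (hypothetical).
    ("database-service", lambda e: "database" in e.get("summary", "").lower()),
    # The trigger-capturing rule, placed directly above the default catch all rule.
    ("catch-all-service", lambda e: "trigger" in e.get("event_action", "")),
]

print(route({"event_action": "trigger", "summary": "Database connection lost"}, rules))
print(route({"event_action": "trigger", "summary": "Unrecognized alert"}, rules))
print(route({"event_action": "resolve", "summary": "Unrecognized alert"}, rules))
```

Only unmatched trigger events reach the catch all service; an unmatched resolve event still falls through to the default rule.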
Standardized Triage
Availability
Global Orchestrations, incident suppression and variables are available with Advanced Event Orchestration.
Use Global Orchestration rules to standardize the triage process across your entire organization.
- Create a Global Orchestration. You will send your events to the orchestration’s integration.
- Create an Event Count Cache Variable. You can use this to count the number of specific events received within a designated timeframe.
- Create a rule with an event condition to suppress any known false positive events:
- Under When should events be routed here?, select If events match certain conditions.
- Define a condition that will match your false positive events. E.g., If event.summary matches part (contains) [No error]. Then click Next.
- Under What action(s) should be applied?, select Suppress incident and notifications.
- Click Save.
- Create an Else rule with an event condition to evaluate the current cache variable count.
- Under When should events be routed here?, select If events match certain conditions.
- In the field selector, choose Cache Variable and add your cache variable’s name. Configure your desired threshold for when an incident should be created. E.g., If cache_var.myEventCount is greater than or equal to 2. Then click Next.
- Under What action(s) should be applied?, you can choose your desired priority and/or severity level for these incidents.
- Click Save.
- Create a final Else rule to set a lower priority and/or severity for incidents that do not pass the event count threshold set in the previous rule:
- Under When should events be routed here?, select Always (for all events) and click Next.
- Under What action(s) should be applied?, you can choose your desired priority and/or severity level for these incidents.
- Click Save.
Standardized triage example
With this configuration in place, any event sent to the global orchestration will be evaluated by these rules. If the event is a false positive and matches the condition you set in step 3, it will automatically be suppressed and will not create an incident.
If the event does not meet that condition, it will increment the cache variable count, be routed to a service, and then create an incident on that service. The current event count for the cache variable will determine whether the incident has a high or low priority and/or severity.
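The three-rule flow can be sketched as a small function. Several simplifications are assumed here and are not confirmed by the source: the counter has no time window, only non-suppressed events are counted, and the count is incremented before the threshold is checked. The names triage and myEventCount are illustrative.

```python
from collections import Counter


def triage(event: dict, cache_vars: Counter, threshold: int = 2) -> str:
    """Model the suppress / high priority / low priority decision for one event."""
    summary = event.get("summary", "")

    # Rule 1: suppress known false positives (conditions are case-insensitive).
    if "[no error]" in summary.lower():
        return "suppressed"

    # Assumed ordering: count the event, then compare against the threshold.
    cache_vars["myEventCount"] += 1

    # Rule 2 (Else): enough recent events, so raise priority and/or severity.
    if cache_vars["myEventCount"] >= threshold:
        return "high priority"

    # Rule 3 (final Else): always matches, lower priority and/or severity.
    return "low priority"


counts = Counter()
for summary in ["[No error] heartbeat check", "CPU above 95%", "CPU above 95%"]:
    print(summary, "->", triage({"summary": summary}, counts))
```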
Notification Management for Specific Incidents
Availability
Dynamic Escalation Policy Assignment is available with Advanced Event Orchestration.
Leverage Dynamic Escalation Policy Assignment to notify a Subject Matter Expert (SME) instead of the current on-call user when an incident requires a special skill set to resolve.
- Create an escalation policy with the SME user in the first level.
- If the SME should be on-call 24/7, you can add them as a user target. Otherwise, you can create a schedule to define what hours the user should be on-call for.
- In your global orchestration, create the following rule:
- Under When should events be routed here?, select If events match certain conditions.
- Define a condition that will match events that indicate a specific SME should be notified instead of the current on-call user. E.g., If event.summary matches part (contains) Database error. Then click Next.
- Under What action(s) should be applied?, locate Override service's assigned escalation policy with this policy and select your desired escalation policy from the dropdown.
- Click Save.
Override escalation policy with SME
When the orchestration receives an event that matches the condition, the escalation policy chosen will be used for the incident, regardless of which service the event is routed to.
Route Alerts with Dynamic Service Routes
Availability
Dynamic routing, dynamic field enrichment and extraction, and variables are available with Advanced Event Orchestration.
Utilize the powers of Global Orchestrations to standardize the event format in order to automatically route events to the correct service with a dynamic routing rule.
An orchestration’s dynamic routing rule evaluates all events and does not have conditions. Rather, dynamic routing rules are able to route events to services based on the Service name or Service ID included within the payload. If service details are not included in your initial payload, you can update the payload with dynamic field enrichment and extraction.
- In your global orchestration, create a rule with the following actions:
- Create a rule variable and use regex to extract the desired value from an event field.
- Create an event field (e.g., custom_details.service_route) and use the rule variable you created to set the value.
- In your global orchestration, navigate to the Service Routes tab.
- Create a new dynamic route and enter the event field you created in step 1.
Dynamic routing rule
When the orchestration receives an event with a valid service name in the event field, it will automatically route it to that service. The event can be processed further by the service’s orchestration rules if desired.
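To make the extract-then-enrich step concrete, the sketch below pulls a service name out of an event summary with a regular expression and writes it to custom_details.service_route. The summary format (service=...), the pattern, and the helper name are all hypothetical. Python's re module is used here for illustration, whereas orchestration rules require RE2 syntax; this simple pattern avoids features that differ between the two.

```python
import re

# Hypothetical convention: the summary carries "service=<name>".
SERVICE_PATTERN = re.compile(r"service=([A-Za-z0-9_-]+)")


def enrich_with_service_route(event: dict) -> dict:
    """Return a copy of the event with custom_details.service_route set when found."""
    enriched = {**event, "custom_details": dict(event.get("custom_details", {}))}
    match = SERVICE_PATTERN.search(event.get("summary", ""))
    if match:
        # Equivalent of setting the event field from the extracted rule variable.
        enriched["custom_details"]["service_route"] = match.group(1)
    return enriched


event = {"summary": "High latency service=checkout-api region=us-east-1"}
print(enrich_with_service_route(event)["custom_details"])
```

Events whose summary does not match the pattern are left without a service_route value.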
Automated Remediation for Specific Incidents
Release Status and Availability
Automation on Alerts is in Early Access for customers with PagerDuty AIOps. Functionality and documentation are subject to change.
To request access, please complete our Early Access form and select Automation on Alerts.
You can pause notifications and suspend alerts for a set duration and simultaneously trigger a remediation automation with an Automation Action. If the automation successfully resolves the issue, an incident will not trigger. If the pause duration expires without a resolution, the incident will trigger as normal. You can configure this within a service orchestration by following the steps below.
- In your service orchestration, create a rule with your desired conditions.
- Under Basic Event, enable Pause notifications. Enter a value (in seconds) for Suspend alert for ___ seconds before triggering an incident. We recommend setting a pause duration that is long enough for your entire automation to complete.
- Under Automation > Automation Actions, enable Use Automation Actions if an event reaches this rule.
- For When should we run this Action?, select Automatically when Alert is suspended/paused.
- For Automated Action, select your configured Automation Action.
Configured actions for auto remediation