Amazon CloudWatch Integration Guide | PagerDuty

Amazon CloudWatch + PagerDuty Benefits

  • Amazon CloudWatch provides monitoring for AWS resources and customer-run applications. It collects data, provides insight, and alerts users so they can fix problems in their applications and organizations.
  • Amazon CloudWatch gives system-wide visibility into resource utilization, and notifications can be set for metrics that cross specified thresholds. These notifications can be automatically sent to PagerDuty, which reliably alerts the correct on-call responder through their preferred contact methods.

Note

This integration is available for Amazon CloudWatch on AWS Cloud or AWS Outposts.

Version

This guide details configuration of the CloudWatch V1 integration.

How it Works

  • When an AWS service metric goes beyond a predefined threshold, a CloudWatch alarm sends a notification to a PagerDuty endpoint, triggering an incident.
  • When the AWS service metric returns to an OK state below the predefined threshold, a resolve event is sent to the same endpoint, resolving the PagerDuty incident.

Requirements

This integration expects the Message property of each notification to contain a nested JSON-encoded object; if it does not, no alert will trigger. If you have any questions or need assistance, please contact our Support team.
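
As a rough illustration of this requirement, the Python sketch below checks whether an SNS notification body carries a JSON-encoded object in its Message property. The function name and the sample payloads are hypothetical. The sample field names (AlarmName, NewStateValue) follow the usual CloudWatch alarm notification format, and the check reflects one reading of "nested JSON-encoded object"; it is not PagerDuty's actual validation logic.

```python
import json


def message_is_nested_json_object(sns_body: str) -> bool:
    """Return True if the SNS body's Message property is a JSON-encoded object.

    Hypothetical helper for illustration; not PagerDuty's validation code.
    """
    try:
        envelope = json.loads(sns_body)
    except json.JSONDecodeError:
        return False
    if not isinstance(envelope, dict):
        return False

    message = envelope.get("Message")
    if not isinstance(message, str):
        return False

    # The Message value is itself a string that must decode to a JSON object.
    try:
        inner = json.loads(message)
    except json.JSONDecodeError:
        return False
    return isinstance(inner, dict)


# Sample envelope shaped like an SNS notification from a CloudWatch alarm.
alarm_body = json.dumps({
    "Type": "Notification",
    "Subject": 'ALARM: "example-alarm"',
    "Message": json.dumps({"AlarmName": "example-alarm", "NewStateValue": "ALARM"}),
})

# A plain-text Message, such as one published by hand, fails the check.
plain_body = json.dumps({"Type": "Notification", "Message": "disk is full"})

print(message_is_nested_json_object(alarm_body))
print(message_is_nested_json_object(plain_body))
```

A check like this can help when testing a topic by hand: a message published manually with plain text in Message would not meet the requirement above.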

Integration Walkthrough

In PagerDuty

There are two ways that Amazon CloudWatch can be integrated with PagerDuty:

Integrate With Event Rules

Integrating with global or service-level event rules may be beneficial if you want to build different rules based on the payload coming from AWS. If you would like to learn more, please visit our article on Rulesets.

Configure an Event Rules Integration

  1. From the Automation menu, select Event Rules and click your Default Global Ruleset.
  2. On the Event Rules screen, click the Incoming Event Source dropdown and copy your Integration Key.
  3. Once you have your Integration Key, the Integration URL will be:

https://events.pagerduty.com/x-ere/[YOUR_INTEGRATION_KEY_HERE]

You can now proceed to the "In the AWS Management Console" section below.

Configure a Service Event Rules Integration

To use service-level event rules:

  1. Create a Generic Events API integration on your preferred service.
  2. Once complete, copy the Integration Key and paste it into the following URL:

https://events.pagerduty.com/integration/[YOUR_INTEGRATION_KEY_HERE]/enqueue

You can now proceed to the "In the AWS Management Console" section below.
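
The two URL formats shown in this section can be assembled with a small helper. The Python sketch below is only an illustration: the function name, the constant names and the EXAMPLE_KEY placeholder are hypothetical, and the two templates are copied from the URLs given above.

```python
# URL templates copied from this guide; {key} marks where the Integration Key goes.
GLOBAL_RULESET_URL = "https://events.pagerduty.com/x-ere/{key}"
SERVICE_URL = "https://events.pagerduty.com/integration/{key}/enqueue"


def integration_url(integration_key: str, use_global_ruleset: bool) -> str:
    """Build the Integration URL for a global ruleset or a service integration."""
    key = integration_key.strip()
    if not key:
        raise ValueError("integration key must not be empty")
    template = GLOBAL_RULESET_URL if use_global_ruleset else SERVICE_URL
    return template.format(key=key)


# EXAMPLE_KEY is a placeholder, not a real Integration Key.
print(integration_url("EXAMPLE_KEY", use_global_ruleset=True))
print(integration_url("EXAMPLE_KEY", use_global_ruleset=False))
```

Whichever form you use, the resulting URL is what you will paste into the SNS subscription's Endpoint field later in this guide.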

Integrate With a PagerDuty Service

Integrating with a PagerDuty Service directly can be beneficial if you don’t need to route alerts from AWS to different responders based on the event payload. You can still use service-level event rules to perform actions such as suppressing alerts.

Add to a New Service

  1. To add the integration to a new service, navigate to Services > Service Directory and click +New Service.
  2. Follow the prompts and configure the service to your preferences. On the Integrations screen, find Amazon CloudWatch using the search bar, the dropdown, or our most popular integrations list, and select it.
  3. Once you are done entering your service settings, click Create Service.
  4. You will now be in the service’s Integrations tab. Find your integration in the list and click the icon to its right to view and copy your Integration URL. Keep it in a safe place for later use.
  5. You can now proceed to the "In the AWS Management Console" section below.

Add to an Existing Service

  1. To add an integration to an existing service, go to Services > Service Directory and select the service you want to add it to. Select the Integrations tab and click +Add another integration.
  2. Find Amazon CloudWatch using the search bar, the dropdown, or our most popular integrations list, and select it.
  3. Click Add. Find your integration in the list and click the icon to its right to view and copy your Integration URL. Keep it in a safe place for later use.
  4. You can now proceed to the "In the AWS Management Console" section below.

In the AWS Management Console

Create an SNS Topic

  1. In the Services search bar, search for and select Simple Notification Service (SNS). In the SNS dashboard menu, select Topics and click Create Topic on the right. This topic will be used to route alerts from AWS to PagerDuty.
  2. Enter a Topic name (you may want to name your topic after your PagerDuty service) and a Display name, then click Create topic.
  3. Now that your topic has been created, select Subscriptions in the left-hand menu and click Create Subscription.
  4. Select your Topic ARN and make sure HTTPS is the selected Protocol. Paste your Integration URL (generated in the steps above) into the Endpoint field, ensure that the Enable raw message delivery checkbox is unchecked, and click Create Subscription.
  5. Your subscription should be automatically confirmed. Click the refresh icon to make sure the Status is Confirmed and not PendingConfirmation.
  6. Next, you will create a CloudWatch alarm that will send notifications to your SNS topic when a metric falls outside of a predefined threshold.
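
If you prefer to script these console steps, they can be sketched in Python roughly as follows. This is a minimal sketch, not an official PagerDuty or AWS procedure: the function names and the example topic name are hypothetical, and the code assumes you pass in a boto3 SNS client, for example `boto3.client("sns")`. The `create_topic`, `subscribe` and `get_subscription_attributes` calls are boto3 SNS client methods.

```python
def create_pagerduty_subscription(sns_client, topic_name, display_name, integration_url):
    """Create an SNS topic and subscribe a PagerDuty Integration URL to it.

    sns_client is assumed to be a boto3 SNS client. Returns a tuple of
    (topic ARN, subscription ARN).
    """
    topic = sns_client.create_topic(
        Name=topic_name,
        Attributes={"DisplayName": display_name},
    )
    topic_arn = topic["TopicArn"]

    subscription = sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol="https",
        Endpoint=integration_url,
        # Raw message delivery stays off, matching the unchecked checkbox above.
        Attributes={"RawMessageDelivery": "false"},
        ReturnSubscriptionArn=True,
    )
    return topic_arn, subscription["SubscriptionArn"]


def is_confirmed(sns_client, subscription_arn):
    """Return True once the subscription is no longer pending confirmation."""
    response = sns_client.get_subscription_attributes(SubscriptionArn=subscription_arn)
    return response["Attributes"].get("PendingConfirmation") == "false"


# Example usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# sns = boto3.client("sns")
# topic_arn, sub_arn = create_pagerduty_subscription(
#     sns, "pagerduty-alerts", "PagerDuty",
#     "https://events.pagerduty.com/integration/EXAMPLE_KEY/enqueue",
# )
# print(is_confirmed(sns, sub_arn))
```

Taking the client as a parameter keeps the helpers easy to try against a stub before pointing them at a real AWS account.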

Create a CloudWatch Alarm

  7. In the Services search bar, search for and select CloudWatch. Select Alarms > All Alarms and then click Create Alarm on the right.
  8. Click Select metric. Select your metric using either of the following methods:
  • Select the service namespace that contains the metric. Continue selecting your preferred options, which will narrow down your choices until a list of metrics appears. Select the check box next to your desired Metric Name.
  • In the search field, enter the name of a metric, dimension, or resource ID and press Enter. Then select your desired results and continue selecting your preferred options until a list of metrics appears. Select the check box next to your desired metric.

Read more in the Commonly Used Metrics section below.

  9. Next, click the View graphed metrics button. Under Statistic, select one of the statistics or predefined percentiles, or specify a custom percentile (for example, p95.45). Under Period, select the evaluation period for the alarm. Click Select metric to continue.
  10. On the next page under Conditions, select from the following Threshold types:

Threshold Type and Instructions

  • Static
    a) Under Whenever NumberOfObjects is… (NumberOfObjects stands in for the name of your selected metric), select Greater, Greater/Equal, Lower/Equal or Lower.
    b) Under than…, enter your desired threshold value.

  • Anomaly detection
    a) Under Whenever BucketSizeBytes is… (BucketSizeBytes stands in for the name of your selected metric), select Outside of the band, Greater than the band or Lower than the band.
    b) Under Anomaly detection threshold, set your threshold value.

Click Next to continue.
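
If you later create alarms through the AWS API instead of the console, the condition choices above correspond to values of the ComparisonOperator parameter. The Python sketch below shows one plausible mapping. The operator strings are real ComparisonOperator values, but pairing them with these console labels is an assumption, and the dictionary and function names are hypothetical.

```python
# Console labels (as listed above) mapped to CloudWatch ComparisonOperator values.
# The pairing is an assumption made for illustration.
STATIC_OPERATORS = {
    "Greater": "GreaterThanThreshold",
    "Greater/Equal": "GreaterThanOrEqualToThreshold",
    "Lower/Equal": "LessThanOrEqualToThreshold",
    "Lower": "LessThanThreshold",
}

ANOMALY_OPERATORS = {
    "Outside of the band": "LessThanLowerOrGreaterThanUpperThreshold",
    "Greater than the band": "GreaterThanUpperThreshold",
    "Lower than the band": "LessThanLowerThreshold",
}


def comparison_operator(threshold_type: str, condition: str) -> str:
    """Look up the ComparisonOperator for a threshold type and console label."""
    tables = {"Static": STATIC_OPERATORS, "Anomaly detection": ANOMALY_OPERATORS}
    if threshold_type not in tables:
        raise ValueError(f"unknown threshold type: {threshold_type!r}")
    table = tables[threshold_type]
    if condition not in table:
        raise ValueError(f"unknown condition for {threshold_type}: {condition!r}")
    return table[condition]


print(comparison_operator("Static", "Greater/Equal"))
print(comparison_operator("Anomaly detection", "Outside of the band"))
```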

  11. First, you will configure the In alarm state notification, which will trigger a PagerDuty incident when the metric has met your predefined threshold. Select the In alarm and Select an existing SNS topic radio buttons, and then select the SNS Topic (created above) from the Send a notification to… field.
  12. Next, you will configure the OK state notification, which will automatically resolve the PagerDuty incident if the metric has fallen back into an OK state (not meeting or exceeding the threshold). Select the OK and Select an existing SNS topic radio buttons, and then select the SNS Topic (created above) from the Send a notification to… field. Click Next to continue.
  13. On the next page, enter an Alarm name and Alarm description. Click Next to continue.
  14. On the Preview and Create screen, review your alarm’s details. If you need to edit any details, click Edit to the right of each step. Once you have confirmed all details, click Create alarm.
  15. You should then see a confirmation dialog that your alarm was saved successfully.
  16. Congratulations, you have now integrated Amazon CloudWatch with PagerDuty! Now, when your alarm threshold is met, an incident will be triggered within PagerDuty. Once that alarm is back in an OK state, the incident will automatically resolve within PagerDuty.
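
The alarm configured in the steps above could also be created programmatically. The Python sketch below builds the arguments for the boto3 CloudWatch client method `put_metric_alarm`, sending both the In alarm and OK notifications to the same SNS topic. The function names are hypothetical, and the metric (EC2 CPUUtilization), statistic, period and comparison are illustrative assumptions rather than values this guide prescribes.

```python
def build_alarm_request(alarm_name, description, topic_arn, instance_id, threshold):
    """Build keyword arguments for put_metric_alarm.

    The metric, statistic and period are example values chosen for illustration.
    """
    return {
        "AlarmName": alarm_name,
        "AlarmDescription": description,
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,  # seconds
        "EvaluationPeriods": 1,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        # In alarm notification: triggers the PagerDuty incident.
        "AlarmActions": [topic_arn],
        # OK notification: resolves the PagerDuty incident.
        "OKActions": [topic_arn],
    }


def create_alarm(cloudwatch_client, request):
    """Send the request using a boto3 CloudWatch client passed in by the caller."""
    cloudwatch_client.put_metric_alarm(**request)


# Example usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# request = build_alarm_request(
#     "example-high-cpu", "CPU above 80 percent",
#     "arn:aws:sns:us-east-1:123456789012:pagerduty-alerts",
#     "i-0123456789abcdef0", 80,
# )
# create_alarm(boto3.client("cloudwatch"), request)
```

Listing the topic under both AlarmActions and OKActions mirrors steps 11 and 12: without the OK action, the PagerDuty incident would not resolve automatically.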

Commonly Used Metrics

Metrics that are commonly used with the Amazon CloudWatch integration include, but are not limited to:

EC2

To use the CloudWatch integration with EC2 instance metrics, follow the instructions in the Integration Walkthrough and perform the following when you Create a CloudWatch Alarm:

  1. In step 8, select EC2 > Per-Instance Metrics.
  2. Check the checkbox next to the Instance Name with your preferred Metric Name on the right. Commonly used metrics are CPU Utilization and Status Check Failed. Please read AWS’ documentation for more information on EC2 metrics.
  3. Continue with the instructions in steps 9-16.

S3 Storage Lens

To use the CloudWatch integration with S3 Storage Lens metrics, follow the instructions in the Integration Walkthrough and perform the following when you Create a CloudWatch Alarm:

  1. In step 8, select S3 Storage Metrics.
  2. Check the checkbox next to the BucketName with your preferred Metric Name on the right. Commonly used metrics are Incomplete Multipart Upload Storage Bytes, Unencrypted Storage Bytes and Non-Current Version Storage Bytes. Please read AWS’ documentation for more information on S3 Storage Lens metrics.
  3. Continue with the instructions in steps 9-16.

EKS

To use the CloudWatch integration with EKS metrics, follow the instructions in the Integration Walkthrough and perform the following when you Create a CloudWatch Alarm:

  1. In step 8, select EKS Container Insights.
  2. Check the checkbox next to your preferred Metric Name on the right. Commonly used metrics are cluster_failed_node_count and node_cpu_utilization. Please read AWS’ documentation for more information on EKS metrics.
  3. Continue with the instructions in steps 9-16.

FAQ

What alarm statuses affect PagerDuty incidents?

An alarm with status ALARM will trigger an incident, and status OK will resolve it. An alarm with status INSUFFICIENT_DATA can trigger a PagerDuty incident but cannot resolve one. If you need INSUFFICIENT_DATA to resolve an incident, we recommend using an email integration instead.
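
The behavior described in this answer can be summarized as a small lookup. The Python sketch below only restates the text above. The function name and the "trigger" and "resolve" labels are hypothetical, and this is not PagerDuty's implementation.

```python
from typing import Optional


def pagerduty_action(alarm_state: str) -> Optional[str]:
    """Return the incident action this guide describes for a CloudWatch alarm state."""
    actions = {
        "ALARM": "trigger",
        "OK": "resolve",
        # INSUFFICIENT_DATA can trigger an incident but never resolves one.
        "INSUFFICIENT_DATA": "trigger",
    }
    return actions.get(alarm_state)


for state in ("ALARM", "OK", "INSUFFICIENT_DATA"):
    print(state, "->", pagerduty_action(state))
```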

If I use an email integration, how can I verify my PagerDuty service’s email address?

If you send a confirmation email to your service’s PagerDuty address, you will be able to view the message body and verify that address from the PagerDuty console. To do so, find the incident that is created by the email and view its details to verify the email address.

The link to verify will be in the incident details. The SNS confirmation page requires JavaScript, which cannot be executed in the iframe the message is rendered in. To confirm your subscription, open the confirmation link in a new tab or window by right-clicking on the link and choosing Open Link in New Tab/Window.

How can I change how events from CloudWatch are deduplicated into PagerDuty?

Navigate to your PagerDuty service, click the Integrations tab, click the icon to the right of your Amazon CloudWatch integration, click Edit, and change the value of the Correlate events by option.

Why are my CloudWatch events not triggering incidents in PagerDuty?

Events that are not sent properly from CloudWatch will be dropped and will not trigger alerts in PagerDuty. This integration expects the Message property to contain a nested JSON-encoded object, from which it extracts the alert data used to compose the PagerDuty incident. For details on message attributes, see Amazon's SNS documentation.

AWS also provides troubleshooting documentation that outlines what to check on the CloudWatch side.

