Amazon CloudWatch provides monitoring for AWS resources and customer-run applications. It collects monitoring data, surfaces insights, and alerts users so they can fix problems in their applications and organizations. CloudWatch gives system-wide visibility into resource utilization, and alarms can be set to fire when a metric crosses a specified threshold. These alarms can be sent automatically to PagerDuty, which then reliably alerts the correct on-call person through their preferred contact methods.
Follow the instructions below to configure Amazon CloudWatch with PagerDuty. Note that this integration expects the Message property to contain a nested JSON-encoded object; if it does not, no alert will trigger. If you have any questions or need any assistance, please contact our support team at [email protected].
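To make "nested JSON-encoded object" concrete, here is a minimal Python sketch. The sample notification is hypothetical and trimmed for illustration, and `extract_alarm` is an illustrative helper, not part of PagerDuty or AWS. The point is that the SNS notification is a JSON document whose Message field is a string that is itself JSON.

```python
import json

# Hypothetical, trimmed-down SNS notification for illustration only.
# Real notifications carry more envelope fields (signature, timestamps, ...).
sample_notification = {
    "Type": "Notification",
    "Subject": 'ALARM: "example-cpu-alarm" in US East (N. Virginia)',
    # Message is itself a JSON document, serialized into a string.
    "Message": json.dumps({
        "AlarmName": "example-cpu-alarm",
        "NewStateValue": "ALARM",
        "NewStateReason": "Threshold Crossed",
    }),
}


def extract_alarm(notification):
    """Return the decoded alarm dict, or None if Message is not a JSON object."""
    message = notification.get("Message")
    if not isinstance(message, str):
        return None
    try:
        decoded = json.loads(message)
    except json.JSONDecodeError:
        return None
    return decoded if isinstance(decoded, dict) else None


alarm = extract_alarm(sample_notification)
# A plain-text Message (for example, one published by hand) does not qualify.
plain = extract_alarm({"Type": "Notification", "Message": "disk is full"})
```

In this sketch `alarm` is a dictionary with the alarm fields, while `plain` is `None`, which corresponds to the case where no alert would trigger.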
There are two ways that Amazon CloudWatch can be integrated with PagerDuty: via Global Event Rules or through an integration on a PagerDuty Service.
Integrating with Global Event Rules may be beneficial if you want to build different rules based on the payload coming from AWS. If you would like to learn more, please visit our article on Global Event Rules.
- From the Configuration menu, select Event Rules
- On the Event Rules screen, copy your Integration Key.
- Once you have your Integration Key, the Integration URL will be:
You can now proceed to the "In the AWS Management Console" section below.
Integrating with a PagerDuty Service directly can be beneficial if you don’t need to route alerts from AWS to different responders based on the event payload. You can still use service-level event rules to perform actions such as suppressing alerts.
- From the Configuration menu, select Services.
- On your Services page: If you are creating a new service for your integration, click +Add New Service. It is recommended that you create a service specifically for Amazon CloudWatch notifications.
If you are adding your integration to an existing service, click the name of the service you want to add the integration to. Then click the Integrations tab and click the +New Integration button.
- Select Amazon CloudWatch from the Integration Type menu and enter an Integration Name.
If you are creating a new service for your integration, in General Settings, enter a Name for your new service. Then, in Incident Settings, specify the Escalation Policy, Notification Urgency, and Incident Behavior for your new service.
- Click the Add Service or Add Integration button to save your new integration. You will be redirected to the Integrations page for your service.
- Copy the Integration URL for your new integration.
- In the Services search bar, search for and select Simple Notification Service (SNS). On the SNS dashboard, select Topics and click Create Topic. This topic will be used to route alerts from AWS to PagerDuty.
- Enter a Topic name (you may want to name your topic after your PagerDuty service’s name) and Display name, then click Create topic.
- Now that your topic has been created, select Subscriptions in the left-hand menu and click Create Subscription.
- Make sure HTTPS is the selected Protocol. Paste your Integration URL from step 5 (above) into the Endpoint field, ensure that the Enable raw message delivery checkbox is unchecked, and click Create Subscription.
Your subscription should be automatically confirmed. Click the refresh icon to make sure the Subscription ID is not PendingConfirmation.
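The same topic and subscription setup can also be scripted. The following is a minimal sketch, assuming the boto3 SNS client; the function names, topic name, and endpoint are placeholders chosen for illustration. The functions take the client as a parameter so they can be tried against a stub before touching a real account.

```python
def subscribe_pagerduty(sns_client, topic_name, integration_url):
    """Create (or reuse) an SNS topic and subscribe the PagerDuty endpoint.

    sns_client is expected to behave like boto3.client("sns").
    """
    topic_arn = sns_client.create_topic(Name=topic_name)["TopicArn"]
    response = sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol="https",
        Endpoint=integration_url,
        # Mirrors leaving "Enable raw message delivery" unchecked, so the
        # SNS JSON envelope (including Message) reaches the endpoint.
        Attributes={"RawMessageDelivery": "false"},
        ReturnSubscriptionArn=True,
    )
    return topic_arn, response["SubscriptionArn"]


def is_confirmed(sns_client, subscription_arn):
    """Return True once the subscription is no longer pending confirmation."""
    attributes = sns_client.get_subscription_attributes(
        SubscriptionArn=subscription_arn
    )["Attributes"]
    return attributes.get("PendingConfirmation") == "false"
```

With boto3 installed and AWS credentials configured, this could be called as `subscribe_pagerduty(boto3.client("sns"), "pagerduty-cloudwatch", integration_url)`, where `integration_url` is the Integration URL copied from PagerDuty.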
Next, navigate to Services, then search for and select EC2. In your EC2 dashboard, select Instances, click your instance's checkbox, click Actions, select CloudWatch Monitoring, and click Add/Edit Alarms.
- Click Create Alarm.
- Select your notification from the dropdown menu, configure the settings that you would like to use for the alarm, and click Create Alarm.
At this point, PagerDuty will receive an alert whenever the alarm enters the ALARM state, but the PagerDuty incident will not be resolved when the alarm clears. To enable automatic resolution in PagerDuty when an alarm clears, select your instance, click the Actions button, click CloudWatch Monitoring, and select Add/Edit Alarms again.
You will see the alarm that you created earlier. Click view under More Options.
- Select your alarm, click the Actions button, then click Modify.
- On the Modify Alarm screen, verify your alarm threshold and settings. Add a new Action to Send Notification(s) when the alarm state reaches ALARM, by clicking + Notification.
- Add a notification for the OK state and check that your ALARM state notification is correct. Ensure both notifications are being sent to the Topic created earlier in the integration. Make sure to click Save Changes.
- You should then see that your Alarm was saved successfully.
- Congratulations! You have now integrated Amazon CloudWatch with PagerDuty! Now when your alarm threshold is met, an incident will be triggered within PagerDuty.
- Once that alarm is back in an OK state, the incident will automatically resolve within PagerDuty.
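The alarm configured in the console steps above, with notifications for both the ALARM and OK states, can be sketched as a request for CloudWatch's PutMetricAlarm operation. This is an illustration only: the metric, threshold, period, instance ID, and topic ARN below are placeholder assumptions, not values required by the integration.

```python
def build_alarm_request(alarm_name, instance_id, topic_arn, threshold=80.0):
    """Build keyword arguments for CloudWatch's PutMetricAlarm operation."""
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",  # placeholder metric
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,  # seconds per evaluation window
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # Notify the same SNS topic on both transitions so that PagerDuty
        # can trigger on ALARM and resolve on OK.
        "AlarmActions": [topic_arn],
        "OKActions": [topic_arn],
    }


request = build_alarm_request(
    "example-cpu-alarm",
    "i-0123456789abcdef0",  # hypothetical instance ID
    "arn:aws:sns:us-east-1:123456789012:pagerduty-cloudwatch",  # hypothetical ARN
)
```

With boto3 installed and credentials configured, the request could then be submitted with `boto3.client("cloudwatch").put_metric_alarm(**request)`.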
An alarm with status ALARM will trigger incidents, and status OK will resolve them. Alarms with status INSUFFICIENT_DATA can only trigger PagerDuty incidents; they never resolve them. If you need INSUFFICIENT_DATA to resolve an incident, we recommend using an email integration instead.
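The state handling described above can be summarized as a small lookup. This Python sketch only restates the paragraph; it is not PagerDuty's implementation, and the function name and action labels are made up for illustration.

```python
def pagerduty_action(alarm_state):
    """Map a CloudWatch alarm state to the resulting PagerDuty action."""
    actions = {
        "ALARM": "trigger",
        "OK": "resolve",
        # INSUFFICIENT_DATA can open an incident but never closes one.
        "INSUFFICIENT_DATA": "trigger",
    }
    # Unknown states map to None, meaning no action in this sketch.
    return actions.get(alarm_state)
```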
If you send a confirmation email to your service’s PagerDuty address, you will be able to view the message body and verify that address from the PagerDuty console. To do so, find the incident that is created by the email and view its details to verify the email address.
Navigate to your PagerDuty Service -> click the Integrations tab -> click the wheel cog to the right of your Amazon CloudWatch integration -> click Edit -> change the value for the Correlate events by option.
Events that are not sent properly from CloudWatch will be dropped and will not trigger alerts in PagerDuty. This integration expects the Message property to contain a nested JSON-encoded object from which meaningful data about the alert can be extracted to compose the PagerDuty incident. You can find details on Amazon's SNS Message attributes here:
AWS also has some troubleshooting docs on their side which outline some things to look for on the CloudWatch side: