Zabbix Troubleshooting Guide

This guide covers common issues you may run into when integrating Zabbix with PagerDuty, along with steps for troubleshooting each one.

General Configuration Troubleshooting

Review Zabbix Logs

If you’re having an issue receiving Zabbix events in PagerDuty, check your Zabbix logs to see if the PagerDuty action was called, and if there were any associated errors.

For Zabbix 2.x: Navigate to Monitoring > Events, then click the event timestamp for the problem event. Check the Message actions section in the event details.
For Zabbix 3.x: Navigate to Reports > Action Log to view the status of the Zabbix event.

PagerDuty User/Action Configuration in Zabbix

If events are not being sent to the PagerDuty user/action in Zabbix, please check the following:

  • Make sure the PagerDuty group in Zabbix has read permissions for the hosts and/or host groups in question. Also confirm that the PagerDuty user is in the PagerDuty group.
  • Check that the PagerDuty media type is enabled under Configuration > Media types.

Zabbix Events not Received in PagerDuty

Sometimes events look like they’re being sent to the PagerDuty user/action, but aren't showing up in PagerDuty. If this happens, please check the following:

  • Both the zabbix-agent and zabbix-server services must be running for Zabbix to send notifications to PagerDuty. To check that both are running, use the following commands:
$ service zabbix-agent status
$ service zabbix-server status
  • Make sure your Zabbix server can make outbound HTTP/HTTPS connections to events.pagerduty.com on ports 80 and 443. If your environment requires a proxy for outbound HTTP/HTTPS connections, set the proxy in the agent configuration, or see Outbound HTTP/HTTPS Connections with a Proxy below if you use the older Python-based integration.
  • Check that the integration key is set correctly in your PagerDuty user's media settings.
  • If you see an error such as "pagerduty_python: PagerDuty server REJECTED the event in file...Event object is invalid," check the trigger and recovery subjects and message formats in your PagerDuty action.

Under Configuration > Actions, select the PagerDuty notification action to view its details. Remove any extra characters, such as whitespace or newline characters, after the text so that the configuration exactly matches the following:

Default subject: trigger
Recovery subject: resolve
Default message and Recovery message:

name:{TRIGGER.NAME}
id:{TRIGGER.ID}
status:{TRIGGER.STATUS}
hostname:{HOSTNAME}
ip:{IPADDRESS}
value:{TRIGGER.VALUE}
event_id:{EVENT.ID}
severity:{TRIGGER.SEVERITY}
  • If you're using Zabbix 3.x, make sure you have specified these script parameters for the PagerDuty media type under Administration > Media types:
      • {ALERT.SENDTO}
      • {ALERT.SUBJECT}
      • {ALERT.MESSAGE}
Media types, script parameters
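
The connectivity requirement in the list above can be checked from the Zabbix server itself. The following is a minimal sketch, assuming curl is installed; the function name check_pd_endpoint and the 10-second timeout are illustrative choices and not part of the integration. Any HTTP response, even an error status, shows that a connection was made, while a failure from curl points to a DNS, firewall, TLS, or proxy problem.

```shell
# Hypothetical helper: report whether this host can open a connection to a URL.
# Assumes curl is installed; the 10-second timeout is an arbitrary choice.
check_pd_endpoint() {
    url="${1:-https://events.pagerduty.com}"
    # -s: quiet, -o /dev/null: discard the body, --max-time: give up after 10s.
    # curl exits non-zero on DNS, connection, or TLS failures, but exits zero
    # for any HTTP response, including 4xx/5xx statuses.
    if curl -s -o /dev/null --max-time 10 "$url"; then
        echo "reachable: $url"
    else
        echo "unreachable: $url"
        return 1
    fi
}
```

For example, run check_pd_endpoint https://events.pagerduty.com and then check_pd_endpoint http://events.pagerduty.com to cover both ports. If your environment uses a proxy, curl honors the https_proxy and http_proxy environment variables, so export them first.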

PagerDuty Incidents not Resolving after Recovery in Zabbix

Please check the following items to ensure that Zabbix can deliver recovery events to PagerDuty and resolve associated incidents:

  • Ensure that your PagerDuty Notifications action has messaging operations (send to user/group) defined for its Recovery operations.
  • Make sure that the PagerDuty Notifications action’s messaging operations use the same message template for both the main action and the recovery action.
  • Make sure the message template is the one given in the integration guide for your Zabbix integration: Zabbix 4.x-6.x Integration Guide, Zabbix 3.x Integration Guide, Zabbix 1.x Integration Guide.
  • Ensure that the recovery operation’s default message subject is resolve (case sensitive).
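
Because stray whitespace or a differently cased subject is enough to break trigger and recovery handling, it can help to check the text before pasting it into the action. The sketch below is one way to do that; check_pd_subject and check_pd_template are hypothetical helpers written for this guide, not part of Zabbix or the PagerDuty integration, and the pattern assumes the key:value template shown earlier in this guide.

```shell
# Hypothetical helper: accept only the exact, case-sensitive subjects
# "trigger" and "resolve".
check_pd_subject() {
    case "$1" in
        trigger|resolve) return 0 ;;
        *) echo "unexpected subject: '$1'"; return 1 ;;
    esac
}

# Hypothetical helper: read a message template on stdin and print every line
# (with its line number) that is not "key:value" or that ends in whitespace.
# Returns 1 if any such line is found, 0 otherwise.
check_pd_template() {
    # -v keeps the lines that do NOT match; grep exits 0 when it prints any.
    if grep -n -v -E '^[a-z_]+:.*[^[:space:]]$'; then
        return 1
    fi
    return 0
}
```

For example, save the message to a file and run check_pd_template < message.txt (the file name is arbitrary). No output means every line has the expected shape. Run it against both the default message and the recovery message, since they should be identical.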

Agent-Based Integration

Verify the Agent is Installed

Problems with the agent-based integration are commonly caused by the agent installation itself (e.g., trying to install the agent on an incompatible distribution, such as CentOS 5). The first step in troubleshooting agent-based integrations is to make sure that the PagerDuty Agent is both compatible with your distribution and successfully installed.

CentOS 5 users: Please use the Python-based integration, as the PagerDuty Agent requires a newer version of Python than the version available with CentOS 5.

Verify the Agent is Running

Once you've verified the agent is successfully installed, you'll want to make sure that it is running. You can check the status by running service pdagent status in the command line. If the agent isn't running, you can start it with the command service pdagent start.
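
The status check and start command above can be combined into one step. This is a minimal sketch, assuming the same service command used elsewhere in this guide and assuming that its status action exits non-zero when the service is stopped (as LSB-style init scripts do); ensure_running is a hypothetical name.

```shell
# Hypothetical helper: start a service only if its status check fails.
# Assumes `service NAME status` exits non-zero when the service is stopped.
ensure_running() {
    name="$1"
    if service "$name" status >/dev/null 2>&1; then
        echo "$name is running"
    else
        echo "$name is not running; starting it"
        service "$name" start
    fi
}
```

Run ensure_running pdagent with sufficient privileges to start services (typically as root).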

Check the Agent's Logs for Errors

The agent logs activity and errors to /var/log/pdagent/pdagentd.log, which may contain helpful troubleshooting information.
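
To pull likely problems out of a long log, you can filter it for error-like lines. The sketch below is one way to do that; recent_log_errors is a hypothetical helper, and the keyword list is a guess at useful terms rather than a documented log format, so widen or narrow it as needed.

```shell
# Hypothetical helper: show the most recent error-looking lines from a log.
# Arguments: log path (defaults to the agent log) and how many lines to show.
recent_log_errors() {
    log="${1:-/var/log/pdagent/pdagentd.log}"
    count="${2:-20}"
    # Case-insensitive match on a few common failure keywords (an assumption),
    # then keep only the last $count matching lines.
    grep -i -E 'error|exception|traceback|fail' "$log" | tail -n "$count"
}
```

Calling recent_log_errors with no arguments reads the agent log named above. You may need elevated privileges to read it.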

Trigger a Test Incident with the Agent's CLI

Try manually triggering an incident using the pd-send command and check for errors (replace YOUR_INTEGRATION_KEY_HERE with one of your own PagerDuty integration keys):

$ export PD_SERVICE_KEY=YOUR_INTEGRATION_KEY_HERE
$ pd-send -k $PD_SERVICE_KEY -t trigger -d "Server is on fire" -i server.fire

If the pd-send command triggers an incident in PagerDuty, check the tips in the General Configuration Troubleshooting section. You may need to verify the trigger subject and message in your Zabbix configuration.

Trigger a Test Incident with pd-zabbix

Try manually triggering an incident using the pd-zabbix command and check for errors (replace PD_SERVICE_KEY with one of your own PagerDuty integration keys):

$ /usr/share/pdagent-integrations/bin/pd-zabbix PD_SERVICE_KEY trigger "name:Test
id:1
status:onfire
hostname:localhost
ip:127.0.0.1
value:5
event_id:2
severity:1"

Python-Based Integration

Python Version

The integration requires Python 2.7.9 or a later 2.7.x release to make a secure connection to PagerDuty. Older versions of Python use SSLv3, which has a known security vulnerability (POODLE). Python 2.7.9 uses a backported version of Python 3's SSL library, so versions from 2.7.9 up to, but not including, 3.x can make a secure connection to PagerDuty. The script does not work with Python 3.x because of other language changes in that version.
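
The version requirement above can be expressed as a quick check. This is a minimal sketch; py_version_ok is a hypothetical helper that accepts a version string and succeeds only for 2.7.9 or a later 2.7 release.

```shell
# Hypothetical helper: succeed only for Python 2.7.9 or a later 2.7.x release.
py_version_ok() {
    # Split "MAJOR.MINOR.PATCH" on dots; IFS is changed only for this read.
    IFS=. read -r major minor patch <<EOF
$1
EOF
    patch="${patch%%[!0-9]*}"   # drop suffixes such as "rc1" or "+"
    [ "$major" = 2 ] && [ "$minor" = 7 ] && [ "${patch:-0}" -ge 9 ]
}
```

For example: py_version_ok "$(python -c 'import platform; print(platform.python_version())')" && echo supported. The interpreter name python is an assumption; on some systems the Python 2 interpreter is installed as python2 or python2.7.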

Outbound HTTP/HTTPS Connections with a Proxy

If you need to set a proxy, use this modified version of the Python script for proxy support. Replace SOME_PROXY on line 68 with your proxy address (e.g., http://proxy.company.com:3128).

Trigger a Test Incident with the pagerduty.py Script

Try manually triggering an incident with pagerduty.py via the command line and check for errors (replace PD_SERVICE_KEY with your own PagerDuty integration key):

$ /etc/zabbix/alert.d/pagerduty.py PD_SERVICE_KEY trigger "name:Test
id:1
status:onfire
hostname:localhost
ip:127.0.0.1
value:5
event_id:2
severity:1"

Verify the Python Script is in the Correct Location

The script should be placed in your AlertScriptsPath. This is usually /usr/lib/zabbix/alertscripts or /etc/zabbix/alert.d, but could be different if you installed Zabbix from non-standard packages. You can find the correct path for your environment by checking the AlertScriptsPath setting in zabbix_server.conf in your Zabbix server configuration directory.
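
To read the configured value directly, you can extract it from the configuration file. The sketch below is one way to do that; alert_scripts_path is a hypothetical helper, and the default path /etc/zabbix/zabbix_server.conf is an assumption that may not match your installation.

```shell
# Hypothetical helper: print the AlertScriptsPath value from a Zabbix server
# config file. The default path is an assumption; pass your own as $1.
alert_scripts_path() {
    conf="${1:-/etc/zabbix/zabbix_server.conf}"
    # Print the value of uncommented "AlertScriptsPath=..." lines,
    # keeping only the last one if the parameter appears more than once.
    sed -n 's/^[[:space:]]*AlertScriptsPath[[:space:]]*=[[:space:]]*//p' "$conf" | tail -n 1
}
```

An empty result means the parameter is commented out or not set, in which case the server uses its built-in default.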

Verify the Zabbix User has Write Permissions

The script queues events in /tmp/pagerduty. If the Zabbix user cannot write to this directory, the script cannot queue events and no alerts will reach PagerDuty.
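
A quick way to confirm the permission is to test the directory from the account in question. This is a minimal sketch; can_write_dir is a hypothetical helper that checks the directory for whichever user runs it.

```shell
# Hypothetical helper: check that a directory exists and is writable by the
# user running this function.
can_write_dir() {
    dir="${1:-/tmp/pagerduty}"
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        echo "writable: $dir"
    else
        echo "not writable: $dir"
        return 1
    fi
}
```

The result only reflects the current user, so run it from a shell owned by the Zabbix user. An equivalent one-liner, assuming the account is named zabbix and sudo is available, is sudo -u zabbix test -w /tmp/pagerduty && echo writable.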