Nagios XI: Automating Host Management

I was recently looking at how to automate adding and removing managed hosts and services in Nagios XI. This is particularly useful in cloud and other large environments where a configuration management solution handles provisioning. In these environments, configuration files are typically generated from the attributes of a server role during the provisioning and configuration cycle.

Nagios XI contains a number of scripts in the directory /usr/local/nagiosxi/scripts that allow for automated host management, as below:

Script                       Description
reconfigure_nagios.sh        Imports configuration files from the import directory, verifies the configuration and restarts Nagios if verification succeeds. If verification fails, the configuration is rolled back to the last working checkpoint. This is the command invoked from the web interface when selecting ‘Apply Configuration’.
nagiosql_delete_host.php     Removes a host from the configuration database and removes the configuration file.
nagiosql_delete_service.php  Removes services from the configuration database and removes the configuration file.

To automate adding managed hosts and services, I create a single configuration file per host that contains the host definition and each of its service definitions. The service definitions apply only to that host, not to a host list or host group, and the configuration file is named after the hostname. In the example below, a single configuration file defines the host and a managed service for CPU Usage, and is saved as ‘server1.dean.local.cfg’.

define host {
 host_name server1.dean.local
 use xiwizard_windowsserver_host
 address server1.dean.local
 max_check_attempts 5
 check_interval 5
 retry_interval 1
 check_period xi_timeperiod_24x7
 notification_interval 60
 notification_period xi_timeperiod_24x7
 icon_image win_server.png
 statusmap_image win_server.png
 _xiwizard windowsserver
 register 1
 } 

define service {
 host_name server1.dean.local
 service_description CPU Usage
 use xiwizard_windowsserver_nsclient_service
 check_command check_xi_service_nsclient!!CPULOAD!-l 5,80,90
 max_check_attempts 5
 check_interval 5
 retry_interval 1
 check_period xi_timeperiod_24x7
 notification_interval 60
 notification_period xi_timeperiod_24x7
 _xiwizard windowsserver
 register 1
 }
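
As an illustration of generating one such file per host, here is a minimal Python sketch. The template text is copied from the definitions above with the hostname replaced by a placeholder. The names `HOST_TEMPLATE`, `render_host_config` and `write_host_config` are hypothetical helpers for this example and are not part of Nagios XI.

```python
from pathlib import Path
from string import Template

# Template mirroring the host and service definitions shown above.
# $fqdn is the only substituted value; everything else is copied as-is.
HOST_TEMPLATE = Template("""\
define host {
 host_name $fqdn
 use xiwizard_windowsserver_host
 address $fqdn
 max_check_attempts 5
 check_interval 5
 retry_interval 1
 check_period xi_timeperiod_24x7
 notification_interval 60
 notification_period xi_timeperiod_24x7
 icon_image win_server.png
 statusmap_image win_server.png
 _xiwizard windowsserver
 register 1
 }

define service {
 host_name $fqdn
 service_description CPU Usage
 use xiwizard_windowsserver_nsclient_service
 check_command check_xi_service_nsclient!!CPULOAD!-l 5,80,90
 max_check_attempts 5
 check_interval 5
 retry_interval 1
 check_period xi_timeperiod_24x7
 notification_interval 60
 notification_period xi_timeperiod_24x7
 _xiwizard windowsserver
 register 1
 }
""")


def render_host_config(fqdn: str) -> str:
    """Return the configuration text for a single host (hypothetical helper)."""
    return HOST_TEMPLATE.substitute(fqdn=fqdn)


def write_host_config(fqdn: str, directory: Path) -> Path:
    """Write '<fqdn>.cfg' into the given directory and return its path."""
    path = Path(directory) / f"{fqdn}.cfg"
    path.write_text(render_host_config(fqdn))
    return path
```

Calling `write_host_config("server1.dean.local", Path("."))` would produce a file named ‘server1.dean.local.cfg’ in the current directory, so the file name and the host_name in the definitions always stay in step.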

Once the configuration file has been created, we can place it in the import directory at ‘/usr/local/nagios/etc/import’ and invoke reconfigure_nagios.sh from the directory ‘/usr/local/nagiosxi/scripts’ to import the configuration file, verify the configuration and restart Nagios if verification succeeds. If verification fails, Nagios XI restores the configuration files to the last working checkpoint, but the imported configuration remains in the configuration database. To detect failures, check the exit code returned by the script: ‘0’ confirms that the configuration was verified as working and Nagios was restarted, and any other value identifies the step that failed.

Exit Code  Description
0          no problems detected
1          config verification failed
2          nagiosql login failed
3          nagiosql import failed
4          reset_config_perms failed
5          nagiosql_exportall.php failed (write configs failed)
6          /etc/init.d/nagios restart failed
7          db_connect failed
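
A provisioning tool can act on these exit codes directly. The following is a minimal Python sketch of that idea, assuming the script lives at the path given in this post. The names `EXIT_CODES`, `describe_exit_code` and `apply_configuration` are hypothetical and exist only for this example.

```python
import subprocess
from typing import Optional, Sequence, Tuple

# Paths as given in this post.
SCRIPTS_DIR = "/usr/local/nagiosxi/scripts"
RECONFIGURE = SCRIPTS_DIR + "/reconfigure_nagios.sh"

# Exit codes of reconfigure_nagios.sh, copied from the table above.
EXIT_CODES = {
    0: "no problems detected",
    1: "config verification failed",
    2: "nagiosql login failed",
    3: "nagiosql import failed",
    4: "reset_config_perms failed",
    5: "nagiosql_exportall.php failed (write configs failed)",
    6: "/etc/init.d/nagios restart failed",
    7: "db_connect failed",
}


def describe_exit_code(code: int) -> str:
    """Translate an exit code into the description from the table."""
    return EXIT_CODES.get(code, f"unknown exit code {code}")


def apply_configuration(
    command: Sequence[str] = (RECONFIGURE,),
    cwd: Optional[str] = SCRIPTS_DIR,
) -> Tuple[bool, str]:
    """Run the reconfigure script and report (success, description).

    The command and working directory are parameters so the function
    can be exercised without a Nagios XI installation.
    """
    result = subprocess.run(list(command), cwd=cwd)
    return result.returncode == 0, describe_exit_code(result.returncode)
```

A caller would typically treat anything other than `(True, "no problems detected")` as a failed import and log the description before deciding whether to clean up the imported configuration.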

Now that we have added a managed host and its services, how do we remove them from the configuration database and delete the configuration file once the host is terminated? Provided the host has no dependent relationships, we first remove the services by invoking ‘nagiosql_delete_service.php’ from the directory ‘/usr/local/nagiosxi/scripts’, passing the configuration name, which matches the configuration file of the managed host (this is why it is important to name the configuration file after the hostname), as in the example below:

./nagiosql_delete_service.php --config=server1.dean.local

After the services have been successfully deleted we can remove the host by invoking the ‘nagiosql_delete_host.php’ script:

./nagiosql_delete_host.php --host=server1.dean.local

Once the host has been successfully removed, we can apply the new configuration as before by invoking the ‘reconfigure_nagios.sh’ script. The same method can also be used to remove an imported configuration from the configuration database if verification failed during an import.
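
The removal sequence can be sketched in Python as follows. This is an illustration only: `removal_commands` and `remove_host` are hypothetical names, and the sketch assumes that the two delete scripts return a non-zero exit code on failure, which this post does not document.

```python
import subprocess

# Directory containing the Nagios XI scripts, as given in this post.
SCRIPTS_DIR = "/usr/local/nagiosxi/scripts"


def removal_commands(fqdn):
    """Return the three commands, in order, that remove a managed host."""
    return [
        # 1. Delete the services by configuration name (named after the host).
        ["./nagiosql_delete_service.php", f"--config={fqdn}"],
        # 2. Delete the host itself.
        ["./nagiosql_delete_host.php", f"--host={fqdn}"],
        # 3. Apply the new configuration.
        ["./reconfigure_nagios.sh"],
    ]


def remove_host(fqdn, run=subprocess.run):
    """Run each step from SCRIPTS_DIR, stopping at the first failure.

    Returns None on success, or the command that failed.
    ASSUMPTION: a non-zero exit code from any step means it failed.
    The `run` parameter allows a stand-in runner for dry runs.
    """
    for argv in removal_commands(fqdn):
        result = run(argv, cwd=SCRIPTS_DIR)
        if result.returncode != 0:
            return argv
    return None
```

Stopping at the first failure keeps the host definition in place if its services could not be deleted, so the sequence can simply be retried.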

The above describes how to automate adding and removing hosts and services in Nagios XI, and it can be built into your configuration management solution as part of the provisioning and configuration cycle. In my scenario, I created a number of configuration files based on the attributes of server roles, which can be used as cookbook templates in Chef, using the ‘{node[‘fqdn’]}’ pattern to specify the host name in both the definition file and the configuration file name. I have also written PowerShell functions to perform the above, which I will discuss in a later post.

