I was recently looking at how to automate adding and removing managed hosts and services in Nagios XI, which can be particularly useful in cloud computing and large environments where configuration management solutions have been implemented for provisioning. In these environments we typically use configuration files based on the attributes of a server role during the provisioning and configuration cycle.
Nagios XI contains a number of scripts in the directory /usr/local/nagiosxi/scripts that allow for automated host management, as below:
Script | Description |
reconfigure_nagios.sh | Imports configuration files from the import directory, verifies the configuration and restarts Nagios if verification succeeds. If verification fails, the configuration is rolled back to the last working checkpoint. This is the command invoked from the web interface when selecting ‘Apply Configuration’. |
nagiosql_delete_host.php | Removes a host from the configuration database and removes the configuration file. |
nagiosql_delete_service.php | Removes services from the configuration database and removes the configuration file. |
To automate adding managed hosts and services, the method I used was to create a single configuration file for each host and all of its services. The service definitions in that file are applied only to that host, not to a host list or host group, and the file is named according to the hostname. In the example below, I have created a single configuration file which defines the host and a managed service for CPU Usage, and saved it as ‘server1.dean.local.cfg’.
define host {
    host_name               server1.dean.local
    use                     xiwizard_windowsserver_host
    address                 server1.dean.local
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    check_period            xi_timeperiod_24x7
    notification_interval   60
    notification_period     xi_timeperiod_24x7
    icon_image              win_server.png
    statusmap_image         win_server.png
    _xiwizard               windowsserver
    register                1
}

define service {
    host_name               server1.dean.local
    service_description     CPU Usage
    use                     xiwizard_windowsserver_nsclient_service
    check_command           check_xi_service_nsclient!!CPULOAD!-l 5,80,90
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    check_period            xi_timeperiod_24x7
    notification_interval   60
    notification_period     xi_timeperiod_24x7
    _xiwizard               windowsserver
    register                1
}
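As a rough sketch of how such a file could be generated per host, the following Python fills a hostname into a template of the definitions above. The helper names `render_host_config` and `config_filename` are hypothetical and not part of Nagios XI; the directive values are copied from the example, and using the FQDN for both `host_name` and `address` simply mirrors it.

```python
from string import Template

# Template mirroring the host and CPU Usage service definitions above.
# $fqdn is the only placeholder; all other values come from the example.
HOST_TEMPLATE = Template("""\
define host {
    host_name               $fqdn
    use                     xiwizard_windowsserver_host
    address                 $fqdn
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    check_period            xi_timeperiod_24x7
    notification_interval   60
    notification_period     xi_timeperiod_24x7
    icon_image              win_server.png
    statusmap_image         win_server.png
    _xiwizard               windowsserver
    register                1
}

define service {
    host_name               $fqdn
    service_description     CPU Usage
    use                     xiwizard_windowsserver_nsclient_service
    check_command           check_xi_service_nsclient!!CPULOAD!-l 5,80,90
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    check_period            xi_timeperiod_24x7
    notification_interval   60
    notification_period     xi_timeperiod_24x7
    _xiwizard               windowsserver
    register                1
}
""")


def config_filename(fqdn):
    """Name the file after the host so it can be found again at removal time."""
    return f"{fqdn}.cfg"


def render_host_config(fqdn):
    """Return the host and service definitions for a single host."""
    return HOST_TEMPLATE.substitute(fqdn=fqdn)
```

Writing the result of `render_host_config("server1.dean.local")` to a file named by `config_filename("server1.dean.local")` would reproduce the example file.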
Once the configuration file has been created, we can place it in the import directory located at ‘/usr/local/nagios/etc/import’ and invoke the script reconfigure_nagios.sh from the directory ‘/usr/local/nagiosxi/scripts’ to import the configuration file, verify the configuration and restart Nagios if successful. If verification of the configuration fails, Nagios XI will restore the configuration files to the last working checkpoint, but the imported configuration will remain in the configuration database. To detect failures, the script returns the following exit codes. An exit code of ‘0’ confirms that the configuration has been successfully verified as a working configuration and that Nagios has been restarted.
Exit Code | Description |
0 | no problems detected |
1 | config verification failed |
2 | nagiosql login failed |
3 | nagiosql import failed |
4 | reset_config_perms failed |
5 | nagiosql_exportall.php failed (write configs failed) |
6 | /etc/init.d/nagios restart failed |
7 | db_connect failed |
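A provisioning tool could act on these exit codes along the following lines. This is only a minimal sketch: the function names and the injectable `run` parameter are hypothetical, while the paths and exit code descriptions are taken from the text and table above. It assumes the process has permission to write to the import directory and to run the script.

```python
import shutil
import subprocess
from pathlib import Path

IMPORT_DIR = Path("/usr/local/nagios/etc/import")
SCRIPTS_DIR = "/usr/local/nagiosxi/scripts"

# Exit codes of reconfigure_nagios.sh as listed in the table above.
EXIT_CODES = {
    0: "no problems detected",
    1: "config verification failed",
    2: "nagiosql login failed",
    3: "nagiosql import failed",
    4: "reset_config_perms failed",
    5: "nagiosql_exportall.php failed (write configs failed)",
    6: "/etc/init.d/nagios restart failed",
    7: "db_connect failed",
}


def describe_exit_code(code):
    """Translate an exit code into the description from the table."""
    return EXIT_CODES.get(code, f"unknown exit code {code}")


def stage_config(cfg_path, import_dir=IMPORT_DIR):
    """Copy a generated configuration file into the import directory."""
    return shutil.copy(cfg_path, Path(import_dir) / Path(cfg_path).name)


def apply_configuration(run=subprocess.run):
    """Invoke reconfigure_nagios.sh from the scripts directory.

    Returns (exit_code, description). `run` can be replaced for dry runs.
    """
    result = run(["./reconfigure_nagios.sh"], cwd=SCRIPTS_DIR)
    return result.returncode, describe_exit_code(result.returncode)
```

A caller would run `stage_config("server1.dean.local.cfg")` followed by `apply_configuration()` and treat any non-zero code as a failed import.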
Now that we have added a managed host and its services, how do we remove them from the configuration database and delete the configuration file once the host is terminated? Provided the host has no dependent relationships, we first remove the services using the configuration name, which matches the configuration file of the managed host (this is why it is important to name the configuration file according to the hostname). To do so, invoke ‘nagiosql_delete_service.php’ from the directory ‘/usr/local/nagiosxi/scripts’, as in the example below:
./nagiosql_delete_service.php --config=server1.dean.local
After the services have been successfully deleted we can remove the host by invoking the ‘nagiosql_delete_host.php’ script:
./nagiosql_delete_host.php --host=server1.dean.local
Once the host has been successfully removed, we can apply the new configuration as before by invoking the ‘reconfigure_nagios.sh’ script. This method can also be used to remove an imported configuration from the configuration database if verification of the configuration failed during an import.
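The removal sequence can be sketched in the same way. The function names and the injectable `run` parameter are hypothetical, and stopping on a non-zero exit code from the delete scripts is an assumption: only the exit codes of reconfigure_nagios.sh are documented above.

```python
import subprocess

SCRIPTS_DIR = "/usr/local/nagiosxi/scripts"


def removal_commands(fqdn):
    """Commands to remove a host, in the order described above."""
    return [
        # Services first, addressed by the configuration name (the hostname).
        ["./nagiosql_delete_service.php", f"--config={fqdn}"],
        # Then the host itself.
        ["./nagiosql_delete_host.php", f"--host={fqdn}"],
        # Finally apply the new configuration.
        ["./reconfigure_nagios.sh"],
    ]


def remove_host(fqdn, run=subprocess.run):
    """Run each step in order and stop at the first failure.

    Returns None on success, or (command, exit_code) for the failed step.
    Assumes the delete scripts signal failure with a non-zero exit code.
    """
    for cmd in removal_commands(fqdn):
        result = run(cmd, cwd=SCRIPTS_DIR)
        if result.returncode != 0:
            return cmd, result.returncode
    return None
```

Calling `remove_host("server1.dean.local")` would run the two delete commands shown earlier and then reapply the configuration.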
The above describes how to automate adding and removing hosts and services in Nagios XI, and it can be applied to your configuration management solution during the provisioning and configuration cycle. In my scenario, I created a number of configuration files based on the attributes of server roles, which can be used as cookbook templates in Chef, using the ‘{node[‘fqdn’]}’ pattern to specify the host name in both the definition file and the configuration file name. I have also compiled PowerShell functions to perform the above, which I will discuss in a later post.