I am looking into monitoring the status of SQL Server Agent jobs and reporting them with Nagios with the following requirements
- For each step report the ‘run_status’.
- Provide date arithmetic to determine a time span for when the step was last run in days.
- Return a status of Critical if the run_status is reported as ‘0’ and Warning if the run_status is reported as ‘1’ and the ‘run_date’ is greater than the time span.
The powershell script will require two parameters for both the SQL Agent step name and the number of days to compare the ‘run_date’ agaisnt a calculated time span object.
Param ([string] $StepName,[string] $Days)
As I will be querying the SQL Server database engine instance and in this case a majority of those are SQL Server 2008 R2 I will need to import the SQLPS module into my session.
Import-Module SQLPS -DisableNameChecking
Below is my the query I will be invoking to return the information for the last run where the step_name is specified as a parameter.
$Query = Invoke-SQLCmd - ServerInstance "." -Query ("SELECT TOP 1 * from msdb..sysjobhistory h WHERE step_name = " + $StepName + " order by instance_id desc")
Once the information has been returned I will need to use DateTime.ParseExact method in order to convert the ‘run_date’ value to a date string of ‘dd/MM/yyyy’ and store this as a variable.
$LastRun = [datetime]::parseexact(($Query.run_date),"yyyyMMdd",$null).toString('dd/MM/yyyy')
The run_duration value returns the amount of time the step was running in seconds, I want to return this as a string ‘dd:hh:mm:ss’ by using the New-TimeSpan cmdlet and formatting the output.
$Duration = (New-TimeSpan -Seconds $Query.run_duration).ToString()
In order to provide the date arithmetic I will subtract the number of days specified in the parameter from the current date.
$Timespan = (Get-Date) - (New-TimeSpan -Days $Days -Hours 0 -Minutes 0)
In order to configure the service status and the status information conditional logic will be used to determine the return code. As per my requirements, I require the to report the service status to be ‘OK’ if the run_status is equal to ‘1’ and the ‘run_date’ is greater than the TimeSpan variable generated from the date arithmetic.
If ($Query.run_status -eq "1" -and [datetime]::parse($LastRun ) -ge [datetime]::parse($TimeSpan) ) { $returncode = 0 "Completed succesfully on " + $LastRun + " with run duration " + $Duration }
We will report a service status of ‘Warning if the ‘run_status’ is equal to ‘1’ and the ‘run_date’ is less then the TimeSpan variable generated from the date arithmetic.
ElseIf ($Query.run_status -eq "1" -and $LastRun -lt $Timespan ) { $returncode = 1 "Completed succesfully on " + $LastRun + " with run duration " + $Duration }
Finally, if the ‘run_status’ is equal to ‘0’ the service status will be reported as ‘Critical.
ElseIf ($Query.run_status -eq "0") { $returncode = 2 "Completed with failure on " + $LastRun + " with run duration " + $Duration }
Once the status information has been generated the script will exit the powershell session returning the exit code.
exit $returncode
While the script was created to be executed as an external script within Nagios, this can be run standalone from Windows Powershell as below.
./Check-SQLServerAgentJob,ps11 -StepName <step_name> -Days <number of days>
If your are looking to add external scripts to Nagios such as this one see the below link for more information;
https://deangrant.wordpress.com/2013/09/12/creating-and-running-external-scripts-within-nagios-xi/
The full Windows Powershell script can be downloaded from the below link: