Monday, 21 March 2016

Using service dependencies with NAGIOS Core

NAGIOS allows you to define service dependencies, which let you declare the other hosts (or services) that a host (or service) relies on.

A typical example is a system that has an application frontend and a database backend made up of two clustered nodes: you might want to receive an alarm / notification (regarding the system specifically) only if BOTH database servers have gone down, not simply if one of them has gone down.

Another real-world example (which I will demonstrate below) is monitoring a series of externally hosted websites from your local premises when the internet connection drops. You probably do not want a flood of alerts for every website; you want a single alert saying that the internet connection has gone down, with the website alerts suppressed.

We should first define the host / service that the others will depend on, e.g.:

define host{
        use             generic-host
        host_name       Internet Uplink
        alias           InternetUplink
        check_command   check_ping!200.0,70%!400.0,90%
        max_check_attempts 3
        contacts        MyContact
        address         8.8.8.8
        }
     
We should then add a service dependency definition that links the dependent service to it:

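define servicedependency{
        host_name                       Internet Uplink                 ; host of the service being depended on
        service_description             Check Internet Connection       ; service being depended on
        dependent_host_name             MyExternalWebsite               ; host of the dependent service
        dependent_service_description   External Web Site               ; dependent service
        execution_failure_criteria      n                               ; n = none, always run the dependent check
        notification_failure_criteria   w,u,c                           ; suppress alerts on warning, unknown, critical
        }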
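
The dependency above refers to two services by name, so both need to exist in the configuration before NAGIOS will accept it. Only the 'Internet Uplink' host is defined in this post, so the following is a minimal sketch of what the two services might look like. It assumes the stock generic-service template and the check_ping and check_http commands from the sample configuration, and that the MyExternalWebsite host is already defined elsewhere.

```
# Service that the websites depend on (referenced by host_name / service_description).
define service{
        use                     generic-service         ; assumed: template from the sample config
        host_name               Internet Uplink
        service_description     Check Internet Connection
        check_command           check_ping!200.0,70%!400.0,90%
        }

# Dependent service (referenced by dependent_host_name / dependent_service_description).
define service{
        use                     generic-service         ; assumed: template from the sample config
        host_name               MyExternalWebsite       ; assumed: host defined elsewhere
        service_description     External Web Site
        check_command           check_http              ; assumed: command from the sample config
        }
```

It is worth running the configuration check (nagios -v followed by the path to your main configuration file) before reloading, since a dependency that points at a missing service is reported there.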

The two directives execution_failure_criteria and notification_failure_criteria are the important ones here: they define how a failure of the service being depended on affects the checks and notifications of the dependent service.

The 'execution_failure_criteria' directive lists the states of the parent service in which the dependent service should NOT be actively checked. The valid options are ok (o), warning (w), unknown (u), critical (c), pending (p) and none (n). In the above snippet the value is none (n), so the dependent service is always checked, no matter what state the parent is in.
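
As an illustration of a different choice, here is a sketch of the same dependency with active checks of the website skipped while the uplink check is failing. It would replace the definition shown earlier rather than sit alongside it, and whether skipping checks is desirable depends on your setup.

```
define servicedependency{
        host_name                       Internet Uplink
        service_description             Check Internet Connection
        dependent_host_name             MyExternalWebsite
        dependent_service_description   External Web Site
        execution_failure_criteria      u,c     ; do not check the website while the uplink is unknown or critical
        notification_failure_criteria   w,u,c
        }
```

With this variant the website keeps its last known state while the uplink is down, instead of being rechecked and failing.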

The 'notification_failure_criteria' directive defines when notifications for the dependent service should be suppressed, given the current state of the parent service. In this example alerts for the dependent service are suppressed when the parent is in any of the following states: critical (c), unknown (u) or warning (w).
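
Since the scenario is a whole series of websites, one dependency definition can cover several of them, because dependent_host_name accepts a comma-separated list of hosts. A sketch follows; MyOtherWebsite and MyThirdWebsite are hypothetical host names, and each is assumed to have a service called 'External Web Site'.

```
define servicedependency{
        host_name                       Internet Uplink
        service_description             Check Internet Connection
        dependent_host_name             MyExternalWebsite,MyOtherWebsite,MyThirdWebsite  ; hypothetical hosts
        dependent_service_description   External Web Site
        execution_failure_criteria      n
        notification_failure_criteria   w,u,c
        }
```

If the websites are already grouped, the dependent_hostgroup_name directive can be used instead of listing each host.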
