Servicemon

Writing a Servicemon Checker

About checkers

A NAV service checker is a small Python class (inside a module with the same name) which checks a service and reports whether it is up or down.

The checkers are found in the nav.statemon.checker package (subsystem/statemon/nav/statemon/checker/ in the source tree). For each checker, there is a Python module (for example DhcpChecker.py) and a corresponding description file (DhcpChecker.descr).

A minimal example

Here's a minimal ToasterChecker which doesn't actually do anything.

from nav.statemon.abstractChecker import AbstractChecker
from nav.statemon.event import Event

class ToasterChecker(AbstractChecker):
    def __init__(self, service, **kwargs):
        AbstractChecker.__init__(self, "toaster", service, port=0, **kwargs)
    
    def execute(self):
        # This is where the checking takes place. You can do anything you want in the execute() method, as long
        # as you return either Event.UP or Event.DOWN with a descriptive message.
        # Don't worry about blocking; each checker runs in its own thread.

        # Get arguments
        ip, port = self.getAddress()
        timeout = self.getTimeout()  # get timeout in seconds
        args = self.getArgs()
        
        #
        # ...
        # Do your checking here
        # ...
        #
        
        # Placeholders: replace the two names below with your own logic
        version = some_way_to_get_the_version_of_the_toaster()

        if something_went_wrong:
            return Event.DOWN, 'Descriptive error message'

        self.setVersion(version)  # This is optional (and will default to an empty string)

        return Event.UP, 'OK'

These are the most commonly used methods:

  • self.getAddress() ⇒ (ip, port) tuple
  • self.getTimeout() ⇒ timeout (in seconds)
  • self.setVersion(version)
  • self.getArgs() ⇒ dictionary of arguments

See abstractChecker.py for the remaining methods, which are rarely needed.
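As an illustration of what the body of execute() might do with the address and timeout, here is a minimal sketch of a plain TCP connect test. The helper name check_tcp is hypothetical and not part of NAV; it only uses the standard socket module and returns a (bool, message) pair that a checker could translate to Event.UP or Event.DOWN.

```python
import socket


def check_tcp(ip, port, timeout):
    """Try to open a TCP connection and return (is_up, message).

    Hypothetical helper, not part of NAV.
    """
    try:
        # The timeout (in seconds) applies to the connection attempt.
        sock = socket.create_connection((ip, port), timeout=timeout)
    except OSError as error:
        # Refused connections, unreachable hosts and timeouts all end up here.
        return False, "Could not connect to %s:%s: %s" % (ip, port, error)
    sock.close()
    return True, "OK"


# Inside a checker's execute() method this could be used roughly like so:
#
#     ip, port = self.getAddress()
#     is_up, message = check_tcp(ip, port, self.getTimeout())
#     return (Event.UP if is_up else Event.DOWN), message
```

Keeping the network logic in a plain function like this makes it possible to try it out from a Python prompt before wiring it into a checker class.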

Description file

The description file gives a human-readable name for the service and lists the required and optional arguments it takes. For example, DnsChecker.descr looks like this:

description=Domain Name Service
args=request
optargs=port timeout

The parser is a bit picky about the format, so if you don't have any required arguments you can't just write:

args=  <- WRONG!

you actually have to leave the whole line out. For example ToasterChecker.descr would look like this:

description=Toaster
optargs=timeout
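To make the key=value layout of a description file concrete, here is a small sketch that reads one into a dictionary. This is not NAV's own parser (which, as noted above, is pickier); parse_descr is a hypothetical name, and the assumption that args and optargs are whitespace-separated lists is taken from the DnsChecker example.

```python
def parse_descr(text):
    """Parse description-file text into a dict (illustrative sketch only).

    'args' and 'optargs' become lists of names; a missing line gives
    an empty list. Other keys keep their value as a plain string.
    """
    result = {"description": "", "args": [], "optargs": []}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, _, value = line.partition("=")
        key = key.strip()
        if key in ("args", "optargs"):
            result[key] = value.split()
        else:
            result[key] = value.strip()
    return result


dns = parse_descr("description=Domain Name Service\nargs=request\noptargs=port timeout\n")
print(dns["args"], dns["optargs"])
```

A file with no args line simply yields an empty list of required arguments in this sketch.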

Testing the checker

  • linking vs. copying the checker files while developing
  • using checkService.py to test the checker
  • servicemon.log
  • nav start servicemon / nav stop servicemon
  • tail -f /usr/local/nav/var/log/servicemon.log

Notes and pitfalls

You can do anything you want (including blocking activities) since the checker runs in its own thread.

The .py and .descr files must be named in strict CamelCase (DhcpChecker), even when the service name itself is an acronym (DHCP). The checker will appear on the website in lowercase (dhcp).
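The naming convention can be expressed as a pair of small helpers. This is only an illustration of the rule described above; the function names are hypothetical, and how NAV itself maps between the two forms is not shown here.

```python
def checker_class_name(service):
    """Turn a service name such as 'dhcp' or 'DHCP' into 'DhcpChecker'."""
    # capitalize() upper-cases the first letter and lower-cases the rest
    return service.capitalize() + "Checker"


def service_name(class_name):
    """Turn a checker name such as 'DhcpChecker' into 'dhcp'."""
    suffix = "Checker"
    if not class_name.endswith(suffix):
        raise ValueError("not a checker name: %r" % class_name)
    return class_name[:-len(suffix)].lower()


print(checker_class_name("DHCP"))
print(service_name("DhcpChecker"))
```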

Many checkers have a getRequiredArgs() function. Just ignore it. It is never called, and can safely be left out when writing new checkers. (The .descr file provides this information now.)

devel/servicemon.txt · Last modified: 2010/05/20 13:21 by olemb