backendprocesses
Differences
This shows you the differences between two versions of the page.
| Both sides previous revision · Previous revision · Next revision | Previous revision | ||
| backendprocesses [2007/10/08 19:30] – faltin | backendprocesses [2015/09/24 12:48] (current) – warn of obsolesence morten | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== Back-end processes in NAV ====== | ====== Back-end processes in NAV ====== | ||
| - | NAV has a number of back-end processes. This document gives an overview, listing key information | + | <note warning> |
| - | detailed description | + | |
| - | + | ||
| - | + | ||
| - | + | ||
| - | The following figure complements this document (the NAV 3.3 snmptrapd is not included in the figure): | + | |
| - | + | ||
| - | {{architecture1.png? | + | |
| - | + | ||
| - | + | ||
| + | NAV has a number of back-end processes. This page attempts to give an overview of them. | ||
| + | {{: | ||
| ===== nav list / nav status ===== | ===== nav list / nav status ===== | ||
| Line 24: | Line 16: | ||
| * [[backendprocesses# | * [[backendprocesses# | ||
| * [[# | * [[# | ||
| - | * [[#getdevicedata|getDeviceData]] | + | * [[#ipdevpoll|ipdevpoll]] |
| - | * [[# | + | |
| * [[# | * [[# | ||
| * [[# | * [[# | ||
| * [[# | * [[# | ||
| - | * [[# | ||
| * [[# | * [[# | ||
| + | * [[# | ||
| * [[# | * [[# | ||
| * [[# | * [[# | ||
| - | * [[# | ||
| * [[# | * [[# | ||
| + | * [[# | ||
| + | * [[# | ||
| ====== Building the network model ====== | ====== Building the network model ====== | ||
| - | ===== getDeviceData | + | ===== ipdevpoll |
| ==== Key information ==== | ==== Key information ==== | ||
| - | ^ Process name | + | ^ Process name |
| - | ^ Alias | gDD / the snmp data collector | + | |
| ^ Polls network | ^ Polls network | ||
| ^ Brief description | ^ Brief description | ||
| - | ^ Depends upon | Seed data must be filled in the netbox table, | + | ^ Depends upon | Seed data must be filled in the netbox table, |
| - | ^ Updates tables | + | ^ Updates tables |
| - | ^ Run mode | Daemon process. Thread based. | + | ^ Run mode | Daemon process | |
| - | ^ Default scheduling | + | ^ Default scheduling |
| - | ^ Config file | getDeviceData.conf | | + | ^ Config file | ipdevpoll.conf | |
| - | ^ Log files | getDeviceData.log and getDeviceData/ | + | ^ Log files | ipdevpoll.log | |
| - | ^ Programming language | Java | | + | ^ Programming language | Python |
| - | ^ Lines of code | Approx 8200 | | + | ^ Further doc | | |
| - | ^ Further doc | [[http:// | + | |
| Line 63: | Line 52: | ||
| ==== Details ==== | ==== Details ==== | ||
| - | * Initial OID classification | + | * jobs and plugins |
| - | + | * inventory job \\ Polls for inventory information every 6 hours (by default). | |
| - | * Plugin-based architecture | + | * profiler job \\ Runs every 5 minutes, profiling devices if deemed necessary. |
| - | * Device plugins collect data with SNMP. Each device plugin is geared towards a particular type of equipment, supporting a particular subset of OIDs. See further doc for details. | + | * logging job \\ Runs every 30 minutes |
| - | * Data plugins update NAVdb with data fed from the device plugins. A particular data plugin is responsible for a particular table (or set of tables) in the database. See further doc for details. | + | |
| - | + | ||
| - | * Module monitor | + | |
| - | + | ||
| - | + | ||
| - | ===== iptrace ===== | + | |
| - | + | ||
| - | + | ||
| - | + | ||
| - | ==== Key information ==== | + | |
| - | + | ||
| - | ^ Process name | iptrace | | + | |
| - | ^ Alias | IP-to-mac collector / arplogger| | + | |
| - | ^ Polls network | + | |
| - | ^ Brief description | + | |
| - | ^ Depends upon | The routers (GW / GSW) must be in the netbox table. To assign prefixes to arp entries, gDD must have done router data collection. | | + | |
| - | ^ Updates tables | + | |
| - | ^ Run mode | cron | | + | |
| - | ^ Default scheduling | + | |
| - | ^ Config file | pping.conf | | + | |
| - | ^ Log file | pping.log | | + | |
| - | ^ Programming language | Perl| | + | |
| - | ^ Lines of code | Approx 130 lines| | + | |
| - | ^ Further doc | [[http:// | + | |
| - | + | ||
| - | + | ||
| - | ==== Details ==== | + | |
| - | + | ||
| - | * iptrace understands proxy arp and will not store arp entries that are " | + | |
| - | * The command line tool [[commandlinetools# | + | |
| Line 108: | Line 67: | ||
| ^ Polls network | ^ Polls network | ||
| ^ Brief description | ^ Brief description | ||
| - | ^ Depends upon | + | ^ Depends upon |
| ^ Updates tables | ^ Updates tables | ||
| ^ Run mode | cron | | ^ Run mode | cron | | ||
| Line 115: | Line 74: | ||
| ^ Log file | getBoksMacs.log | | ^ Log file | getBoksMacs.log | | ||
| ^ Programming language | Java | | ^ Programming language | Java | | ||
| - | ^ Lines of code | Approx 1400 | | + | ^ Further doc | [[http://nav.uninett.no/ |
| - | ^ Further doc | [[http://metanav.uninett.no/ | + | |
| Line 149: | Line 107: | ||
| One notable improvement is the addition of the interface field in the swport table. It is used for matching the CDP remote interface and makes this matching much more reliable. Also, both the cam and the swp_netbox tables now use netboxid and ifindex to uniquely identify a swport port, instead of the old (netboxid, module, port) triple. This has significantly simplified swport port matching. Reliability has increased as well, especially since the old module field of swport was a shortened version of what is today the interface field. | One notable improvement is the addition of the interface field in the swport table. It is used for matching the CDP remote interface and makes this matching much more reliable. Also, both the cam and the swp_netbox tables now use netboxid and ifindex to uniquely identify a swport port, instead of the old (netboxid, module, port) triple. This has significantly simplified swport port matching. Reliability has increased as well, especially since the old module field of swport was a shortened version of what is today the interface field. | ||
| - | - | ||
| - | |||
| - | ===== networkDiscovery_topology ===== | ||
| - | |||
| - | |||
| - | |||
| + | ===== topology ===== | ||
| ==== Key information ==== | ==== Key information ==== | ||
| - | ^ Process name | + | ^ Process name |
| - | ^ Alias | Physical Topology Builder | | + | ^ Alias | Physical |
| ^ Polls network | ^ Polls network | ||
| - | ^ Brief description | + | ^ Brief description |
| - | ^ Depends upon | mactrace fills data in swp_netbox representing the candidate | + | ^ Depends upon | mactrace fills data in '' |
| ^ Updates tables | ^ Updates tables | ||
| ^ Run mode | cron | | ^ Run mode | cron | | ||
| ^ Default scheduling | ^ Default scheduling | ||
| ^ Config file | None | | ^ Config file | None | | ||
| - | ^ Log file | + | ^ Log file |
| - | ^ Programming language | Java | | + | ^ Programming language | Python |
| - | ^ Lines of code | Approx 1500 (shared with vlan topology builder) | | + | |
| - | ^ Further doc | [[http:// | + | |
| ==== Details ==== | ==== Details ==== | ||
| - | * see the tigaNAV report as referenced for details. | + | === Physical topology === |
| + | The topology discovery system builds NAV's view of the network topology based | ||
| + | on cues from information collected previously via SNMP. | ||
| - | ===== networkDiscovery_vlan ===== | + | The information cues come from routers' |
| + | Discovery caches, interface physical (MAC) addresses, switch forwarding tables | ||
| + | and CDP (Cisco Discovery Protocol). | ||
| + | pre-parsed these cues and created a list of neighbor candidates for each port | ||
| + | in the network. | ||
| + | The physical topology detection algorithm is responsible for reducing the list | ||
| + | of neighbor candidates of each port to just one single device. | ||
| - | ==== Key information ==== | + | In practice |
| - | ^ Process name | networkDiscovery.sh vlan| | + | supporting it, and this makes it easier to correctly determine |
| - | ^ Alias | Vlan Topology Builder | | + | topology |
| - | ^ Polls network | + | trusted more than switch forwarding |
| - | ^ Brief description | + | through switches that don't support CDP, causing CDP data to be inaccurate. |
| - | ^ Depends upon | The physical topology needs to be in place; this process therefore supersedes | + | |
| - | ^ Updates | + | |
| - | ^ Run mode | cron | | + | |
| - | ^ Default scheduling | + | |
| - | ^ Config file | None | | + | |
| - | ^ Log file | networkDiscovery/ | + | |
| - | ^ Programming language | Java | | + | |
| - | ^ Lines of code | See the physical topology builder above | | + | |
| - | ^ Further doc | [[http:// | + | |
| - | ==== Details ==== | + | === VLAN topology |
| - | * see the tigaNAV report as referenced for details | + | After the physical topology model of the network has been built, the logical |
| + | topology of the VLANs still remains. | ||
| + | trunking, which can transport several independent VLANs over a single physical | ||
| + | link, the logical topology can be non-trivial and indeed, in practice it | ||
| + | usually is. | ||
| + | |||
| + | The vlan discovery system uses a simple top-down depth-first graph traversal | ||
| + | algorithm to discover which VLANs are actually running on the different trunks | ||
| + | and in which direction. Direction is here defined relative to the router port, | ||
| + | which is the top of the tree, currently owning the lowest gateway IP or the | ||
| + | virtual IP in the case of HSRP. Re-use of VLAN numbers in physically disjoint | ||
| + | parts of the network is supported. | ||
| + | |||
| + | The VLAN topology detector does not currently support mapping unrouted VLANs. | ||
| ====== Monitoring the network ====== | ====== Monitoring the network ====== | ||
| ===== pping ===== | ===== pping ===== | ||
| + | |||
| Line 216: | Line 180: | ||
| ^ Log file | pping.log | ^ Log file | pping.log | ||
| ^ Programming language | Python | | ^ Programming language | Python | | ||
| - | ^ Lines of code | Approx 4200, shared with servicemon | | + | ^ Further doc | See below, based on and translated from [[http://nav.uninett.no/ |
| - | ^ Further doc | [[http://metanav.uninett.no/ | + | |
| + | |||
| + | |||
| + | |||
| + | |||
| ==== Details ==== | ==== Details ==== | ||
| - | * see the NAVMore report | + | pping is a daemon with its own (configurable) scheduling. pping works in parallel, which makes each ping sweep very |
| + | efficient. By default, a ping sweep runs every 20 seconds. The maximum allowed response time for a host is 5 seconds (by default). A host is declared down on the event queue after four consecutive "no responses" | ||
| + | that it takes between 80 and 99 seconds from the time a host goes down until pping declares it down. | ||
| + | |||
| + | Please note the [[# | ||
| + | have a grace period of one minute (configurable) before a "box down warning" | ||
| + | |||
| + | The configuration file '' | ||
| + | ^parameter ^description ^default | | ||
| + | | user | the user that runs the service | navcron | | ||
| + | | packet size |size of the icmp packet | 64 byte | | ||
| + | | check interval | how often you want to run a ping sweep | 20 seconds | | ||
| + | | timeout |seconds to wait for reply after last ping request is sent | 5 seconds | | ||
| + | | nrping |number of requests without answer before marking the device as unavailable | 4 | | ||
| + | | delay | ms between each ping request | 2 ms | | ||
| + | |||
| + | In addition you can configure debug level, location of log file and location of pid file. | ||
| + | |||
| + | Note: In order to uniquely identify the icmp echo response packets, pping needs to tailor-make the packets with its own signature. This reduces the overall throughput a bit, but pping can still manage 90-100 hosts per second, which should be sufficient for most needs. | ||
| + | |||
| + | |||
| + | |||
| + | === Algorithm - one ping sweep === | ||
| + | |||
| + | < | ||
| + | pping has three threads: | ||
| + | 1. Thread 1 generates and sends out the icmp packets. | ||
| + | 2. Thread 2 receives echo replies, checks the signature and stores the result to RRD. | ||
| + | 3. The main thread does the main scheduling and reports to the event queue. | ||
| + | |||
| + | Thread 1 works this way: | ||
| + | FOR every host DO: | ||
| + | 1. Generate an icmp echo packet with: (destination IP, timestamp, signature) | ||
| + | 2. Send the icmp echo. | ||
| + | 3. Add host to the " | ||
| + | 4. Sleep in the configured '' | ||
| + | turn reduces the receive thread queue and will in effect make the measured response time more accurate. | ||
| + | |||
| + | Thread 2 works this way: | ||
| + | As long as thread 1 is operating and as long as we have hosts in the " | ||
| + | timeout of 5 seconds (configurable): | ||
| + | 1. Check if we have received packets | ||
| + | 2. Get the data (the icmp reply packet) | ||
| + | 3. Verify that the packet is to our pid. | ||
| + | 4. Split the packet in (destination IP, timestamp, signature) | ||
| + | If IP is wrong or signature is wrong, discard. | ||
| + | 5. If we recognize the IP address on the " | ||
| + | | ||
| + | |||
| + | When thread 2 finishes the sweep is over. If hosts are remaining on the " | ||
| + | response time to " | ||
| + | |||
| + | When thread 3 (the main thread) detects that a host has too many no-replies, a down event is posted on the event queue. | ||
| + | </ | ||
| + | |||
| + | Note that the response times are recorded to RRD which gives us response time and packet loss data as an extra bonus. | ||
| ===== servicemon ===== | ===== servicemon ===== | ||
| Line 238: | Line 263: | ||
| ^ Log file | servicemon.log | ^ Log file | servicemon.log | ||
| ^ Programming language | Python | | ^ Programming language | Python | | ||
| - | ^ Lines of code | See pping above, shared code base | | + | ^ Further doc | See the [[servicemon]] page and/ |
| - | ^ Further doc | [[http://metanav.uninett.no/ | + | |
| ==== Details ==== | ==== Details ==== | ||
| Line 261: | Line 285: | ||
| ^ Log file | thresholdMon.log | ^ Log file | thresholdMon.log | ||
| ^ Programming language | Python | | ^ Programming language | Python | | ||
| - | ^ Lines of code | Approx 400 | | ||
| ^ Further doc | See [[ThresholdMonitor]] | | ^ Further doc | See [[ThresholdMonitor]] | | ||
| Line 268: | Line 291: | ||
| * See [[ThresholdMonitor]] | * See [[ThresholdMonitor]] | ||
| - | |||
| - | ===== moduleMon ===== | ||
| - | |||
| - | |||
| - | ==== Key information ==== | ||
| - | ^ Process name | getDeviceData data plugin moduleMon | | ||
| - | ^ Alias | The module monitor | | ||
| - | ^ Polls network | ||
| - | ^ Brief description | ||
| - | ^ Depends upon | The switch or router to be processed by gDD with appropriate data in module and gwport/ | ||
| - | ^ Updates tables | ||
| - | ^ Run mode | daemon, a part of gDD. | | ||
| - | ^ Default scheduling | ||
| - | ^ Config file | see gDD | | ||
| - | ^ Log file | see gDD | ||
| - | ^ Programming language | Java | | ||
| - | ^ Lines of code | Part of gDD, see gDD. | | ||
| - | ^ Further doc | Not much. | | ||
| Line 305: | Line 310: | ||
| ^ Log file | eventEngine.log | ^ Log file | eventEngine.log | ||
| ^ Programming language | Java | | ^ Programming language | Java | | ||
| - | ^ Lines of code | Approx 3000 lines | | + | ^ Further doc | [[http://nav.uninett.no/ |
| - | ^ Further doc | [[http://metanav.uninett.no/ | + | |
| ==== Details ==== | ==== Details ==== | ||
| Line 313: | Line 317: | ||
| ===== maintengine ===== | ===== maintengine ===== | ||
| + | |||
| + | |||
| Line 319: | Line 325: | ||
| ^ Alias | The maintenance engine | | ^ Alias | The maintenance engine | | ||
| ^ Polls network | ^ Polls network | ||
| - | ^ Brief description | + | ^ Brief description |
| - | ^ Depends upon | NAV users must set up maintenance schedule which in turn is stored in the maintenance tables (emotd, maintenance, | + | ^ Depends upon | NAV users must set up maintenance schedule which in turn is stored in the maintenance tables (maint_task, maint_component). | |
| - | ^ Updates tables | + | ^ Updates tables |
| ^ Run mode | cron | | ^ Run mode | cron | | ||
| ^ Default scheduling | ^ Default scheduling | ||
| Line 327: | Line 333: | ||
| ^ Log file | maintengine.log | ^ Log file | maintengine.log | ||
| ^ Programming language | Python | | ^ Programming language | Python | | ||
| - | ^ Lines of code | Approx 300 | | + | ^ Further doc | Old doc: [[http://nav.uninett.no/ |
| - | ^ Further doc | [[http://metanav.uninett.no/ | + | |
| ==== Details ==== | ==== Details ==== | ||
| Line 348: | Line 353: | ||
| ^ Config file | alertengine.cfg | | ^ Config file | alertengine.cfg | | ||
| ^ Log file | alertengine.log and alertengine.err.log | ^ Log file | alertengine.log and alertengine.err.log | ||
| - | ^ Programming language | perl | | + | ^ Programming language | Python |
| - | ^ Lines of code | Approx 1900 | | + | ^ Further doc | [[http://nav.uninett.no/ |
| - | ^ Further doc | [[http://metanav.uninett.no/ | + | |
| ==== Details ==== | ==== Details ==== | ||
| Line 357: | Line 361: | ||
| ===== smsd ===== | ===== smsd ===== | ||
| + | |||
| + | |||
| Line 363: | Line 369: | ||
| ^ Alias | The SMS daemon | | ^ Alias | The SMS daemon | | ||
| ^ Polls network | ^ Polls network | ||
| - | ^ Brief description | + | ^ Brief description |
| - | ^ Depends upon | alertEngine fills the smsq | | + | ^ Depends upon | alertEngine fills the navprofiles.smsq table | |
| ^ Updates tables | ^ Updates tables | ||
| ^ Run mode | Daemon process | | ^ Run mode | Daemon process | | ||
| ^ Default scheduling | ^ Default scheduling | ||
| ^ Config file | smsd.conf | | ^ Config file | smsd.conf | | ||
| - | ^ Log file | smsd.log | + | ^ Log file | smsd.log | |
| - | ^ Programming language | Python | + | ^ Programming language | Python (Perl in 3.1) | |
| - | ^ Lines of code | In NAV 3.2: approx 1200 | | + | ^ Further doc | subsystem/ |
| - | ^ Further doc | - | | + | |
| - | ===== The snmptrapd ===== | + | |
| + | |||
| + | ==== Details | ||
| + | |||
| + | |||
| + | === Usage === | ||
| + | |||
| + | As described when given the '' | ||
| + | |||
| + | Usage: smsd [-h] [-c] [-d sec] [-t phone no.] | ||
| + | |||
| + | -h, --help | ||
| + | -c, --cancel | ||
| + | -d, --delay | ||
| + | -t, --test | ||
| + | |||
| + | Especially note the '' | ||
| + | |||
| + | |||
| + | === Configuration === | ||
| + | |||
| + | The configuration file smsd.conf lets you configure the following: | ||
| + | |||
| + | ^ parameter | ||
| + | | username | ||
| + | | delay | Delay in seconds between queue runs | 30 | | ||
| + | | autocancel | Automatically cancel all messages older than ' | ||
| + | | loglevel | ||
| + | | mailwarnlevel | Filter level for log messages sent by mail. | ERROR | | ||
| + | | mailserver | Mail server to send log messages via. | localhost | | ||
| + | | dispatcherretry | Time, in seconds, before a dispatcher is retried after a failure | 300 | | ||
| + | | dispatcherN | Dispatchers in prioritized order. Cheapest first, safest last. N should be 1,2,3,... | dispatcher1 defaults to GammuDispatcher | | ||
| + | |||
| + | In addition, some dispatchers need extra configuration as described in comments in the config file. | ||
| + | |||
| + | |||
| + | ===== snmptrapd ===== | ||
| Line 392: | Line 434: | ||
| ^ Log file | snmptrapd.log and snmptraps.log | ^ Log file | snmptrapd.log and snmptraps.log | ||
| ^ Programming language | Python | ^ Programming language | Python | ||
| - | ^ Lines of code | ? | | ||
| ^ Further doc | - | | ^ Further doc | - | | ||
| Line 398: | Line 439: | ||
| ===== makecricketConfig ===== | ===== makecricketConfig ===== | ||
| + | |||
| Line 408: | Line 450: | ||
| ^ Polls network | ^ Polls network | ||
| ^ Brief description | ^ Brief description | ||
| - | ^ Depends upon | That gDD has filled the gwport, swport tables (and more...) | | + | ^ Depends upon | That ipdevpoll |
| ^ Updates tables | ^ Updates tables | ||
| ^ Run mode | cron | | ^ Run mode | cron | | ||
| Line 414: | Line 456: | ||
| ^ Config file | None | | ^ Config file | None | | ||
| ^ Log file | cricket-changelog | ^ Log file | cricket-changelog | ||
| - | ^ Programming language | Perl | | + | ^ Programming language | Python |
| - | ^ Lines of code | Approx 1600 | | + | |
| ^ Further doc | [[howtoconfigurecricket|How to configure Cricket addons in NAV v3]] | | ^ Further doc | [[howtoconfigurecricket|How to configure Cricket addons in NAV v3]] | | ||
| + | |||
| + | ==== Details ==== | ||
| ===== The Cricket collector (not NAV) ===== | ===== The Cricket collector (not NAV) ===== | ||
| + | |||
| ==== Key information ==== | ==== Key information ==== | ||
| Line 429: | Line 473: | ||
| ^ Updates tables | ^ Updates tables | ||
| ^ Run mode | cron | | ^ Run mode | cron | | ||
| - | ^ Default scheduling | + | ^ Default scheduling |
| ^ Config files | directory tree under cricket-config/ | ^ Config files | directory tree under cricket-config/ | ||
| ^ Log file | cricket/ | ^ Log file | cricket/ | ||
| ^ Programming language | not relevant | | ^ Programming language | not relevant | | ||
| - | ^ Lines of code | not relevant | | ||
| ^ Further doc | not relevant | | ^ Further doc | not relevant | | ||
| Line 452: | Line 495: | ||
| ^ Log file | ? | ^ Log file | ? | ||
| ^ Programming language | Perl | | ^ Programming language | Perl | | ||
| - | ^ Lines of code | Approx 200 | | ||
| ^ Further doc | - | | ^ Further doc | - | | ||
| Line 478: | Line 520: | ||
| ^ Log file | None | ^ Log file | None | ||
| ^ Programming language | Python | | ^ Programming language | Python | | ||
| - | ^ Lines of code | Approx 350 | | + | ^ Further doc | [[http://nav.uninett.no/ |
| - | ^ Further doc | [[http://metanav.uninett.no/ | + | |
| ==== Details ==== | ==== Details ==== | ||
| Line 500: | Line 541: | ||
| ^ Default scheduling | ^ Default scheduling | ||
| ^ Programming language | | | ^ Programming language | | | ||
| - | ^ Lines of code | | | ||
| ^ Further doc | [[Arnold|Arnold]] | | ^ Further doc | [[Arnold|Arnold]] | | ||
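
The pping down-detection rule described in the pping section (a host is reported down after ''nrping'' consecutive sweeps without a reply, four by default) can be illustrated with a minimal Python sketch. This is not pping's actual code: the ''DownTracker'' class, its method names and the example addresses are hypothetical, and the silent reset on recovery is an assumption, since the page only describes the down event.

```python
from collections import defaultdict


class DownTracker:
    """Hypothetical helper: counts consecutive unanswered sweeps per host."""

    def __init__(self, nrping=4):
        # nrping mirrors the pping configuration parameter (default 4).
        self.nrping = nrping
        self.misses = defaultdict(int)  # host -> consecutive sweeps without reply
        self.down = set()               # hosts already reported as down

    def record_sweep(self, hosts, replied):
        """Process one ping sweep and return the down events it would post.

        hosts   -- every host pinged in this sweep
        replied -- the subset of hosts that answered within the timeout
        """
        events = []
        for host in hosts:
            if host in replied:
                # Assumption: a reply simply clears the counter and the down
                # state; the source does not describe recovery handling.
                self.misses[host] = 0
                self.down.discard(host)
            else:
                self.misses[host] += 1
                if self.misses[host] >= self.nrping and host not in self.down:
                    self.down.add(host)
                    events.append((host, "down"))
        return events


tracker = DownTracker(nrping=4)
hosts = ["10.0.0.1", "10.0.0.2"]
# 10.0.0.2 never answers; run four sweeps and collect the events of each.
history = [tracker.record_sweep(hosts, replied={"10.0.0.1"}) for _ in range(4)]
```

In this sketch the first three sweeps post nothing and the fourth posts a single down event for the silent host. With the default 20 second sweep interval, that corresponds to the roughly 80 seconds of lower bound quoted in the pping section.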
backendprocesses.1191871849.txt.gz · Last modified: by faltin
