NAV

Cake with NAV logo

For those of you who have not already noticed, we released the final NAV 4.0 version less than two weeks ago. Due to a serious bug found on the 3.15 branch, 4.0.1 was also released last week.

We also celebrated with cake!

As before, the most convenient way of getting started with NAV is using Debian GNU/Linux and the packages from our APT repository. If you aren’t already up and running with NAV, see our guide for adding the repository to your Debian server.

We realize that Graphite’s documentation may not be the best, and we are already receiving questions about installing and configuring both the Carbon backend and the web frontend. Since we are a bit Debian-focused, we have published a guide on our wiki for installing and configuring Graphite for NAV use on a Debian server.

As always, happy NAVing!

Why I love Whisper

We have just built a new, modern server room at UNINETT, with robust power distribution and cooling systems, and of course, we want to monitor the server room environment using NAV.

For NAV, we are brushing up its support for collecting sensor readings from UPSes, and we are implementing support for the Comet web probes that have been deployed to take temperature readings in the new server room.

This is when I happened upon NAV’s implementation of the UPS-MIB (RFC 1628), where the precision of a couple of objects is off by a factor of 10. No way our UPS is putting out 50 Amperes of electric current! The fix for the NAV code was quick, but the graph doesn’t look very nice after the change:

Graph of temperature readings from UPSes where the precision is off. Output current graph drops suddenly at the end.

This is where Whisper, the storage format used by Graphite, shines, compared to RRD, in my humble opinion. This was all fixable with some one-line command trickery:

whisper-fetch.py upsOutputCurrent.wsp \
| perl -lane \
    'print $F[0] . ":" . $F[1]/10.0 if $F[1] > 15.0' \
| xargs whisper-update.py upsOutputCurrent.wsp

  1. The whisper-fetch command pulls the data points out of the underlying Whisper file (by default it only goes back 24 hours; pass --from to reach further back).
  2. The perl command selects every data point with a value above 15.0, divides the value by 10.0 and outputs an updated data point in the timestamp:value form that whisper-update expects.
  3. The xargs+whisper-update combination updates the Whisper file with all the modified data points output by the perl command.
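For readers who prefer Python to perl, the filtering step in the middle of the pipeline could be sketched roughly as below. This is only an illustration of the same idea, not what was actually run: the function name `rescale_points` and its parameters are made up for this example, and it assumes the input lines look like whisper-fetch's default output (a timestamp, whitespace, then a value or `None`).

```python
def rescale_points(fetch_lines, threshold=15.0, factor=10.0):
    """Yield 'timestamp:value' strings for every point above the threshold.

    Hypothetical helper: mirrors the perl one-liner by dividing each
    offending value by `factor` and skipping everything else.
    """
    for line in fetch_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        timestamp, raw_value = fields[0], fields[1]
        try:
            value = float(raw_value)
        except ValueError:
            continue  # empty slots are printed as "None"
        if value > threshold:
            yield f"{timestamp}:{value / factor}"


if __name__ == "__main__":
    import sys

    # Reads whisper-fetch output on stdin, prints arguments for whisper-update.
    for point in rescale_points(sys.stdin):
        print(point)
```

Saved under a name of your choosing, the script would take the place of the perl stage between whisper-fetch.py and xargs.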

The result:

Graph of temperature readings from UPSes with better precision. There are no sudden drops in the graph.

Brilliantly simple :-)

Statistics

One of the goals we had when we moved from Cricket to Graphite for statistics was to integrate the graphs more with NAV, instead of having to navigate to another webpage to get the information you wanted.

Graphs are important

For most NAV users, graphs are a very important source of information, and it is important that they are available where needed and configurable to convey the correct information.

The main source of graphs for an IP device is now the IP Device Info tool. For an IP device there are two types of graphs available: System metrics and Port metrics.

System metrics

System metrics display graphs regarding CPU, memory and other system-related metrics.

Screenshot of tabs in NAV’s system metrics. Ping tab is opened, and is showing a ping packet loss graph

Port metrics

Port metrics display graphs regarding the interfaces of the IP device.

Screenshot of port metrics. Displays the graph called ‘Port details’. It shows in- and out-going traffic on some port, measured in bits per second.

Detailed interface view

The detailed interface view displays all related graphs for that interface, in addition to detailed information regarding the interface.

Screenshot of detailed interface view. Displays 4 activity graphs.

Graph controls

Each graph has individual controls for choosing a timeframe. In addition, there are global controls for selecting the timeframe of all graphs on a page.

Graphs on the dashboard

If the graph you are looking at is of special importance, you can easily put it on your dashboard by clicking the “Add to dashboard” button. Once the graph is on the dashboard, you can set a custom title and refresh interval for it.

Screenshot of a graph that was added to the user’s dashboard. It shows a ping packet round trip time graph.

Custom graphs

If you want to dive deeper and get even more out of the integration, you have the ability to create your own graphs using the Graphite interface. These graphs may then be placed on your dashboard and will refresh themselves automatically.

Screenshot of a custom made graph. The graph is called ‘The 3 ip devices with highest max cpu’.
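To give an idea of what such a custom graph boils down to, here is a small sketch that builds a Graphite render URL for a "top 3 by max" graph. It relies on Graphite's `highestMax()` function and the `/render` endpoint with its `target`, `from` and `format` parameters. The server address and the metric path pattern are assumptions made up for this example, so substitute whatever your own Graphite tree looks like.

```python
from urllib.parse import urlencode


def render_url(base_url, target, from_time="-1d", fmt="png"):
    """Build a Graphite render API URL for a single target expression."""
    params = {"target": target, "from": from_time, "format": fmt}
    return f"{base_url.rstrip('/')}/render?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical metric path pattern; check your own Graphite tree.
    target = "highestMax(nav.devices.*.cpu.*.loadavg5min, 3)"
    # Hypothetical Graphite server address.
    print(render_url("http://graphite.example.org", target))
```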

Test it yourself!

If you want to see and try all this out for yourself, we recommend installing the NAV appliance.

New beta release

A new beta of NAV 4.0 has been released. In addition to the new logo, you will also find interface improvements all over.

Try it out! It’s easy using the virtual appliance.

NAV logo

As you can see, we have finally chosen the new NAV logo! It is simple, yet has some subtle nuances that we like.

We hope you like it as well - feel free to drop us a comment and tell us what you think!

The road to 4.0

For those of you who still don’t know, these days we are working very hard to bring you NAV 4.0.

NAV 4.0 mostly tries to solve two big issues. One is mobile and small screen compatibility (a.k.a. the grand UI redesign); the other is the long-standing lack of flexibility in storing and presenting time-series data (a.k.a. statistics).

Mobile and small screen compatibility (A.K.A. Grand UI redesign)

While the initial goal came from a seemingly innocuous request from the NAV reference committee, to make some of the relevant NAV tools usable for field work on small screens, it has become so much more.

Our gut reaction was to “fix” the so-called “relevant” tools piecemeal, but we eventually realized that this was the way we always did things, and that it was really an anti-pattern. Although we have worked towards extensive reuse of backend code in NAV, we have not been so diligent in encouraging reuse on the frontend side of things.

We did initially do some minor alterations to the status and the port admin tools to make them work better on small screens, and these were released in NAV 3.14. These were only a quick band-aid, though.

We came to see the need to redo much of the design of the NAV web interface to achieve a consistent user experience. Many of the NAV tools, while all inheriting from the basic NAV web template, would often “re-invent the wheel” when it came to user interface elements. This only serves to confuse users, and is also a maintenance nightmare.

[Foundation hero logo] We wanted a framework to help us achieve this. Although Bootstrap was considered at first, its license was deemed incompatible with our use of version 2 of the GPL. Eventually we landed on Foundation as our framework. We also thought that while Bootstrap was nice for getting a new project up and running fast, we already had an old and big project, and it was in need of a framework that would help us structure and consolidate an already huge amount of design work.

We were fortunate enough to acquire the help of Christine Sætre from NTNU. She works as an expert in the field of interaction design, and has been an invaluable part of our redesign feedback loop for several months already.

Not only are we cleaning up things on a large scale here, but in the process we are evaluating the usability of all the tools on small screens and touch devices, which was what we set out to enable in the first place.

Lack of flexibility in handling time-series data (A.K.A. Statistics)

NAV already has a long history, dating back to the nineties, having first acquired its current name in 1999.

Being a lot smaller than it is today, it initially provided a way to automatically configure MRTG to collect and graph time-series data from monitored devices. Fairly quickly, it replaced MRTG with Cricket, and has been using that ever since.

The biggest selling point of Cricket was its hierarchical configuration trees, simplifying manual maintenance of configuration for monitoring massive numbers of devices. However, this point was really moot, since NAV would automatically produce and maintain this configuration for you.

Over the years, Cricket would prove massively inflexible whenever we wanted to collect new things while still keeping old data around, or when we simply wanted to organize information differently. Most of these things were possible, of course, but would require huge amounts of manual work on the part of the NAV user/administrator.

Cricket would also scale poorly when increasing the number of things to be monitored. In fact, the entire reason the EDGE device category exists in NAV was NTNU’s wish to exclude statistics from access ports, as they simply could not get Cricket to scale to the massive number of edge switches in their network.

To underscore our growing dissatisfaction with Cricket, maintenance of its codebase seems to have ceased in 2004. How could we replace Cricket with something better?

[RRDTool logo]

We knew we wanted to do the data collection in NAV code. We already had a daemon for collecting SNMP data; why should we outsource parts of this to a third-party tool? Although RRDtool still seemed like the best candidate for storing our time-series data, some of our gripes with Cricket were actually with RRDtool itself.

That’s when we found Graphite.

We established an experimental Graphite installation and started sending metrics from our customers’ NAV deployment servers to it. During testing, other groups within UNINETT caught on and established Graphite installations for new projects they were working on.

Eventually, the flexibility and scalability of Graphite, coupled with its simple interface for sending metrics, won us over. We decided it was our replacement for RRDtool and the Cricket web interface, while data collection would be handled by NAV’s existing SNMP collection engine, ipdevpoll.
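To illustrate just how simple that interface is: Carbon's plaintext protocol is one line of text per data point, consisting of a metric path, a value and a Unix timestamp. The sketch below is not NAV's actual code. The function names and the metric path are invented for illustration, and it assumes a Carbon plaintext listener reachable over TCP on its usual port, 2003.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Return one plaintext-protocol line: path, value, timestamp, newline."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"


def send_metrics(lines, host="localhost", port=2003):
    """Send already formatted lines to a Carbon plaintext listener over TCP."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))


if __name__ == "__main__":
    # Hypothetical metric path, for illustration only.
    line = format_metric("nav.devices.example-ups.upsOutputCurrent", 5.2)
    send_metrics([line])
```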

A new major version

It’s been a full 10 years since the first NAV 3 release. While we have made many evolutionary changes to NAV these past ten years, the latest 3.15 release being quite different from the first 3.0 release, we believe these two changes to be fairly large and disruptive.

Even though most of the data was collected by a third party tool, statistics have been part of the core of NAV. This new version fundamentally changes this core, and you will feel the difference. NAV 4 no longer tracks the individual metrics that are available, but leaves that to Graphite. It sends metrics to Graphite and assumes it will find them there later. It doesn’t care how Graphite stores its metrics; it uses Graphite’s API to discover which metrics are available and to graph them, and it leverages the same API when defining and monitoring metric threshold rules.
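As a rough idea of what "discovering metrics through the API" can look like, the sketch below builds a query against Graphite's `/metrics/find` endpoint and picks the leaf nodes (the actual metrics) out of its JSON response. This is not NAV's implementation. The helper names, the server address and the metric paths are assumptions for the sake of the example, and the response is assumed to be a JSON list of nodes carrying `id` and `leaf` fields.

```python
import json
from urllib.parse import urlencode


def find_url(base_url, query):
    """Build a /metrics/find URL for a metric path pattern such as 'a.b.*'."""
    return f"{base_url.rstrip('/')}/metrics/find?{urlencode({'query': query})}"


def leaf_metrics(response_body):
    """Return the ids of leaf nodes (real metrics) from a find response."""
    nodes = json.loads(response_body)
    return [node["id"] for node in nodes if node.get("leaf")]


if __name__ == "__main__":
    # Hypothetical server address and metric path pattern.
    print(find_url("http://graphite.example.org", "nav.devices.*"))
```

Fetching the URL (for instance with `urllib.request.urlopen`) and passing the body to `leaf_metrics` would then give the list of metric paths that can be graphed.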

While the core functionality is the same, you will be met by an entirely redecorated user interface.

It will feel quite different, but hopefully a lot better, than the NAV you know, and that is what we want to signify by increasing the major version number.

Welcome to NAV 4!

P.S. We will be back with a blog entry on beta testing NAV 4, so stay tuned :-)