RRDBrowse Manual

Tommy van Leeuwen

$Revision: 1.1 $

Implementation Guidelines and Configuration Manual for RRDBrowse.


Table of Contents
1. What is RRDBrowse?
2. Installing RRDBrowse
2.1. Dependencies
2.2. Program Installation
2.3. Setting up the Web Interface
3. Upgrading RRDBrowse
4. Update Process
5. Testing
6. NFO Files
7. PAG Files
8. Performance
9. Cleaning Up
10. Modules
10.1. Global Required Fields
10.2. Global Optional Fields
10.3. Module Requirements
10.3.1. Generic SNMP ifOctets modules: (port, port64, catalyst, catalyst64, errors)
10.3.2. Cisco CAR rate-limit: (carlist)
10.3.3. Cisco Environment: (c72temp, ciscocpu, ciscoppp, memfree)
10.3.4. Linux Environment: (lincpu, linload, linmem, linmem22, linproc)
10.3.5. Linux Environment: (fsstat)
10.3.6. Linux Disk IO: (lindiskblk, lindiskio)
10.3.7. Linux Open Files and Sockets: (linofiles, linsockets)
10.3.8. APC UPS: (apccurrent, apcload, apctemp, apctime)
10.3.9. Apache Counters: (apabpr, apabw, apacpu, apaproc, apaxs)
10.3.10. Generic TCP Response Time counter: (tcpres)
10.3.11. Bind Queries: (bindqrs)
10.3.12. Misc Counters: (oidgauge, oidderive, telnet)
10.3.13. Request Tracker (RT) Statistics: (rtstats)
10.3.14. Windows 2000 Statistics: (w2kcpu, w2kmem)
11. History
12. References
13. Copyright and License

1. What is RRDBrowse?

RRDBrowse is a poller daemon, templater and web interface for RRDTool. It has a threaded daemon which runs periodically from cron. It works with small .nfo files which hold router information and, optionally, connection details, colors, min/max values, bandwidth settings, and so on. RRDBrowse uses a small caching mechanism to store interface names. In its current state it is much like MRTG.


2. Installing RRDBrowse

2.1. Dependencies

This Perl module requires these other modules and libraries:

RRDs (ships with rrdtool)
Time::HiRes
Data::Dumper
SNMP_util
Image::Magick
mod_perl
libwww-perl
Digest::MD5
Sys::Syslog

2.2. Program Installation

To install the Perl extension for RRDBrowse type the following:

perl Makefile.PL

Check the output and fix any missing prerequisites it reports. You can use the CPAN module to install missing modules:

perl -MCPAN -e'install Module::Name'

Make sure all your prerequisites are fixed and type:

make
make install

Copy the rrdbrowse.conf to only one of the following locations:

/etc/rrdbrowse.conf
/usr/local/etc/rrdbrowse.conf
~/.rrdbrowse.conf
./rrdbrowse.conf

Now, you're ready to create the directory layout.

It's preferable to create the files in /usr/local/etc/rrd so you don't have to update the configuration file, although any other path works too. This directory is referred to below as basepath.

You must create the following directories under basepath: rrds, nfos, pages and pngs. The basepath/pngs directory should be writable by both the user the webserver runs as and the user the update daemon runs as. You must specify the group your Apache runs as in rrdbrowse.conf. The directory structure looks like:

/usr/local/etc/rrdbrowse/
/usr/local/etc/rrdbrowse/nfos/
/usr/local/etc/rrdbrowse/pages/
/usr/local/etc/rrdbrowse/rrds/
/usr/local/etc/rrdbrowse/pngs/
/usr/local/etc/rrdbrowse/cache/

The rrds and pngs directories must be writable by the webserver. The nfos and pages directories are only read during a session.
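The layout above can be created by hand, but a small script makes it repeatable. The following is a minimal sketch and not part of RRDBrowse: the function name create_layout is made up for illustration, and the sketch does not set ownership or permissions, which you still have to arrange for the webserver and update daemon users.

```python
import os
import tempfile

# Subdirectory names taken from the directory listing above.
SUBDIRS = ("nfos", "pages", "rrds", "pngs", "cache")


def create_layout(basepath):
    """Create basepath and its subdirectories; return the subdirectory paths."""
    paths = []
    for name in SUBDIRS:
        path = os.path.join(basepath, name)
        os.makedirs(path, exist_ok=True)  # no error if it already exists
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Demonstrate against a throwaway directory, not /usr/local/etc.
    with tempfile.TemporaryDirectory() as tmp:
        for created in create_layout(os.path.join(tmp, "rrdbrowse")):
            print(created)
```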

All daemons and utilities need at least one subdirectory under which the .nfo files are stored. An example nfos directory structure looks like:

nfos/customers/cust1/cpu.nfo
nfos/customers/cust2/load.nfo
nfos/customers/cust2/bandwidth.nfo
nfos/company/router1/bandwidth.nfo
nfos/company/router1/load.nfo
nfos/company/ascends/max1-users.nfo
nfos/company/ascends/max2-users.nfo
etc..

Now let's test if everything works.

Create a test .nfo file or copy and edit one from the examples directory. Next, run the rbtest utility to check that everything is OK. rbtest checks that it can read and parse the nfo file, read the SNMP data and create a graph.

rbtest path/to/test.nfo

For now, you have to specify a path, even if it is just ./ So if you are in the directory that holds the nfo file, execute:

rbtest ./test.nfo

If all looks ok you are ready to put the update daemon in cron every five minutes. Use a cron entry like:

*/5 * * * * root /usr/bin/rbupdate

Check the cron mail you receive, because it can contain errors. Most likely these are SNMP timeouts, which can be ignored. The cron mails also make it easy to track down problems such as an interface that was deleted from your router.


2.3. Setting up the Web Interface

You should set up a title and an image in rrdbrowse.conf. Next, copy rb.cgi to your cgi directory. Test it by requesting rb.cgi from your website. If you want a bit more speed you can enable mod_perl in your cgi directory. This generally gives a speedup of 30-50% in website access time. If you don't want mod_perl, you're finished now.

Create an apache virtualhost entry (with mod_perl enabled) like this:

<VirtualHost rrd.chiparus.net>
PerlFreshRestart On
ScriptAlias /cgi-bin/ /usr/local/etc/rrd/cgi-bin/
 <Directory /usr/local/etc/rrd/cgi-bin/>
  SetHandler perl-script
  PerlHandler Apache::Registry
  Options +ExecCGI
 </Directory>
DocumentRoot /usr/local/etc/rrd/cgi-bin/
DirectoryIndex rb.cgi
</VirtualHost>

3. Upgrading RRDBrowse

If you are upgrading from a version earlier than 1.4, make sure all options in the configuration file are OK. The 'showvars' option is especially important. The rest will have reasonable default values if not specified.

Most important is that since 1.4 there is only one CGI script which does the complete website. The other single files which build the mod_perl website in previous versions are gone. (That includes all headers, footers and the stylesheet. Everything is incorporated in rb.cgi).

The original mod_perl website is not supported anymore.

Also the header/footer stuff in the configuration file is gone. It is replaced by 'title', which sets the page title, and 'image', which is an HTML image tag for the image in the top right corner.


4. Update Process

The update daemon runs from cron every 5 minutes and parses every .nfo file it finds in the nfo directory. It spawns a number of threads, each of which deals with one nfo file at a time. Most of the time is spent waiting for SNMP responses; however, if you have a fast network and a slow server, the rrd updates themselves may become the significant cost. On a normal network and server it should be able to handle 2500 or more nfo/rrd updates every 5 minutes. A reasonably recent, fast PC may handle 10000 or more updates every 5 minutes.


5. Testing

A simple test script is available to test the nfo files you create. To test an nfo file, name it with a .nf extension and run the test utility on it. If it works and no errors or bugs appear, rename the .nf file to .nfo so the update process can find it. The test script can also be used to test your modules.

Make sure you work on and test your files as .nf and rename them to .nfo later, so that cron does not email errors to you every time. .nf files are skipped during the update process.
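To make the interplay between the update process and the .nf convention concrete, here is a minimal sketch of a threaded update round. It is not the rbupdate source: the function names, the default of 8 worker threads and the update_one callback (which stands in for the real SNMP poll and rrd update) are all assumptions made for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def find_nfo_files(nfo_root):
    """Return every *.nfo file below nfo_root, sorted.

    Files with any other suffix, such as work-in-progress .nf files,
    are ignored.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(nfo_root):
        for name in filenames:
            if name.endswith(".nfo"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def update_all(nfo_root, update_one, workers=8):
    """Call update_one(path) for each nfo file from a pool of threads.

    Threads help because most of the time is spent waiting on the
    network. Returns a dict mapping each path to its result.
    """
    paths = find_nfo_files(nfo_root)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(update_one, paths))
    return dict(zip(paths, results))
```

A file named draft.nf is never passed to update_one, which is why renaming to .nfo is the step that puts a file into production.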


6. NFO Files

The NFO files are text files containing one or more of the following lines:

Interface: The exact textual description of the interface.
Target: community@ip.address
Bandwidth: Descriptive only; does nothing apart from being shown in the web interface.
Limit: Limit the graph to this amount of bps.
Description: The description; make it the same as the description on your Cisco!
In: Description for the In (green) line of the default function (port.pm)
Out: Description for the Out (blue) line of the default function (port.pm)
Type: Which module you want to use (default: port.pm)
Options: nowarn (don't log SNMP timeouts)
HRule: "value" prints a red line at value

The only fields always required are: Interface, Target and Type. Type: defines the perl module used for monitoring. See RRD::Info.pm to see a list of allowed (default) keywords. Depending on the module you use more or less options will be available.

Formatting: If the In: line is something like "from customer" and the Out: line reads "to customer", it's wise to add a few extra spaces at the end of the shortest line (in this case Out:) to align the text in the graphs. Play with it a little to get a nice layout, but it is by no means necessary.

The In: and Out: keywords are only supported in port (and catalyst) modules. See the examples directory for minimal required keywords for a module.

Please note that Bandwidth, Connection and Router are only descriptive fields used only in the webinterface. They don't serve any purpose in the rest of the program.
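The keyword format is simple enough to read with a few lines of code. The sketch below is an illustration only, not the parser RRDBrowse uses: it assumes one "Keyword: value" pair per line, case-sensitive keywords and no comment syntax, none of which the manual confirms. It keeps trailing spaces in values, because padding on In: and Out: is used for graph alignment, and it falls back to the port module when Type: is missing. The sample values are invented.

```python
def parse_nfo(text):
    """Parse 'Keyword: value' lines into a dict.

    Only the first colon splits a line, so URL targets survive intact.
    Trailing whitespace in values is preserved on purpose.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            continue  # blank or malformed line
        fields[key.strip()] = value.lstrip()
    # The manual names port as the default module.
    fields.setdefault("Type", "port")
    return fields


# Hypothetical nfo contents; the trailing spaces after "to customer"
# pad the label to the length of "from customer".
SAMPLE = "\n".join([
    "Interface: Serial0/0",
    "Target: public@192.0.2.1",
    "Description: Uplink to customer",
    "In: from customer",
    "Out: to customer  ",
])

if __name__ == "__main__":
    for keyword, value in parse_nfo(SAMPLE).items():
        print(repr(keyword), repr(value))
```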


7. PAG Files

A 'page' is simply a collection of nfo files and a title. A 'page' shows up in RRDBrowse as a list of thumbnails linked to the detail files. You can place your page files in the pages directory or mix them with the nfo files themselves. The layout of a page is simple, and all nfo files listed are relative to the basepath defined in rrdbrowse.conf. An example page that lists all your CPUs:

All CPU's Overview
customers/cust1/cpu.nfo
customers/cust2/cpu.nfo
customers/cust3/server1/cpu.nfo
customers/cust3/server2/cpu.nfo
etc..

Please note, the separate pag directory is gone from new releases. Pag files can now be placed anywhere in the nfo directory. The nfo directory serves as a basepath for the webinterface.
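As an illustration of the page layout, the sketch below reads a page as a title line followed by nfo paths and resolves each path against a base directory. The function name and the handling of blank lines are assumptions; this is not code from rb.cgi.

```python
import posixpath


def parse_page(text, basepath):
    """Return (title, list of nfo paths resolved against basepath).

    The first non-blank line is the title; every following non-blank
    line is taken to be an nfo path relative to basepath.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty page file")
    title = lines[0]
    nfo_paths = [posixpath.join(basepath, entry) for entry in lines[1:]]
    return title, nfo_paths


SAMPLE_PAGE = """All CPU's Overview
customers/cust1/cpu.nfo
customers/cust2/cpu.nfo
"""

if __name__ == "__main__":
    title, paths = parse_page(SAMPLE_PAGE, "/usr/local/etc/rrd/nfos")
    print(title)
    for path in paths:
        print(path)
```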


8. Performance

RRDBrowse, and especially RRDtool itself, perform very well. Thousands of NFO files are no problem. Data that is needed more than once across several updates is cached to disk. If you do suspect a performance problem, it is most likely due to SNMP updates coming over a slow network or from a slow server. To see how fast each nfo file updates, give the -p parameter to rbupdate.

rbupdate -p

This will print the number of seconds for each update. If you plot Apache performance statistics and have 5 nfo files for each apache host, you can see the result of the cache too. The first update will take much more time compared to the next updates from the same host.


9. Cleaning Up

If you change your nfo files, remove interfaces, routers, etc, you can have outdated PNGs and cache data. To clear this data use the rbclean utility.

rbclean -p -c (-f | -d)

To clean up old PNG files use the -p option, to clean up old cache data use the -c option. "Old" files are defined to be older than 24 hours.

To clear all files, not just old files, use the -f (force) option. Warning: the force option can delete all of your cache, which can make the next update round take a long time to complete! To see what rbclean would do in that case, run it with the -d (dry run) parameter so that nothing actually gets deleted.

As part of your daily maintenance you can run "rbclean -pc" to have old and stale data removed once a day. To quickly delete your png cache in case Apache or rrdtool don't refresh, use "rbclean -pf".
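The selection rules behind the -f and -d options can be sketched as follows. This is an illustration of the described behaviour, not the rbclean source: the function name is invented, and judging age by file modification time is an assumption, since the manual only says "older than 24 hours".

```python
import os
import time

DAY = 24 * 60 * 60  # "old" means older than 24 hours


def clean(directory, force=False, dry_run=False, now=None):
    """Select files in directory for removal and return their paths.

    Without force only files older than 24 hours (by mtime, an
    assumption) are selected; with force every file is selected.
    With dry_run nothing is deleted, the selection is only reported.
    """
    now = time.time() if now is None else now
    selected = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        age = now - os.path.getmtime(path)
        if force or age > DAY:
            selected.append(path)
            if not dry_run:
                os.remove(path)
    return selected
```

Calling clean(path, force=True, dry_run=True) corresponds to checking what a forced clean would remove before committing to it.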


10. Modules

Modules are installed by default in your $perllibdir/RRD/Function. A module is referenced by the Type: keyword in the NFO files. The following is a list of the module names together with their descriptions.

apabpr		Apache Bytes Per Request
apabw		Apache Bandwidth
apacpu		Apache CPU Usage
apaproc		Apache Processes
apaxs		Apache Accesses
apccurrent	APC UPS Output Current
apcload		APC UPS Output Load
apctemp		APC UPS Temperature
apctime		APC UPS Time of battery
ascinuse	Ascend ports in use
bindqrs		Bind Number of Queries
c72temp		Cisco c7xxx Temperature
carlist		Cisco CAR rate-limit
catalyst	Cisco Catalyst OS Port
catalyst64	Cisco Catalyst OS 64 Bits Port
ciscocpu	Cisco CPU Usage
memfree		Cisco Free Memory
ciscoppp	Cisco PPP Channels in use
errors		Line Errors
fsstat		Linux Openfiles (needs snmpd setup)
lincpu		Linux CPU Usage
lindiskblk	Linux Disk Blocks per Second
lindiskio	Linux Disk I/O's per Second
linload		Linux Load Average
linmem22	Linux 2.2 Memory Usage
linmem		Linux 2.4 Memory Usage
linofiles	Linux Open Files
linproc		Linux Number of Processes
linsockets	Linux Open Sockets
oidgauge	Generic OID Plotter
oidderive	Generic OID Plotter
port		Generic Port (This is the default if you don't specify Type:)
port64		Generic 64 Bits Port
rtstats		Request Tracker Queue Statistics
tcpres		TCP Probe Response Time
telnet		Generic Telnet/TCP Interface
w2kcpu		Windows 2000 CPU Usage
w2kmem		Windows 2000 Memory Usage

Note that most of the lin* modules are really net-snmp modules, so some of them should also work on FreeBSD, Solaris, HP-UX and any other system that runs net-snmp.

The following is a list of nice OIDs to use with the oidgauge module:

PIX Open Connections:	.1.3.6.1.4.1.9.9.147.1.2.2.2.1.5.40.6
PIX Free Memory:	.1.3.6.1.4.1.9.9.48.1.1.1.6.1
PIX Used Memory:	.1.3.6.1.4.1.9.9.48.1.1.1.5.1

Basically anything can be plotted. See also the rrdtool homepage to see things other people are doing with rrdtool.

.nfo files are placed in the nfos directory in the rrdbrowse installation path (default /usr/local/etc/rrd/nfos). The .nfo files contain one or more variables and values, depending on the extension used. This section explains how the variables in .nfo files are used to gather data for the various types of modules. Modules are the small Perl snippets which gather data and plot it using rrdtool.


10.1. Global Required Fields

These fields are required in all .nfo files:


 Type: <name>
 Target: <variable>
 Description: <string>

The Type: field contains the name of the module used to gather data. Examples are 'lincpu' to plot the CPU usage of a Linux system and 'port' to plot the default SNMP ifInOctets and ifOutOctets of an interface. Target: is the authenticator used for the data gathering. If an SNMP-based object is used, the Target field must be community@address, just as in MRTG and other applications that use the same snmp-perl modules as RRDBrowse.
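A Target of the form something@address can be split in one step. The sketch below is an assumption about how such a value could be handled, not the logic of the snmp-perl modules; in particular, splitting at the last '@' (so that a community string may itself contain an '@') is a choice made here, not documented behaviour. The same shape applies to the port@host targets that some modules use, while URL targets have no '@' and are rejected by this helper.

```python
def split_target(target):
    """Split 'left@right' at the last '@' and return (left, right).

    left is the SNMP community (or a port for port@host targets),
    right is the address. Raises ValueError for anything else.
    """
    left, sep, right = target.rpartition("@")
    if not sep or not left or not right:
        raise ValueError("target is not of the form left@right: %r" % target)
    return left, right


if __name__ == "__main__":
    print(split_target("public@192.0.2.1"))
    print(split_target("80@www.example.com"))
```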


10.2. Global Optional Fields


 Bandwidth: <string>
 Connection: <string>
 Router: <string>


 Options: nowarn
 HRule: <value>

The Bandwidth, Connection and Router keywords are only displayed in the web interface. They are not used for gathering the data itself or for creating rrd databases. If you specify them you will see them in the file detail output of the CGI script.

The Options: field can have only one value at the moment: 'nowarn'. If you specify the nowarn option, SNMP timeouts and other occasions when data can't be collected for some unknown reason are not logged.

The HRule: field displays a fixed red line in the graph generated by rrdtool. The value of this field depends on the type of data you are graphing.


10.3. Module Requirements

10.3.1. Generic SNMP ifOctets modules: (port, port64, catalyst, catalyst64, errors)


        Target: community@address
        In: Description of the Incoming traffic
        Out: Description of the Outgoing traffic
        Interface: Exact textual description of the interface 

This module plots the in and out bytes of an interface on a router or other networked object with snmp access. The interface description is cached and refreshed every couple of hours.


10.3.2. Cisco CAR rate-limit: (carlist)


	Target: community@address
	Interface: Exact textual description of the interface 
	ListNr: acl number on interface

The ListNr is the number of the access list on the interface. No, this is not the 'access-list XXX' number, but a plain counter which starts from 1. If you update the access-list the counter will rise, most of the time. This is still experimental software, so SNMP and I will have to find a way to stabilize the ListNr. If it doesn't work, raise or lower the number.


10.3.3. Cisco Environment: (c72temp, ciscocpu, ciscoppp, memfree)


        Target: community@address

The c72temp module reads 4 temperatures from your Cisco. It's tested on 7200 and 7500 series routers. Depending on your configuration (and by default) it plots one Inlet and three Outlets. If you have a different setup you may have to edit the descriptions in the Perl module c72temp.pm.

The ciscocpu module plots the usage of the first 4 CPUs it finds. Most of you will have at least one main CPU and one or more on the extension cards (VIPs, for example).

The ciscoppp module plots the number of PPP connections in use. It's tested on at least the 3600 series. It probably doesn't work in all cases.

The memfree module plots the amount of free memory in your Cisco.


10.3.4. Linux Environment: (lincpu, linload, linmem, linmem22, linproc)


        Target: community@address

For this to work you should have a running net-snmp setup. Besides Linux some of these modules will also work on FreeBSD or Solaris.

The lincpu and linload modules plot the CPU usage and load average of your Linux box respectively. lincpu plots the percentage of Nice, System and User CPU time. linload plots the 1, 5 and 15 minute load averages.

linmem and linmem22 plot the memory usage of your machine: Real, Swap, Buffer and Cache memory. Use linmem22 for Linux 2.2 kernels and linmem for 2.4 kernels.

The linproc module plots the total number of processes.


10.3.5. Linux Environment: (fsstat)


        Target: community@address

Used to plot the number of open files. This needs the 'contrib' script fsstats.sh to be copied to the remote server and hooked into net-snmp there. By default it relies on this setup in your snmpd.conf:


        exec .1.3.6.1.4.1.2021.51 fsstats /usr/local/fsstats.sh

This setup is, however, not quite how I wanted it to be. You should not rely on this module being available in future releases of RRDBrowse.


10.3.6. Linux Disk IO: (lindiskblk, lindiskio)

The Linux disk I/O modules can plot the number of blocks per second or the number of sectors per second for each physical disk. You have to create a port in inetd that runs the procstat.sh script found in the contrib directory of the rrdbrowse distribution and returns its output. Example inetd.conf entry:


    procstat stream tcp nowait root /usr/sbin/tcpd /usr/sbin/procstat.sh

The first entry, 'procstat', is the service name; its port number is specified in /etc/services. It is probably safer to set up TCP wrappers as well. The entry name in your hosts.allow file must be 'procstat.sh'.

To plot the first physical disk, use nfo entries like:


    Target: port@host
    Disk: 8,0


10.3.7. Linux Open Files and Sockets: (linofiles, linsockets)

These modules also make use of the procstat binary and caching mechanism. See the Linux Disk IO description on how to install procstat to your inetd. The Target: of the openfiles and sockets nfo file also must have the same portnumber:


    Target: port@host


10.3.8. APC UPS: (apccurrent, apcload, apctemp, apctime)


        Target: community@address

These modules work with an APC DP9606 management card. It's unclear whether they work with other types.


    apccurrent  Amount of current (ampere) in use at the output of the UPS.
    apctime     Time left on the battery in case of a power outage.
    apctemp     UPS environment temperature in Celsius.
    apcload     Percentage of load on the output.


10.3.9. Apache Counters: (apabpr, apabw, apacpu, apaproc, apaxs)


        Target: http://your.server.name/server-status?auto

The Apache modules rely on the server-status module being active. Test that you can retrieve the above URL with your browser, and don't forget to add ?auto to make Apache print the status in machine-readable format.

The apache modules basically print the same as the result provided from the above url. The module descriptions are:


        apabpr  Average bytes per request
        apabw   Bandwidth used on apache processes
        apacpu  Apache CPU usage (unknown values, useless)
        apaproc Apache number of Idle and Active processes
        apaxs   The average number of accesses per second
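The ?auto output is a list of "Key: value" lines, which is what makes it easy to turn into counters. The following sketch parses such a response into a dictionary. It is an illustration, not the code of the apa* modules, and the sample text is invented: the field names shown are the ones Apache 2.x mod_status is believed to emit, and they differ between Apache versions, so check the output of your own server.

```python
def parse_status(text):
    """Parse server-status?auto text into a dict.

    Numeric values become floats; anything else (for example the
    Scoreboard string) is kept as text.
    """
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        try:
            stats[key.strip()] = float(value)
        except ValueError:
            stats[key.strip()] = value
    return stats


# Invented sample; the field names are an assumption, see the note above.
SAMPLE_STATUS = """Total Accesses: 1200
Total kBytes: 4800
CPULoad: .05
Uptime: 600
ReqPerSec: 2
BytesPerSec: 8192
BytesPerReq: 4096
BusyWorkers: 3
IdleWorkers: 7
Scoreboard: _W_K____..
"""

if __name__ == "__main__":
    status = parse_status(SAMPLE_STATUS)
    print(status["BytesPerReq"], status["BusyWorkers"], status["IdleWorkers"])
```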


10.3.10. Generic TCP Response Time counter: (tcpres)


        Target: port@hostaddress
        Send: <string> 

This module connects to a host and optionally sends the string specified in Send: to the target host. It measures the time it takes for the first line to be returned from the target host. If no Send: line is specified it just waits for the first thing returned. \n in the Send string is converted to a real line break. Here are a few examples:


        Send: HEAD / HTTP/1.0\n         # http response times
        Send: HELO my.host.name         # smtp helo response time
        Send: MODE READER               # nntp response time
        Send: QUIT 
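The measurement described above can be sketched with plain sockets. This is not the tcpres module: the function names and the 5 second timeout are invented, the timer here includes connection setup (the manual does not say whether the module's does), and converting \n to a bare newline character rather than CR LF is likewise an assumption.

```python
import socket
import time


def expand_send(send):
    """Turn the two characters backslash and n into a real newline."""
    return send.replace("\\n", "\n")


def response_time(host, port, send=None, timeout=5.0):
    """Return (seconds, first_line) for one probe of host:port.

    The clock starts before connecting and stops when the first line
    has been read. If send is given it is transmitted after connecting.
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        if send:
            sock.sendall(expand_send(send).encode("ascii"))
        with sock.makefile("rb") as reader:
            first_line = reader.readline()
    elapsed = time.monotonic() - start
    return elapsed, first_line.decode("ascii", "replace").rstrip("\r\n")
```

For a target of 25@mail.example.com with no Send: line, the equivalent call would be response_time("mail.example.com", 25), which waits for the greeting line of the server; the host name here is a placeholder.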


10.3.11. Bind Queries: (bindqrs)


        Target: port@hostaddress

This module plots the number of A, PTR, MX and NS queries your nameserver handles per second. Since a complete 'ndc stats' dump is cached, you could clone this module to plot other values.

A remote inetd/telnet setup is needed for this module to work. This is documented in detail in the 'contrib' script nsstats.sh.


10.3.12. Misc Counters: (oidgauge, oidderive, telnet)

oidgauge can plot any counter returned from SNMP as a gauge. You can use this module to plot counters like connections, open files, memory usage, whatever. oidgauge and oidderive are basically the same, but use different rrd datasource types. Thanks to Ingimar Robertsson for the oidderive module.


        Target: community@address
        OID: The object ID you want to plot (eg .1.3.5.7.9.1.2.3)

telnet is a simple telnet script which connects to a port and uses the first number returned. You can install simple inetd scripts which return whatever you need to have calculated on the server.


        Target: port@hostaddress
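"Uses the first number returned" can be illustrated with a short helper that pulls the first number out of whatever text the remote script prints. This is a sketch under assumptions, not the telnet module itself: the manual does not say whether decimals or signs are accepted, so the pattern below is a guess that allows both.

```python
import re

# Optional sign, digits, optional decimal part (an assumed format).
_NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def first_number(text):
    """Return the first number found in text as a float."""
    match = _NUMBER.search(text)
    if match is None:
        raise ValueError("no number in response: %r" % text)
    return float(match.group(0))


if __name__ == "__main__":
    print(first_number("users online: 42 of 100\n"))
```

An inetd script on the server therefore only has to print one number, with or without surrounding text, for the value to be usable.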


10.3.13. Request Tracker (RT) Statistics: (rtstats)

RT is the beautiful trouble ticketing system from Jesse Vincent. RT is available at http://www.bestpractical.com. RRDBrowse can be used to plot the number of tickets on a per queue and per user basis.

To make it work you need to open a port from inetd and run the following command:


    /rt2/bin/rt --summary %id6%status8%owner10%queue16%created25%updated25 
                --limit-status=new --limit-status=open

This command is then retrieved and cached so you can plot multiple stats. Required fields to make it work are:


    Target: port@host
    Queue: queuename
    Status: open (or new, or whatever else you specified with --limit-status)
    Users: Nobody tommy sjakie (space separated list)

At this moment a maximum of 12 users can be plotted per queue; if you need more, you need to tweak rtstats.pm.


10.3.14. Windows 2000 Statistics: (w2kcpu, w2kmem)

The Windows 2000 modules were kindly submitted by Okumura Yoshifumi. For them to work you need to install SNMP for the public community, which can be found at: http://www.wtcs.org/snmp4tpc/ The Target: parameter should have the community and hostname, and Type: can be either w2kcpu or w2kmem to plot CPU or memory respectively.


    Target: community@hostname


11. History

First there was MRTG, which does a nice job. But as the company grew and grew we needed more power. Not only more power to handle more items, but also more configuration power, both for new clients and for the objects which could be graphed.

MRTG was first extended to print and plot all kinds of information we needed. It quickly evolved into a PHP interface. The NFO files idea was born, and by using a few shell scripts those NFO files were parsed and piped to MRTG.

Some time later I discovered RRDtool and started to experiment a little. A few Perl scripts were developed to handle SNMP requests and put the results in the RRD databases. Those Perl scripts also read from the NFO files. Basically, this is where RRDBrowse was born.

After a year or so I decided to rewrite and structure everything and release it as what is now known as RRDBrowse. Nowadays it runs a couple of hundred statistics in a few seconds at the company I work for. Authentication was wrapped around and it is now part of the customer portal too.


12. References

RRDBrowse wouldn't be possible without RRDtool, of course.

Download RRDtool from: http://www.rrdtool.org/
Download net-snmp at: http://www.net-snmp.org/
Download Windows SNMP for the public community: http://www.wtcs.org/snmp4tpc/
RRDBrowse homepage is located at: http://www.rrdbrowse.org/

RRDBrowse was made possible by Support Net B.V., who have an excellent lineup of equipment on which most of this software was initially tested. Visit Support Net at: http://www.support.net/


13. Copyright and License

Copyright (c) 2003, 2004, Tommy van Leeuwen

All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.