Poller Packages

To define more complex monitoring configuration it is possible to group service configurations into polling packages. They allow you to assign different service configurations to nodes.

To assign a polling package to nodes, use the the filter syntax.

Each polling package can have its own Downtime Model configuration.

You can configure multiple packages, and an interface can exist in more than one package. This gives great flexibility to how the service levels will be determined for a given device.

Polling package assigned to Nodes with Rules and Filters
<package name="example1">(1)
  <filter>IPADDR != '0.0.0.0'</filter>(2)
  <include-range begin="1.1.1.1" end="254.254.254.254" />(3)
  <include-range begin="::1" end="ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff" />(3)
1 Unique name of the polling package.
2 Base filter on IP address, categories, or asset attributes of nodes based on filter rules. The filter is evaluated first and is required. This package is used for all IP interfaces that do not have 0.0.0.0 as an assigned IP address and is required.
3 Allow to specify if the configuration of services is applied on a range of IP interfaces_ (IPv4 or IPv6).

Instead of the include-range it is possible to add one or more specific IP interfaces:

Defining a specific IP Interfaces
<specific>192.168.1.59</specific>

It is also possible to exclude IP interfaces:

Exclude IP Interfaces
<exclude-range begin="192.168.0.100" end="192.168.0.104"/>

Response Time Configuration

The definition of polling packages lets you configure similar services with different polling intervals. All the response time measurements are persisted in RRD files and require a definition. Each polling package contains an RRD definition.

RRD configuration for Polling Package example1
<package name="example1">
  <filter>IPADDR != '0.0.0.0'</filter>
  <include-range begin="1.1.1.1" end="254.254.254.254" />
  <include-range begin="::1" end="ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff" />
  <rrd step="300">(1)
    <rra>RRA:AVERAGE:0.5:1:2016</rra>(2)
    <rra>RRA:AVERAGE:0.5:12:1488</rra>(3)
    <rra>RRA:AVERAGE:0.5:288:366</rra>(4)
    <rra>RRA:MAX:0.5:288:366</rra>(5)
    <rra>RRA:MIN:0.5:288:366</rra>(6)
</rrd>
1 Polling interval for all services in this polling package is reflected in the step of size 300 seconds. All services in this package have to be polled in a 5-minute interval, otherwise response time measurements are not persisted correctly.
2 1 step size is persisted 2016 times: 1 * 5 min * 2016 = 7 d, 5 min accuracy for 7 d.
3 12 steps average persisted 1488 times: 12 * 5 min * 1488 = 62 d, aggregated to 60 min for 62 d.
4 288 steps average persisted 366 times: 288 * 5 min * 366 = 366 d, aggregated to 24 h for 366 d.
5 288 steps maximum from 24 h persisted for 366 d.
6 288 steps minimum from 24 h persisted for 366 d.
The RRD configuration and the service polling interval must be aligned. In other cases, the persisted response time data is not correctly displayed in the response time graph.
If you change the polling interval afterwards, you must recreate existing RRD files with the new definitions.

Service Status Persistence

You can configure each service monitor to store the current status of the service in RRD files. This allows you to query the current status and the status history.

This feature is disabled by default. You must enable it individually for each service. To do so, add a service parameter to the appropriate services in poller-configuration.xml:

<service name="EXAMPLE" ...>
  <parameter key="rrd-status" value="true"/>
</service>

Overlapping Services

With the possibility of specifying multiple polling packages it is possible to use the same service (like ICMP) multiple times. The order how polling packages in the poller-configuration.xml are defined is important when IP interfaces match multiple polling packages with the same service configuration.

The following example shows which configuration is applied for a specific service:

Overwriting
<package name="less-specific">
  <filter>IPADDR != '0.0.0.0'</filter>
  <include-range begin="1.1.1.1" end="254.254.254.254" />
  <include-range begin="::1" end="ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff" />
  <rrd step="300">(1)
    <rra>RRA:AVERAGE:0.5:1:2016</rra>
    <rra>RRA:AVERAGE:0.5:12:1488</rra>
    <rra>RRA:AVERAGE:0.5:288:366</rra>
    <rra>RRA:MAX:0.5:288:366</rra>
    <rra>RRA:MIN:0.5:288:366</rra>
  </rrd>
  <service name="ICMP" interval="300000" user-defined="false" status="on">(2)
    <parameter key="retry" value="5" />(3)
    <parameter key="timeout" value="10000" />(4)
    <parameter key="rrd-repository" value="/var/lib/opennms/rrd/response" />
    <parameter key="rrd-base-name" value="icmp" />
    <parameter key="ds-name" value="icmp" />
  </service>
  <downtime interval="30000" begin="0" end="300000" />
  <downtime interval="300000" begin="300000" end="43200000" />
  <downtime interval="600000" begin="43200000" end="432000000" />
</package>

<package name="more-specific">
  <filter>IPADDR != '0.0.0.0'</filter>
  <include-range begin="192.168.1.1" end="192.168.1.254" />
  <include-range begin="2600::1" end="2600:::ffff" />
  <rrd step="30">(1)
    <rra>RRA:AVERAGE:0.5:1:20160</rra>
    <rra>RRA:AVERAGE:0.5:12:14880</rra>
    <rra>RRA:AVERAGE:0.5:288:3660</rra>
    <rra>RRA:MAX:0.5:288:3660</rra>
    <rra>RRA:MIN:0.5:288:3660</rra>
  </rrd>
  <service name="ICMP" interval="30000" user-defined="false" status="on">(2)
    <parameter key="retry" value="2" />(3)
    <parameter key="timeout" value="3000" />(4)
    <parameter key="rrd-repository" value="/var/lib/opennms/rrd/response" />
    <parameter key="rrd-base-name" value="icmp" />
    <parameter key="ds-name" value="icmp" />
  </service>
  <downtime interval="10000" begin="0" end="300000" />
  <downtime interval="300000" begin="300000" end="43200000" />
  <downtime interval="600000" begin="43200000" end="432000000" />
</package>
1 Polling interval in the packages are 300 seconds and 30 seconds
2 Different polling interval for the ICMP service
3 Different retry settings for the ICMP service
4 Different timeout settings for the ICMP service

The last polling package on the service will be applied. This can be used to define a less specific catch-all filter for a default configuration. Use a more specific polling package to overwrite the default setting. In the above example, all IP interfaces in 192.168.1/24 or 2600:/64 will be monitored with ICMP with different polling, retry, and timeout settings.

The WebUI displays which polling packages are applied to the IP interface and service. The IP Interface and Service pages show which polling package and service configuration is applied for this specific service.

03 polling package
Figure 1. Polling Package applied to IP interface and Service

Service Patterns

Usually, the poller used to monitor a service is found by the matching the poller’s name with the service name. There is an option for you to match poller if an additional element pattern is specified. If so, the poller is used for all services matching the RegEx pattern.

The RegEx pattern lets you specify named capture groups. There can be multiple capture groups inside of a pattern, but each must have a unique name. Please note, that the RegEx must be escaped or wrapped in a CDATA-Tag inside the configuration XML to make it a valid property.

If a poller is matched using its pattern, the parts of the service name which match the capture groups of the pattern are available as parameters to the Metadata DSL using the context pattern and the capture group name as key.

Examples:

<pattern><![CDATA[^HTTP-(?<vhost>.*)$]]></pattern>

Matches all services with names starting with HTTP- followed by a host name. If the services is called HTTP-www.example.com, the Metadata DSL expression ${pattern:vhost} will resolve to www.example.com.

<pattern><![CDATA[^HTTP-(?<vhost>.*?):(?<port>[0-9]+)$]]></pattern>"

Matches all services with names starting with HTTP- followed by a hostname and a port. There will be two variables (${pattern:vhost} and ${pattern:port}), which you can use in the poller parameters.

Use the service pattern mechanism whenever there are multiple instances of a service on the same interface. By specifying a distinct service name for each instance, the services is identifiable, but there is no need to add a poller definition per service. Common use cases for such services are HTTP virtual hosts, where multiple web applications run on the same web server or BGP session monitoring where each router has multiple neighbors.

Test Services on Manually

For troubleshooting it is possible to run a test via the Karaf shell:

ssh -p 8101 admin@localhost

Once in the shell, you can print show the commands help as follows:

opennms> opennms:poll --help
DESCRIPTION
        opennms:poll

	Used to invoke a monitor against a host at a specified location

SYNTAX
        opennms:poll [options] host [attributes]

ARGUMENTS
        host
                Hostname or IP address of the system to poll
                (required)
        attributes
                Monitor specific attributes in key=value form

OPTIONS
        --help
                Display this help message
        -l, --location
                Location
                (defaults to Default)
        -s, --system-id
                System ID
        -t, --ttl
                Time to live
        -P, --package
                Poller Package
        -S, --service
                Service name
        -n, --node-id
                Node Id for Service
        -c, --class
                Monitor Class

The following example runs the ICMP monitor on a specific IP interface.

Run ICMP monitor configuration defined in specific Polling Package
opennms> opennms:poll -S ICMP -P example1 10.23.42.1

The output is verbose, which lets you debug monitor configurations. Important output lines are shown as the following:

Important output testing a service on the CLI
Package: example1 (1)
Service: ICMP (2)
Monitor: org.opennms.netmgt.poller.monitors.IcmpMonitor (3)
Parameter ds-name: icmp (4)
Parameter retry: 2 (5)
Parameter rrd-base-name: icmp (4)
Parameter rrd-repository: /opt/opennms/share/rrd/response (4)
Parameter timeout: 3000 (5)

Service is Up on 192.168.31.100 using org.opennms.netmgt.poller.monitors.IcmpMonitor: (6)
	response-time: 407,0000 (7)
1 Service and package of this test
2 Applied service configuration from polling package for this test
3 Service monitor used for this test
4 RRD configuration for response time measurement
5 Retry and timeout settings for this test
6 Polling result for the service polled against the IP address
7 Response time

Test filters on Karaf Shell

Filters are ubiquitous in opennms configurations with <filter> syntax. Use this Karaf shell to verify filters. For more information, see Filters.

ssh -p 8101 admin@localhost

Once in the shell, print command help as follows:

opennms> opennms:filter --help
DESCRIPTION
        opennms:filter
	Enumerates nodes/interfaces that match a give filter
SYNTAX
        opennms:filter filterRule
ARGUMENTS
        filterRule
                A filter Rule

For ex: Run a filter rule that match a location

opennms:filter  "location='MINION'"

Output is displayed as follows

nodeId=2 nodeLabel=00000000-0000-0000-0000-000000ddba11 location=MINION
	IpAddresses:
		127.0.0.1

Another example: Run a filter that matches a node location and for a given IP address range.

opennms:filter "location='Default' & (IPADDR IPLIKE 172.*.*.*)"

Output is displayed as follows:

nodeId=3 nodeLabel=label1 location=Default
	IpAddresses:
		172.10.154.1
		172.20.12.12
		172.20.2.14
		172.01.134.1
		172.20.11.15
		172.40.12.18

nodeId=5 nodeLabel=label2 location=Default
	IpAddresses:
		172.17.0.111

nodeId=6 nodeLabel=label3 location=Default
	IpAddresses:
		172.20.12.22
		172.17.0.123
Node information displayed will have nodeId, nodeLabel, location, and optional fields like foreignId, foreignSource, and categories when they exist.