SnmpMonitor
The SNMP monitor provides a common platform to monitor states and results from SNMP agents. This monitor has two basic operation modes:
- Test the response value of one specific OID (scalar object identifier)
- Test multiple values in a whole table
To choose the operation mode, set the walk and match-all parameters.
The Operating mode selection and Monitor specific parameters for the SnmpMonitor tables below provide more information about these operation modes.
| walk | match-all | Operating mode |
|---|---|---|
| true | true | tabular, all values must match |
| true | false | tabular, any value must match |
| | count | specifies that the value of at least minimum and at most maximum objects encountered in |
| false | true | scalar |
| false | false | scalar |
| | count | tabular, between minimum and maximum values must match |
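As a rough illustration of the tabular mode in which any value may match (walk set to true, match-all set to false), the sketch below marks the service as up when at least one entry of the cpqDaPhyDrvStatus table, which is used in a later example in this section, reports ok(2). The service name HP-Insight-Drive-Physical-Any is hypothetical, and the interval is copied from the other examples rather than recommended.

```xml
<!-- Sketch only: the service name is hypothetical -->
<service name="HP-Insight-Drive-Physical-Any" interval="300000" user-defined="false" status="on">
  <parameter key="oid" value=".1.3.6.1.4.1.232.3.2.5.1.1.6"/>
  <!-- Walk the whole table instead of reading a single scalar -->
  <parameter key="walk" value="true"/>
  <parameter key="operator" value="="/>
  <parameter key="operand" value="2"/>
  <!-- A single matching entry is enough for the service to be up -->
  <parameter key="match-all" value="false"/>
</service>
<monitor service="HP-Insight-Drive-Physical-Any" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor" />
```

Switching match-all to true turns this into the stricter check in which every entry must match.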
Monitor facts
Class Name | org.opennms.netmgt.poller.monitors.SnmpMonitor |
When the monitor is configured to persist the response time, it records the total time spent until it obtains a successful response, including retries. It stores the entire poll time, not just the time spent on the last successful attempt.
Configuration and use
Parameter | Description | Default |
---|---|---|
hex | Specifies that the value monitored should be compared against its hexadecimal representation. Useful when the monitored value is a string containing non-printable characters. | false |
match-all | Can be set to: | true |
maximum | Valid only when match-all is set to | 0 |
minimum | Valid only when match-all is set to | 0 |
oid | The object identifier of the MIB object to monitor. If no other parameters are present, the monitor asserts that the agent’s response for this object must include a valid value (as opposed to an error, no-such-name, or end-of-view condition) that is non-null. | .1.3.6.1.2.1.1.2.0 (SNMPv2-MIB::SysObjectID) |
operand | The value to compare against the observed value of the monitored object. Note: Comparison will always succeed if either the operand or operator parameter isn’t set and the monitored value is non-null. | n/a |
operator | The operator to use for comparing the monitored object against the operand parameter. Must be one of the following symbolic operators: Note: Comparison will always succeed if either the operand or operator parameter isn’t set and the monitored value is non-null. Keep in mind that you need to escape all < and > characters as XML entities ( | n/a |
port | Destination port where the SNMP requests are sent. | from |
reason-template | A user-provided template used for the monitor’s reason code if the service is unavailable. Defaults to a reasonable value if unset. See below for an explanation of the possible template parameters. | depends on operation mode |
retries | Deprecated. Same as retry. Parameter retry takes precedence if both are set. | from |
walk | | false |
This monitor implements the Common Configuration Parameters.
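A minimal sketch of the simplest configuration: with only oid set (here the documented default, written out for clarity), the monitor just checks that the agent returns a valid, non-null value. The service name, the port 1161, and the retry and timeout values are assumptions chosen for illustration. retry and timeout are taken to be among the common configuration parameters, and timeout is assumed to be in milliseconds.

```xml
<!-- Sketch only: service name, port, retry, and timeout values are illustrative assumptions -->
<service name="SNMP-Alt-Port" interval="300000" user-defined="false" status="on">
  <!-- Documented default OID (SNMPv2-MIB::SysObjectID), written out explicitly -->
  <parameter key="oid" value=".1.3.6.1.2.1.1.2.0"/>
  <!-- Send SNMP requests to a non-default destination port -->
  <parameter key="port" value="1161"/>
  <!-- Use retry rather than the deprecated retries parameter -->
  <parameter key="retry" value="2"/>
  <!-- Assumed to be milliseconds -->
  <parameter key="timeout" value="3000"/>
</service>
<monitor service="SNMP-Alt-Port" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor" />
```

Because neither operator nor operand is set, any non-null response counts as success.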
Variable | Description |
---|---|
${hex} | Value of the |
${ipaddr} | IP address polled. |
${matchall} | Value of the |
${matchcount} | When |
${maximum} | Value of the |
${minimum} | Value of the |
${observedvalue} | Polled value that made the monitor succeed or fail. |
${oid} | Value of the |
${operand} | Value of the |
${operator} | Value of the |
${port} | Value of the |
${retry} | Value of the |
${timeout} | Value of the |
${walk} | Value of the |
Example: monitoring a scalar object
This example monitors the thermal system fan status, which is provided as a scalar object ID:
cpqHeThermalSystemFanStatus .1.3.6.1.4.1.232.6.2.6.4.0
The manufacturer MIB gives the following information:
SYNTAX INTEGER {
other (1),
ok (2),
degraded (3),
failed (4)
}
ACCESS read-only
DESCRIPTION
"The status of the fan(s) in the system.
This value will be one of the following:
other(1)
Fan status detection is not supported by this system or driver.
ok(2)
All fans are operating properly.
degraded(3)
A non-required fan is not operating properly.
failed(4)
A required fan is not operating properly.
If the cpqHeThermalDegradedAction is set to shutdown(3) the
system will be shutdown if the failed(4) condition occurs."
The SnmpMonitor is configured to test whether the fan status returns ok(2).
If so, the service is marked as up.
Any other value indicates a problem with the thermal fan status and marks the service as down.
Note that you must include the monitor section for each service in your definition.
poller-configuration.xml
<service name="HP-Insight-Fan-System" interval="300000" user-defined="false" status="on">
  <parameter key="oid" value=".1.3.6.1.4.1.232.6.2.6.4.0"/> <!-- (1) -->
  <parameter key="operator" value="="/> <!-- (2) -->
  <parameter key="operand" value="2"/> <!-- (3) -->
  <parameter key="reason-template" value="System fan status is not ok. The state should be ok(${operand}) the observed value is ${observedvalue}. Please check your HP Insight Manager. Syntax: other(1), ok(2), degraded(3), failed(4)"/> <!-- (4) -->
</service>
<monitor service="HP-Insight-Fan-System" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor" />
1 | Scalar object ID to test. |
2 | Operator for testing the response value. |
3 | Integer 2 as operand for the test. |
4 | Encode MIB status in the reason code to give more detailed information if the service goes down. |
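A variation on this example, sketched under the assumption that the < operator compares the polled integer numerically: the service stays up for other(1) and ok(2) and goes down only for degraded(3) or failed(4). Because the operator is written inside an XML attribute, < must be escaped as &lt; (and > as &gt;). The service name is hypothetical.

```xml
<!-- Sketch only: the service name is hypothetical -->
<service name="HP-Insight-Fan-System-Not-Degraded" interval="300000" user-defined="false" status="on">
  <parameter key="oid" value=".1.3.6.1.4.1.232.6.2.6.4.0"/>
  <!-- "&lt;" is the XML-escaped form of the less-than operator -->
  <parameter key="operator" value="&lt;"/>
  <!-- Up while the observed value is below degraded(3) -->
  <parameter key="operand" value="3"/>
</service>
<monitor service="HP-Insight-Fan-System-Not-Degraded" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor" />
```

Whether other(1) should count as healthy depends on the hardware. On systems that do not support fan status detection, this looser check avoids a permanently down service.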
Example: test SNMP table with all matching values
This example shows how to monitor the values of a whole SNMP table. As a practical use case, it monitors the status of a set of physical drives, using the CPQIDA-MIB.
The physical drive status is given by the following tabular OID:
cpqDaPhyDrvStatus .1.3.6.1.4.1.232.3.2.5.1.1.6
SYNTAX INTEGER {
other (1),
ok (2),
failed (3),
predictiveFailure (4)
}
ACCESS read-only
DESCRIPTION
Physical Drive Status.
This shows the status of the physical drive.
The following values are valid for the physical drive status:
other (1)
Indicates that the instrument agent does not recognize
the drive. You may need to upgrade your instrument agent
and/or driver software.
ok (2)
Indicates the drive is functioning properly.
failed (3)
Indicates that the drive is no longer operating and
should be replaced.
predictiveFailure(4)
Indicates that the drive has a predictive failure error and
should be replaced.
The configuration in our monitor tests all physical drives for status ok(2).
Note that you must include the monitor section for each service in your definition.
poller-configuration.xml
<service name="HP-Insight-Drive-Physical" interval="300000" user-defined="false" status="on">
  <parameter key="oid" value=".1.3.6.1.4.1.232.3.2.5.1.1.6"/> <!-- (1) -->
  <parameter key="walk" value="true"/> <!-- (2) -->
  <parameter key="operator" value="="/> <!-- (3) -->
  <parameter key="operand" value="2"/> <!-- (4) -->
  <parameter key="match-all" value="true"/> <!-- (5) -->
  <parameter key="reason-template" value="One or more physical drives are not ok. The state should be ok(${operand}) the observed value is ${observedvalue}. Please check your HP Insight Manager. Syntax: other(1), ok(2), failed(3), predictiveFailure(4), erasing(5), eraseDone(6), eraseQueued(7)"/> <!-- (6) -->
</service>
<monitor service="HP-Insight-Drive-Physical" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor" />
1 | OID for SNMP table with all physical drive states. |
2 | Enable walk mode to test every entry in the table against the test criteria. |
3 | Test operator for integer. |
4 | Integer 2 as operand for the test. |
5 | Test in walk mode has to pass for every entry in the table. |
6 | Encode MIB status in the reason code to give more detailed information if the service goes down. |
Example: test SNMP table with some matching values
This example shows how to use the SnmpMonitor to test whether the number of static routes is within given bounds. The service is marked as up if at least 3 and at most 10 static routes are set on a network device. This status can be monitored by polling the table ipRouteProto from the RFC1213-MIB2.
ipRouteProto 1.3.6.1.2.1.4.21.1.9
The MIB description provides the following information:
SYNTAX INTEGER {
other(1),
local(2),
netmgmt(3),
icmp(4),
egp(5),
ggp(6),
hello(7),
rip(8),
is-is(9),
es-is(10),
ciscoIgrp(11),
bbnSpfIgp(12),
ospf(13),
bgp(14)}
ACCESS read-only
DESCRIPTION
"The routing mechanism via which this route was learned.
Inclusion of values for gateway routing protocols is not
intended to imply that hosts should support those protocols."
To monitor only local routes, apply the test only to entries in the ipRouteProto table with value 2.
The monitor counts the matching entries across the whole ipRouteProto table and checks that count against the configured boundaries.
Note that you must include the monitor section for each service in your definition.
<service name="All-Static-Routes" interval="300000" user-defined="false" status="on">
  <parameter key="oid" value=".1.3.6.1.2.1.4.21.1.9" /> <!-- (1) -->
  <parameter key="walk" value="true" /> <!-- (2) -->
  <parameter key="operator" value="=" /> <!-- (3) -->
  <parameter key="operand" value="2" /> <!-- (4) -->
  <parameter key="match-all" value="count" /> <!-- (5) -->
  <parameter key="minimum" value="3" /> <!-- (6) -->
  <parameter key="maximum" value="10" /> <!-- (7) -->
</service>
<monitor service="All-Static-Routes" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor" />
1 | OID for SNMP table ipRouteProto. |
2 | Enable walk mode to test every entry in the table against the test criteria. |
3 | Test operator for integer. |
4 | Integer 2 as operand for testing local route entries. |
5 | Set match-all to count so that the monitor counts the table entries that satisfy the operator and operand. |
6 | Lower count boundary set to 3. |
7 | Upper count boundary set to 10. |
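The count example can also carry a reason-template. The sketch below repeats the service with one added parameter. It assumes that ${matchcount} expands to the number of matching entries when match-all is set to count, and the wording of the message is made up for illustration.

```xml
<!-- Sketch only: assumes ${matchcount} holds the number of matching entries in count mode -->
<service name="All-Static-Routes" interval="300000" user-defined="false" status="on">
  <parameter key="oid" value=".1.3.6.1.2.1.4.21.1.9" />
  <parameter key="walk" value="true" />
  <parameter key="operator" value="=" />
  <parameter key="operand" value="2" />
  <parameter key="match-all" value="count" />
  <parameter key="minimum" value="3" />
  <parameter key="maximum" value="10" />
  <!-- Illustrative message built from the documented template variables -->
  <parameter key="reason-template" value="Found ${matchcount} local routes on ${ipaddr}, expected between ${minimum} and ${maximum}." />
</service>
<monitor service="All-Static-Routes" class-name="org.opennms.netmgt.poller.monitors.SnmpMonitor" />
```

Including the observed count and both boundaries in the reason makes an outage notification self-explanatory without a look at the poller configuration.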