Other Errors
This section describes other errors that you may encounter when Horizon does not start as expected.
XML configuration errors
XML errors occur when a file in the ${OPENNMS_HOME}/etc/ hierarchy is not a well-formed XML document, or when its content is invalid according to the file’s schema.
In either case, startup fails with messages similar to the following:
Mar 20 14:41:53 rhel9h.totusmel.com opennms[130123]: Starting OpenNMS:
Mar 20 14:41:53 rhel9h.totusmel.com opennms[130823]: ERROR: XML validation failed: /opt/opennms/etc/events/VMWare.events.xml
Mar 20 14:41:53 rhel9h.totusmel.com opennms[130823]: run '/usr/bin/xmllint /opt/opennms/etc/events/VMWare.events.xml' for details
Mar 20 14:41:53 rhel9h.totusmel.com opennms[130123]: Validation failed on 1 XML files. Exiting.
Mar 20 14:41:53 rhel9h.totusmel.com opennms[130123]: failed
If you run config-tester -a, the output shows that an error is generated immediately after the Drools Northbounder configuration is loaded:
11:05:42.570 [Main] INFO org.opennms.core.xml.AbstractJaxbConfigDao - Loaded Config for Drools Northbounder in 6 ms
java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.opennms.bootstrap.Bootstrap$4.run(Bootstrap.java:531)
at java.base/java.lang.Thread.run(Thread.java:829)
You can use the xmllint tool to validate the XML files and identify the source of the error more precisely:
# xmllint --schema /var/opennms/xsds/eventconf.xsd /opt/opennms/etc/events/VMWare.events.xml --noout
/opt/opennms/etc/events/VMWare.events.xml:275: parser error : Opening and ending tag mismatch: logmsg line 273 and event
</event>
^
/opt/opennms/etc/events/VMWare.events.xml:276: parser error : Opening and ending tag mismatch: logmsg line 265 and events
</events>
^
/opt/opennms/etc/events/VMWare.events.xml:277: parser error : Premature end of data in tag event line 223
^
In this example, an unmatched tag in /opt/opennms/etc/events/VMWare.events.xml is the source of the issue:
265 <logmsg dest="logndisplay"><p>
266 vpxdTrap trap received
267 vpxdTrapType=%parm[#1]%
268 vpxdHostName=%parm[#2]%
269 vpxdVMName=%parm[#3]%
270 vpxdNewStatus=%parm[#4]%
271 vpxdOldStatus=%parm[#5]%
272 vpxdObjValue=%parm[#6]%</p>
273 <logmsg>
To fix this error, change line 273 to the closing tag </logmsg>.
If you run xmllint again, the file should validate properly:
# xmllint --schema /var/opennms/xsds/eventconf.xsd /opt/opennms/etc/events/VMWare.events.xml --noout
/opt/opennms/etc/events/VMWare.events.xml validates
Horizon should now start properly.
You can run xmllint on all files in ${OPENNMS_HOME}/etc/events using this command:
for x in "${OPENNMS_HOME}"/etc/events/*; do xmllint --schema "${OPENNMS_HOME}/share/xsds/eventconf.xsd" "$x" --noout; done
xmllint can help pinpoint errors in XML syntax and in content that violates the schema (using the --schema flag).
More subtle errors may cause problems, however, even if the output indicates that all files are valid.
The config-tester command is even more exhaustive, but it may not catch all errors that break startup.
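The loop above covers only the event definitions. As a minimal sketch, assuming xmllint is on the PATH, the following helper checks every .xml file under a directory for well-formedness only (no schema validation) and prints the files that fail. The function name check_xml_wellformed is hypothetical and is not part of Horizon.

```shell
# Hypothetical helper, not part of Horizon: check every .xml file under the
# directory in $1 for well-formedness (no schema validation).
# Prints one failing path per line and returns 1 if any file fails.
# Paths that contain newlines are not handled.
check_xml_wellformed() {
    bad=$(find "$1" -type f -name '*.xml' | sort | while IFS= read -r f; do
        # --noout suppresses the parsed document; only the exit status matters here.
        xmllint --noout "$f" >/dev/null 2>&1 || printf '%s\n' "$f"
    done)
    if [ -n "$bad" ]; then
        printf '%s\n' "$bad"
        return 1
    fi
    return 0
}

# Example use:
#   check_xml_wellformed "${OPENNMS_HOME}/etc" || echo "fix the files listed above"
```

For each file the helper reports, run xmllint on it directly to see the parser messages, as in the example earlier in this section.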
Incoming messages not received
If your Horizon or Minion server is not processing traps, syslog, flow, or other incoming messages that you are able to see via tcpdump, check your firewall settings.
The tcpdump utility shows traffic before system firewalls such as iptables or firewalld block unexpected data, so a packet can appear in a capture even though the firewall drops it before it reaches the listener.
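As a sketch of that check on a host that uses firewalld: firewall-cmd --list-ports prints the ports explicitly opened in the default zone as space-separated port/protocol entries. The helper name port_in_list is hypothetical, and 10162/udp is only an example value; substitute the port and protocol your listener actually uses. The check does not cover ports opened through services, port ranges, or rich rules.

```shell
# Hypothetical helper: succeed if the port/protocol pair in $1 appears in the
# space-separated list in $2 (the format printed by `firewall-cmd --list-ports`).
port_in_list() {
    for entry in $2; do
        if [ "$entry" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

# Example use on a firewalld host (10162/udp is an assumed example port):
#   port_in_list 10162/udp "$(firewall-cmd --list-ports)" \
#       || echo "10162/udp is not explicitly open in the default zone"
```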
Database connection deadlocks
If the system reaches a state where all available database connections are exhausted and it fails to recover, you can turn on deadlock detection to help identify the cause.
To do so, append the following property to db.properties:
# echo 'org.opennms.core.db.deadlock.detection=true' >> /opt/opennms/etc/opennms.properties.d/db.properties
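Because the command above appends with >>, running it more than once leaves duplicate lines in the file. A minimal sketch of an append that is safe to repeat follows; the helper name ensure_line is hypothetical.

```shell
# Hypothetical helper: append the line in $2 to the file in $1 only if that
# exact line is not already present, so repeated runs do not add duplicates.
ensure_line() {
    file="$1"
    line="$2"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string.
    # A missing file makes grep fail, so the line is then written to a new file.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example use (path and property taken from the command above):
#   ensure_line /opt/opennms/etc/opennms.properties.d/db.properties \
#       'org.opennms.core.db.deadlock.detection=true'
```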
Restart Horizon for the changes to become effective.
Once enabled, you should see the following in the logs:
2023-07-03 21:38:49,874 ERROR [Main] o.o.c.d.HikariCPConnectionFactory: Deadlock detection is enabled.
Possible deadlocks will be logged with a message similar to the following:
---
2023-07-03 21:39:52,357 ERROR [Main] o.o.c.d.HikariCPConnectionFactory: Possible database deadlock detected: Attempting to acquire connection in thread while existing transaction active.
java.lang.Exception: Possible deadlock
at org.opennms.core.db.HikariCPConnectionFactory.getConnection(HikariCPConnectionFactory.java:109) ~[org.opennms.core.db-31.0.3-SNAPSHOT.jar:?]
at org.opennms.netmgt.filter.JdbcFilterDao.getNodeIPAddressServiceMap(JdbcFilterDao.java:235) ~[opennms-config-31.0.3-SNAPSHOT.jar:?]
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:302) ~[org.apache.servicemix.bundles.spring-aop-4.2.9.RELEASE_1.ONMS.1.jar:?]
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202) ~[org.apache.servicemix.bundles.spring-aop-4.2.9.RELEASE_1.ONMS.1.jar:?]
at com.sun.proxy.$Proxy136.getNodeIPAddressServiceMap(Unknown Source) ~[?:?]
at org.opennms.netmgt.dao.support.DefaultFilterWatcher$FilterSession.lambda$refreshNow$0(DefaultFilterWatcher.java:228) ~[opennms-dao-31.0.3-SNAPSHOT.jar:?]
at org.opennms.netmgt.dao.hibernate.DefaultSessionUtils.withManualFlush(DefaultSessionUtils.java:81) ~[opennms-dao-31.0.3-SNAPSHOT.jar:?]
---
Inspect these messages and their stack traces to find code paths that try to acquire a connection while a transaction is already active. Some of the messages may be false positives.
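To see how often these reports occur, you can count the matching lines across the log files. The sketch below assumes the logs are in ${OPENNMS_HOME}/logs; adjust the directory if your installation logs elsewhere. The helper name count_deadlock_reports is hypothetical.

```shell
# Hypothetical helper: count log lines that report a possible deadlock in any
# file under the directory in $1.
count_deadlock_reports() {
    # -r: recurse, -h: omit file names, -F: fixed-string match.
    # tr strips the padding that some wc implementations add.
    grep -rhF 'Possible database deadlock detected' "$1" 2>/dev/null | wc -l | tr -d ' '
}

# Example use (the log directory is an assumption):
#   count_deadlock_reports "${OPENNMS_HOME}/logs"
```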