Flows Troubleshooting

This section describes troubleshooting methods for flows and associated features.

Telemetryd

Run the following checks to confirm that telemetryd is working as expected:

  • Check telemetryd availability.

  • Verify that routers are sending data.

  • Check Docker networking configuration.

  • Perform a Minion health check.

  • Perform a Horizon health check.

  • Verify that sink consumer graphs are populated.

  • Verify that SNMP is available on routers that provide Netflow.

  • Review Horizon and Minion logs.

The Troubleshoot Telemetryd Discourse article describes how to run these checks.

No data or incorrect data

If you see no data or incorrect data, view the state and parameters of the telemetry listeners by running this command in the Karaf shell:

opennms:telemetry-listeners

Check Elasticsearch persistence

If you store flows in Elasticsearch, you can use Kibana to check whether flow documents (raw and aggregated) are being written to Elasticsearch.

You must know your endpoint address and API key.

To check Elasticsearch persistence, run this command:

curl "https://your-es-server.hostname:9200/_cat/indices?v&apikey=Your-API-Key"

The query returns a list of indices. Indices whose names start with a period (.) are system indices; all others are regular indices. Regular indices appear only when Elasticsearch is receiving flows. If no regular indices are listed, check etc/telemetryd-configuration.xml and verify that the Netflow listeners are enabled.
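As a rough illustration of reading that list, the following Python sketch splits the text returned by _cat/indices?v into system and regular indices. The function name split_indices and the sample output (including the index names) are hypothetical; the sketch assumes only that the response has a header row containing an index column.

```python
def split_indices(cat_output: str) -> tuple[list[str], list[str]]:
    """Split `_cat/indices?v` text output into (system, regular) index names."""
    lines = [line for line in cat_output.splitlines() if line.strip()]
    if not lines:
        return [], []
    header = lines[0].split()
    column = header.index("index")  # raises ValueError if there is no header row
    system, regular = [], []
    for line in lines[1:]:
        # Caveat: a row with an empty column (for example, a closed index has
        # no health value) shifts the fields; this sketch ignores that case.
        fields = line.split()
        if len(fields) <= column:
            continue
        name = fields[column]
        (system if name.startswith(".") else regular).append(name)
    return system, regular


# Hypothetical sample output, for illustration only.
SAMPLE = """\
health status index                 uuid pri rep docs.count
green  open   .kibana_1             aaaa 1   0   12
yellow open   netflow-2024-01-01-00 bbbb 1   1   5150
"""

system, regular = split_indices(SAMPLE)
if regular:
    print("Regular indices:", ", ".join(regular))
else:
    print("No regular indices: check that the Netflow listeners are enabled.")
```

To use it against a live cluster, save the curl output to a file and pass the file contents to split_indices.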

For more information on troubleshooting Elasticsearch, refer to the official Elasticsearch documentation.

Persisted flows do not appear in OpenNMS plugin for Grafana

If your persisted flows are not displayed in the OpenNMS plugin for Grafana, check the elasticUrl property in etc/org.opennms.features.flows.persistence.elastic.cfg. If you are using aggregated flows, ensure that your aggregate.elasticIndexStrategy matches the index strategy that you configured in the streaming analytics tool. To query only raw flows or only aggregated flows, set alwaysUseRawForQueries or alwaysUseAggForQueries, as appropriate.
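A quick sanity check of the settings named above could look like the following Python sketch. The helper names parse_properties and check_flow_config and the sample values are hypothetical. The rule that both alwaysUse... flags should not be true at the same time is an assumption made for illustration, not documented behavior.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_flow_config(props: dict[str, str]) -> list[str]:
    """Return human-readable warnings about the flow persistence settings."""
    warnings = []
    if not props.get("elasticUrl"):
        # The property may simply fall back to a default; verify it by hand.
        warnings.append("elasticUrl is not set explicitly")
    raw = props.get("alwaysUseRawForQueries", "").lower() == "true"
    agg = props.get("alwaysUseAggForQueries", "").lower() == "true"
    if raw and agg:
        # Assumption: enabling both flags at once is contradictory.
        warnings.append(
            "alwaysUseRawForQueries and alwaysUseAggForQueries are both true"
        )
    return warnings


# Hypothetical file contents, for illustration only.
SAMPLE_CFG = """\
# org.opennms.features.flows.persistence.elastic.cfg
elasticUrl = http://elastic.example.org:9200
alwaysUseRawForQueries = true
alwaysUseAggForQueries = true
"""

for warning in check_flow_config(parse_properties(SAMPLE_CFG)):
    print("WARNING:", warning)
```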

Verify flows by device

To verify flows by device, click Info > Nodes in the web UI’s top menu bar. The flows indicator symbol on the Nodes page shows flows data for each device, with SNMP details and the direction of flows.

Replay flows from packet capture

You can replay flows from a packet capture to help debug flow processing:

  1. List the available listeners and parsers:

    opennms:telemetry-listeners

  2. Replay the packet capture to the target parser (obtained from the previous step):

    opennms:telemetry-replay-pcap <listener> <parser> <path-to-pcap-file>

This example shows the result of replaying a packet capture that contains Netflow 9 flows to the Netflow 9 parser:

admin@opennms()> opennms:telemetry-listeners
Name = Multi-UDP-9999
Description = UDP *:9999
Properties:
  Max Packet Size = 8096
  Port = 9999
Parsers:
  - Multi-UDP-9999.Netflow-5-Parser
  - Multi-UDP-9999.Netflow-9-Parser
  - Multi-UDP-9999.IPFIX-TCP-Parser
  - Multi-UDP-9999.SFlow-Parser
admin@opennms()> opennms:telemetry-replay-pcap Multi-UDP-9999 Multi-UDP-9999.Netflow-9-Parser /tmp/flows.pcap
Processing packets from '/tmp/flows.pcap'.
Processing packet #100.
Processing packet #200.
Processing packet #300.
Processing packet #400.
Processing packet #500.
Done processing 515 packets.
admin@opennms()>

Replayed flows are ingested through the same pipeline as flows received directly from devices. To associate the results with a node, nodes with interfaces that match the IP addresses in the packet capture must exist.

Correct clock skew

Flow analyses use timestamps that are exposed by the underlying flow management protocol. These timestamps are set according to the exporting router’s clock. If the router’s clock differs from the actual time, the skew is reflected in the received flows and can distort subsequent analysis and aggregation.

Horizon can correct skewed flow timestamps. To do so, it compares the exporting device’s current time with the time at which the packet is actually received. If the two differ by more than the configured threshold, the time at which the flow was received is considered more accurate, and all timestamps associated with the flow are updated.

To enable clock correction, configure a threshold for the maximum allowed delta, in milliseconds. Note that setting the threshold to 0 disables the correction mechanism.

Enable clock correction and set maximum delta
$ ssh -p 8101 admin@localhost
...
admin@opennms()> config:edit org.opennms.features.flows.persistence.elastic
admin@opennms()> config:property-set clockSkewCorrectionThreshold 5000
admin@opennms()> config:update
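
The correction mechanism described above can be sketched in Python as follows. This is a minimal illustration, not Horizon's actual implementation: the Flow class, its field names, and the function correct_clock_skew are hypothetical, and shifting every timestamp by the same offset is an assumption about how the timestamps are updated.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Flow:
    # All times are in milliseconds since the epoch (hypothetical fields).
    exported_at: int     # exporter's clock when the packet was sent
    first_switched: int  # start of the flow, per the exporter's clock
    last_switched: int   # end of the flow, per the exporter's clock


def correct_clock_skew(flow: Flow, received_at: int, threshold_ms: int) -> Flow:
    """Shift flow timestamps when the exporter clock deviates too far."""
    if threshold_ms == 0:
        return flow  # a threshold of 0 disables the correction
    skew = received_at - flow.exported_at
    if abs(skew) <= threshold_ms:
        return flow  # within the allowed delta: keep the exporter's times
    # Assumption: all timestamps are shifted by the same observed skew.
    return replace(
        flow,
        exported_at=flow.exported_at + skew,
        first_switched=flow.first_switched + skew,
        last_switched=flow.last_switched + skew,
    )


# The exporter clock is 60 seconds behind; the threshold is 5000 ms.
flow = Flow(exported_at=1_000_000, first_switched=990_000, last_switched=999_000)
print(correct_clock_skew(flow, received_at=1_060_000, threshold_ms=5000))
```

With these values the skew is 60000 ms, which exceeds the threshold, so each timestamp moves forward by 60000 ms while the flow's duration stays the same.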