Alarm Lifecycle
The following is an example of the alarm lifecycle based on a nodeLostService event.
Lifecycle example
A new nodeLostService event is received and creates a new alarm.
![New alarm visible in outstanding alarm list](../../_images/alarms/single_alarm_1.png)
Clicking the number displayed in the Count column displays the corresponding events and their details.
![Event list showing events related to the alarm](../../_images/alarms/single_alarm_2.png)
The alarm clears automatically when service is restored, based on a nodeRegainedService event.
![Alarm List displaying one cleared alarm and its log message](../../_images/alarms/single_alarm_3.png)
![Event list page displaying one service down event and one service restored event](../../_images/alarms/single_alarm_4.png)
If the problem occurs again, the events are reduced into the existing alarm. The alarm’s count is updated to reflect the new activity.
![Alarm List displaying one alarm with a count of 2](../../_images/alarms/single_alarm_5.png)
![Detailed event list page displaying two service down events and one service restored event, all of which are components of the same alarm](../../_images/alarms/single_alarm_6.png)
The alarm once again clears immediately when service is restored.
![Alarm List displaying one cleared alarm with a count of 2, and its log message](../../_images/alarms/single_alarm_7.png)
Note that the alarm’s count only increments on events with a severity of Warning or greater.
![Detailed event list page displaying two service down events and two service restored events, all of which are members of the same alarm](../../_images/alarms/single_alarm_8.png)
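The lifecycle above can be sketched in a few lines of Python. This is only an illustrative model, not Horizon's implementation: the `Severity` ordering, the `Event` and `Alarm` classes, the `reduce_event` helper, and the severities assigned to the two events are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Severity(IntEnum):
    # Assumed ordering, lowest to highest, for illustration only.
    INDETERMINATE = 1
    CLEARED = 2
    NORMAL = 3
    WARNING = 4
    MINOR = 5
    MAJOR = 6
    CRITICAL = 7


@dataclass
class Event:
    uei: str
    severity: Severity


@dataclass
class Alarm:
    severity: Severity
    count: int = 0
    events: list = field(default_factory=list)


def reduce_event(alarm, event):
    """Fold an event into an alarm, creating the alarm if none exists yet."""
    if alarm is None:
        alarm = Alarm(severity=event.severity)
    alarm.events.append(event)
    if event.severity >= Severity.WARNING:
        # Problem event: increment the count and mark the alarm active again.
        alarm.count += 1
        alarm.severity = event.severity
    elif event.uei == "nodeRegainedService":
        # Resolution event: clear the alarm, leaving the count unchanged.
        alarm.severity = Severity.CLEARED
    return alarm


# Hypothetical severities for the two events in the walkthrough.
down = Event("nodeLostService", Severity.MINOR)
up = Event("nodeRegainedService", Severity.NORMAL)

alarm = None
for event in (down, up, down, up):
    alarm = reduce_event(alarm, event)

print(alarm.count, alarm.severity.name, len(alarm.events))  # 2 CLEARED 4
```

After two outages and two recoveries, the model ends with a single cleared alarm that has a count of 2 and four member events, matching the final screenshots.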
Alarm lifetime rules
Alarms are deleted from the Horizon database after a set amount of time.
This lifetime can be configured via Drools rules in the ${OPENNMS_HOME}/etc/alarmd/drools-rules.d/alarmd.drl file.
The default alarm lifetimes are as follows:
Alarm State | Deletion Delay |
---|---|
Cleared and Unacknowledged | 5 minutes |
Cleared and Acknowledged | 1 day |
Active and Unacknowledged | 3 days |
All other alarms | 8 days |
These delays are measured from the alarm's last event time, so the countdown restarts whenever a new problem event is reduced into the same alarm.
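As a rough illustration of how the default delays combine with the last event time, the following Python sketch computes when an alarm would become eligible for deletion. The function names `deletion_delay` and `deletion_time` are hypothetical and do not correspond to anything in alarmd.drl. The actual behavior is defined by the Drools rules.

```python
from datetime import datetime, timedelta


def deletion_delay(cleared, acknowledged):
    """Return the default deletion delay for an alarm in the given state."""
    if cleared and not acknowledged:
        return timedelta(minutes=5)
    if cleared and acknowledged:
        return timedelta(days=1)
    if not acknowledged:
        return timedelta(days=3)  # active and unacknowledged
    return timedelta(days=8)  # all other alarms


def deletion_time(last_event_time, cleared, acknowledged):
    """Add the state's delay to the time of the alarm's last event."""
    return last_event_time + deletion_delay(cleared, acknowledged)


last_event = datetime(2024, 1, 1, 12, 0)

# Cleared and unacknowledged: eligible five minutes after the last event.
print(deletion_time(last_event, cleared=True, acknowledged=False))
# 2024-01-01 12:05:00

# Active and unacknowledged: eligible three days after the last event.
print(deletion_time(last_event, cleared=False, acknowledged=False))
# 2024-01-04 12:00:00
```

Because the calculation starts from the last event time, passing a later `last_event_time` (as happens when a new problem event is reduced into the alarm) pushes the deletion time back accordingly.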