Housekeeping Tasks
There are a number of housekeeping tasks you may want to do regularly to ensure optimum system performance. We also recommend you complete some of these housekeeping tasks before upgrading Meridian.
Automatic tasks
Pruning old events
The following query runs from vacuumd-configuration.xml to delete events older than six weeks that have no associated outages or notifications:
--# this deletes any events that are not associated with outages
DELETE FROM events WHERE NOT EXISTS
(SELECT svclosteventid FROM outages WHERE svclosteventid = events.eventid
UNION
SELECT svcregainedeventid FROM outages WHERE svcregainedeventid = events.eventid
UNION
SELECT eventid FROM notifications WHERE eventid = events.eventid)
AND eventtime < now() - interval '6 weeks';
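Before relying on this pruning, it can help to preview how many rows match. The following is a minimal sketch, assuming you have read access to the Meridian PostgreSQL database (for example through psql). It reuses the condition of the statement above in a read-only SELECT and changes nothing; the alias prunable_events is illustrative only.

```sql
-- Dry run: count the events the pruning statement above would delete.
-- Same WHERE clause as the DELETE, wrapped in a read-only SELECT.
SELECT count(*) AS prunable_events
FROM events
WHERE NOT EXISTS
  (SELECT svclosteventid FROM outages WHERE svclosteventid = events.eventid
   UNION
   SELECT svcregainedeventid FROM outages WHERE svcregainedeventid = events.eventid
   UNION
   SELECT eventid FROM notifications WHERE eventid = events.eventid)
  AND eventtime < now() - interval '6 weeks';
```

A count that keeps growing between runs may indicate that the scheduled statement is not executing.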
We also recommend having a query in place to delete all events beyond a useful age. The following example is frequently used in vacuumd-configuration.xml to prune events older than 180 days:
<statement>
DELETE FROM events WHERE (eventcreatetime < now() - interval '180 days');
</statement>
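What counts as a useful age depends on the installation. As a rough aid, the sketch below groups events by creation month and counts the rows past the 180-day cutoff, so you can see how much history a given threshold would remove. It is read-only and assumes direct access to the Meridian PostgreSQL database; the column aliases are illustrative.

```sql
-- Events per month of creation, oldest first.
SELECT date_trunc('month', eventcreatetime) AS creation_month,
       count(*) AS events_created
FROM events
GROUP BY 1
ORDER BY 1;

-- Rows the 180-day statement would remove if it ran now.
SELECT count(*) AS older_than_180_days
FROM events
WHERE eventcreatetime < now() - interval '180 days';
```

To evaluate a different retention period, change the interval in the second query before editing the configured statement.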
Pruning old alarms
Old or expired alarms are pruned by Drools rules that are part of alarmd.
The following two Drools rules from ${OPENNMS_HOME}/alarmd/drools-rules.d/alarmd.drl
delete unacknowledged alarms (alarmAckTime == null) older than three days, and any alarm older than eight days.
rule "GC"
salience 0
when
$sessionClock : SessionClock()
$alarm : OnmsAlarm(alarmAckTime == null)
not( OnmsAlarm( this == $alarm ) over window:time( 3d ) )
then
alarmService.deleteAlarm($alarm);
end
rule "fullGC"
salience 0
when
$sessionClock : SessionClock()
$alarm : OnmsAlarm()
not( OnmsAlarm( this == $alarm ) over window:time( 8d ) )
then
alarmService.deleteAlarm($alarm);
end
Manual tasks
Node Inventory Maintenance
Meridian administrators are strongly encouraged to stop monitoring nodes that are known to be permanently down by removing them from the inventory.
Nodes determined to be down are still polled on the critical service on a schedule determined by the polling package’s downtime model.
This consumes a pollerd thread, which must wait until all retries and timeouts on the critical service are exhausted before moving on to other useful work.
Similarly, collectd still attempts to collect metrics from a down node until all retries and timeouts are exhausted, consuming a collectd thread for the duration.
Maintaining a clean inventory will keep your Meridian instance performing at its peak.
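One way to find removal candidates is to look for outages that have been open for a long time. The query below is a sketch under several assumptions that you should verify against your own schema: that an open outage has a NULL svcregainedeventid, that events.nodeid identifies the node of the service-lost event, and that this event has not already been pruned. The 30-day threshold and the aliases are arbitrary choices for illustration.

```sql
-- Nodes with at least one outage that has been open for more than 30 days.
-- Read-only: lists candidates for review, deletes nothing.
SELECT e.nodeid,
       min(e.eventtime) AS down_since,
       count(*) AS open_outages
FROM outages o
JOIN events e ON e.eventid = o.svclosteventid
WHERE o.svcregainedeventid IS NULL
  AND e.eventtime < now() - interval '30 days'
GROUP BY e.nodeid
ORDER BY down_since;
```

The result includes any node with a long-open outage on any service, not only nodes that are entirely down, so treat it as a list to review rather than a list to delete.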