Housekeeping Tasks

There are a number of housekeeping tasks you may want to do regularly to ensure optimum system performance. We also recommend you complete some of these housekeeping tasks before upgrading Meridian.

Automatic tasks

Pruning old events

The following query runs from vacuumd-configuration.xml to delete events older than six weeks that are not referenced by any outage or notification:

--# this deletes any events that are not associated with outages or notifications
DELETE FROM events WHERE NOT EXISTS
(SELECT svclosteventid FROM outages WHERE svclosteventid = events.eventid
UNION
SELECT svcregainedeventid FROM outages WHERE svcregainedeventid = events.eventid
UNION
SELECT eventid FROM notifications WHERE eventid = events.eventid)
AND eventtime < now() - interval '6 weeks';
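Before changing the retention interval, it can help to see how many rows the pruning query would touch. The following is a minimal sketch, not part of the shipped configuration: it reuses the WHERE clause from the query above in a read-only SELECT, and the column alias prunable_events is only an illustrative name.

```sql
-- Dry run: count the events the pruning query above would delete.
-- Read-only; nothing is removed.
SELECT count(*) AS prunable_events
FROM events
WHERE NOT EXISTS
  (SELECT svclosteventid FROM outages WHERE svclosteventid = events.eventid
   UNION
   SELECT svcregainedeventid FROM outages WHERE svcregainedeventid = events.eventid
   UNION
   SELECT eventid FROM notifications WHERE eventid = events.eventid)
  AND eventtime < now() - interval '6 weeks';
```

Running it with a different interval (for example '4 weeks') shows roughly how much a shorter retention period would remove.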

We also recommend adding a query that deletes all events beyond a useful age, whether or not they are referenced elsewhere. The following example is frequently used in vacuumd-configuration.xml to prune old events:

<statement>
   DELETE FROM events WHERE (eventcreatetime &lt; now() - interval '180 days');
</statement>
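To preview the effect of the age-based statement from a database client, the same condition can be run as a SELECT. This is a sketch under the assumption that you are connected directly to the Meridian database; the aliases events_past_cutoff and oldest_event are illustrative. Note that outside the XML file the comparison is written as a plain < rather than the escaped &lt;.

```sql
-- Preview of the 180-day statement: how many events fall past the cutoff,
-- and how old the oldest one is. Read-only; nothing is removed.
SELECT count(*)             AS events_past_cutoff,
       min(eventcreatetime) AS oldest_event
FROM events
WHERE eventcreatetime < now() - interval '180 days';
```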

Pruning old alarms

Old or expired alarms are pruned by Drools rules that are part of alarmd. The following two Drools rules from ${OPENNMS_HOME}/alarmd/drools-rules.d/alarmd.drl delete unacknowledged alarms (alarmAckTime == null) older than three days, and any alarm older than eight days.

rule "GC"
  salience 0
  when
    $sessionClock : SessionClock()
    $alarm : OnmsAlarm(alarmAckTime == null)
    not( OnmsAlarm( this == $alarm ) over window:time( 3d ) )
  then
    alarmService.deleteAlarm($alarm);
end

rule "fullGC"
  salience 0
  when
    $sessionClock : SessionClock()
    $alarm : OnmsAlarm()
    not( OnmsAlarm( this == $alarm ) over window:time( 8d ) )
  then
    alarmService.deleteAlarm($alarm);
end

Manual tasks

Node Inventory Maintenance

Meridian administrators are strongly encouraged to remove nodes from monitoring that are known to be permanently down. Nodes determined to be down are still polled on the critical service, on a schedule determined by the polling package’s downtime model. Each poll consumes a pollerd thread, which must wait until all retries and timeouts on the critical service are exhausted before it can move on to other useful work. The same applies to collectd: it still attempts to collect metrics from a down node, and each attempt ties up a collectd thread until all retries and timeouts are exhausted. Maintaining a clean inventory keeps your Meridian instance performing at its peak.
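One way to find candidates for removal is to list nodes that have had an unresolved outage for a long time. The query below is a rough sketch, not an official procedure. It assumes the outages table links to nodes through ifservices and ipinterface (outages.ifserviceid, ifservices.ipinterfaceid, ipinterface.nodeid), which may differ between schema versions, and the 30-day threshold is an arbitrary example value. Verify the column names against your own database before relying on it.

```sql
-- Nodes with at least one outage that has been open for more than 30 days.
-- Assumed joins: outages -> ifservices -> ipinterface -> node.
-- The 30-day threshold is a hypothetical example; adjust it to your policy.
SELECT n.nodeid,
       n.nodelabel,
       min(o.iflostservice) AS down_since
FROM outages o
JOIN ifservices  s ON s.id = o.ifserviceid
JOIN ipinterface i ON i.id = s.ipinterfaceid
JOIN node        n ON n.nodeid = i.nodeid
WHERE o.ifregainedservice IS NULL
  AND o.iflostservice < now() - interval '30 days'
GROUP BY n.nodeid, n.nodelabel
ORDER BY down_since;
```

The result is only a list to review: an open outage on one service does not prove that the whole node is permanently down, so confirm each node before removing it.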