Deep Learning Engine
The Deep Learning Engine is a Cluster Engine that uses a TensorFlow model to build clusters.
To build clusters with TensorFlow, we reduce the clustering problem to a binary classification problem that asks: are these two alarms related? When two alarms are related, we add them to a cluster, along with any other alarms that are related to either of them, and so on transitively. To reduce the computational cost of these pairwise comparisons, we limit the search for candidates to alarms that are nearby (within a distance of \(\epsilon\)).
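The reduction above can be illustrated with a minimal sketch. It is not the engine's implementation: `build_clusters`, `distance`, and `are_related` are hypothetical names, with `are_related` standing in for the TensorFlow classifier. The sketch also assumes that "within a distance of \(\epsilon\)" is inclusive and that alarms related to nothing else do not form a cluster.

```python
from itertools import combinations
from typing import Callable, Dict, Hashable, List, Sequence


def build_clusters(
    alarms: Sequence[Hashable],
    distance: Callable[[Hashable, Hashable], float],
    are_related: Callable[[Hashable, Hashable], bool],
    epsilon: float,
) -> List[List[Hashable]]:
    """Group alarms by merging every nearby pair the classifier calls related."""
    # Union-find: each alarm starts as the representative of its own group.
    parent: Dict[Hashable, Hashable] = {a: a for a in alarms}

    def find(a: Hashable) -> Hashable:
        # Walk up to the group representative, shortening the path as we go.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a, b in combinations(alarms, 2):
        # Only ask the (expensive) classifier about candidates within epsilon.
        if distance(a, b) <= epsilon and are_related(a, b):
            parent[find(a)] = find(b)

    groups: Dict[Hashable, List[Hashable]] = {}
    for a in alarms:
        groups.setdefault(find(a), []).append(a)
    # Assumption: a lone alarm is not reported as a cluster.
    return [members for members in groups.values() if len(members) > 1]


if __name__ == "__main__":
    # Toy data: alarms placed on a line, classifier that always says "related".
    positions = {"a1": 0.0, "a2": 1.0, "a3": 2.0, "a4": 50.0}
    print(build_clusters(
        list(positions),
        distance=lambda a, b: abs(positions[a] - positions[b]),
        are_related=lambda a, b: True,
        epsilon=1.5,
    ))
```

In the toy data, a1 and a3 are farther apart than \(\epsilon\) but still end up together because both are related to a2. Note that this sketch still computes the distance for every pair and only saves classifier calls; restricting the candidate search itself would need a neighborhood lookup on the graph.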
The current model definition was developed using Ludwig.

The model input features include:
- The inventory object types (categorical)
- Relations between the inventory objects (binary)
- Difference in time (numerical)
- Distance on the graph (numerical)
See ludwig_model.yaml for the complete model definition.
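As a rough illustration of what one pairwise input row might look like, the sketch below assembles the four kinds of features for two alarms. Everything in it is an assumption made for illustration: the `Alarm` class, the feature names, the use of a direct-neighbor check as the binary relation, and the use of an unweighted hop count as the graph distance. The actual feature names and encodings are those in ludwig_model.yaml.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class Alarm:
    # Hypothetical stand-in for an alarm; field names are illustrative only.
    object_id: str
    object_type: str
    time_ms: int


def hop_distance(graph: Dict[str, Set[str]], src: str, dst: str) -> Optional[int]:
    """Breadth-first search hop count between two vertices, None if unreachable."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == dst:
            return hops
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None


def pair_features(a: Alarm, b: Alarm, graph: Dict[str, Set[str]]) -> dict:
    """Assemble one row of pairwise features (names are hypothetical)."""
    return {
        "type_a": a.object_type,  # categorical
        "type_b": b.object_type,  # categorical
        # binary: assumed here to mean "the two objects are direct neighbors"
        "directly_related": b.object_id in graph.get(a.object_id, set()),
        "time_delta_ms": abs(a.time_ms - b.time_ms),  # numerical
        # numerical: assumed here to be an unweighted hop count
        "graph_distance": hop_distance(graph, a.object_id, b.object_id),
    }
```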
Using the engine
To use the engine, install the alec-engine-deeplearning feature.
Once installed, restart the driver from the Karaf shell using:
bundle:restart org.opennms.alec.driver.main
Running the list-graphs command should now show that the deeplearning engine is in use:
admin@opennms> opennms-alec:list-graphs
deeplearning: 2 situations on 976 vertices and 839 edges.
Once installed, add the feature to alec.boot on a new line, as alec-engine-deeplearning wait-for-kar=opennms-alec-plugin, to ensure that the engine is reinstalled when the services are restarted.
Training
These instructions are meant to help you get started but are by no means complete. We plan to make this process easier in future releases.
Install the following features to access the shell commands required for training:
feature:install alec-features-deeplearning alec-features-shell
Vectorize the datasets
Let’s take a snapshot of the current graph:
opennms-alec:datasource-snapshot /tmp/snap1
Now edit /tmp/snap1/alec.situations.xml to reflect the desired state of situations.
Build vectors from the dataset:
opennms-alec:tensorflow-vectorize --alarms-in /tmp/snap1/alec.alarms.xml \
--inventory-in /tmp/snap1/alec.inventory.xml \
--situations-in /tmp/snap1/alec.situations.xml \
--csv-out /tmp/snap1/alec.vector.dataset.csv
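Before training, it can be useful to sanity-check the generated CSV. The following is a minimal sketch, assuming the file starts with a header row; `summarize_dataset` is a hypothetical helper, and the label column name is deliberately left as a parameter because it depends on the output feature named in the model definition.

```python
import csv
from collections import Counter
from typing import Optional


def summarize_dataset(path: str, label_column: Optional[str] = None) -> dict:
    """Report column names, row count and (optionally) label balance of a CSV."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)  # assumes the first row is a header
        columns = list(reader.fieldnames or [])
        rows = 0
        labels: Counter = Counter()
        for row in reader:
            rows += 1
            if label_column is not None:
                # Count how often each label value occurs.
                labels[row[label_column]] += 1
    return {"columns": columns, "rows": rows, "labels": dict(labels)}


if __name__ == "__main__":
    print(summarize_dataset("/tmp/snap1/alec.vector.dataset.csv"))
```

A heavily skewed label count is worth noticing early, since most alarm pairs in a snapshot are likely to be unrelated.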
Train with Ludwig
Use Ludwig to train the model:
ludwig train --data_csv /tmp/snap1/alec.vector.dataset.csv --model_definition_file model.yaml
Obtain model.yaml from the ludwig_model.yaml file in the source tree.
Exporting the trained model
Adapt the following script to export the model to a format that can be used in ALEC:
echo '#!/usr/bin/env python
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import graph_util
from tensorflow.python.framework import ops
from tensorflow.python.saved_model import builder as saved_model_builder
from ludwig import LudwigModel
model_path = "results/experiment_run_0/model"
model = LudwigModel.load(model_path)
builder = tf.saved_model.Builder("export")
with tf.Session(graph=model.model.graph) as sess:
    saver = tf.train.Saver()
    saver.restore(sess, model.model.weights_save_path)
    builder.add_meta_graph_and_variables(sess, [tf.saved_model.tag_constants.SERVING])
    builder.save()
model.close()' > export_model.py
chmod +x export_model.py
./export_model.py
mkdir -p /tmp/tf-export
cp -R ./export/* /tmp/tf-export/
cp results/experiment_run_0/model/model_hyperparameters.json /tmp/tf-export/
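After copying, a quick check of the export directory can catch a failed or partial export before the model is loaded in ALEC. This is a minimal sketch: `missing_export_files` is a hypothetical helper, and it only checks for the standard TensorFlow SavedModel layout (saved_model.pb plus a variables directory) and the model_hyperparameters.json file copied above. It does not validate the contents.

```python
from pathlib import Path
from typing import List


def missing_export_files(export_dir: str) -> List[str]:
    """Return the names of expected export entries that are absent."""
    root = Path(export_dir)
    missing = []
    # Serialized graph written by the SavedModel builder.
    if not (root / "saved_model.pb").is_file():
        missing.append("saved_model.pb")
    # Directory holding the variable checkpoint files.
    if not (root / "variables").is_dir():
        missing.append("variables/")
    # Ludwig hyperparameters copied next to the exported model.
    if not (root / "model_hyperparameters.json").is_file():
        missing.append("model_hyperparameters.json")
    return missing


if __name__ == "__main__":
    problems = missing_export_files("/tmp/tf-export")
    print("export looks complete" if not problems else f"missing: {problems}")
```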
Use the trained model in ALEC
Verify that the model can be loaded:
opennms-alec:tensorflow-load-model /tmp/tf-export
Configure the engine to use the model:
config:edit org.opennms.alec.engine.deeplearning
property-set modelPath /tmp/tf-export
config:update
Run simulations using the trained model
Generate situations:
opennms-alec:process-alarms --alarms-in /tmp/snap1/alec.alarms.xml \
--inventory-in /tmp/snap1/alec.inventory.xml \
--situations-out /tmp/snap1/alec.situations.deeplearning.trained.xml \
--engine deeplearning
Compare results:
opennms-alec:score-situations -s peer /tmp/snap1/alec.situations.xml /tmp/snap1/alec.situations.deeplearning.trained.xml