Test Traffic Project

SW-MGMT


Introduction

The management actions for the TTM and RIS networks include:

  1.  Installation
  2.  Configuration
  3.  Fault management
  4.  Security checks
  5.  SW consistency checks (local)

Management actions have to be run both on the remote machines and
on a central machine at the NCC.  On all machines, we want to ensure
that the necessary processes are up and running, while from the NCC
we want to test that each box is still reachable.

The configuration files belong to the so-called "management information".
Since we don't have a secure network-wide management system,
it is desirable to use the SW-SYNC mechanism in combination
with one or more configuration mechanisms.

For maintainability, configuration information should not be fragmented.

Such a tool has been found: cfengine (CFE), which is openly developed.
CFE covers the most frequent administration needs; at the same time
it can easily be extended through custom modules, scripts,
or programs written in any language.

Capabilities

CFE is capable of the following:


Objectives

  1. Distributed, local self-monitoring.
  2. Auto-restart of system processes such as named, sshd, xntpd and sendmail.
  3. Daily/weekly/monthly overview mails should be improved.
     Much of the functionality in these scripts is delivered
     as standard CFE features.
  4. Ensure remote maintenance of the machines.
     We want to ensure that all machines are still reachable from the NCC and
     can be connected to via ssh at regular intervals (a couple of times a day).
  5. Monitoring of TT software, ensuring that all data-taking processes
     are running correctly and still produce output; if not, restart them.
  6. Managing the state of the boxes (SETUP, ON, OFF, WATCH ...).
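
The auto-restart objective could be expressed with the CFE "processes" action.
The fragment below is a minimal sketch, not a tested configuration: it reuses
the TestBoxes class from this document, and the daemon paths and the sendmail
arguments are assumptions that have to be adapted to the actual boxes.

```cfengine
control:
  actionsequence = ( processes )

processes:
  TestBoxes::
    # If no running process matches the quoted pattern, run the restart command.
    # All start commands below are assumed paths, not taken from the real setup.
    "named"    restart "/usr/sbin/named"
    "sshd"     restart "/usr/sbin/sshd"
    "xntpd"    restart "/usr/sbin/xntpd"
    "sendmail" restart "/usr/sbin/sendmail -bd -q30m"
```

Each rule is checked on every CFE run, so how quickly a dead daemon is
restarted depends on how often CFE is scheduled on the box.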


Requirements relating to SW-MGMT

Example implementation

#!/usr/local/sbin/cfengine -f

control:
  domain         = ( ripe.net )
  sysadm         = ( tt-ops@ripe.net )
  actionsequence = ( directories tidy )

# example of keeping the filesystem in a desired mode
directories:
  TestBoxes::
    /usr/sbin      mode=755

tidy:
  TestBoxes::
    # typical system maintenance...
    /usr/home/     pat=*.core   age=0   r=inf
    # next rule replaces 35 lines of perl code in ~ttraffic/bin/cleanup.pl (!!!)
    /var/log/xntp  pat=*stats*  age=30