The core components of the “Nagwad” suite are:
the nagwad system service that scans the system journal
with the configured set of filters: either simple grep
regular expressions or full-featured sed scripts
(/etc/nagwad/*.regexp, *.sed);
a number of event post-filter scripts in
/etc/nagwad/filter-event.d;
a number of event post-processing scripts in
/etc/nagwad/process-event.d;
the check_nagwad NRPE script that looks for event signal
files in /var/log/nagwad/<boot_id>/<filter>/, where boot_id
is the current system boot ID (/proc/sys/kernel/random/boot_id);
the nagwad utility script that provides a CLI to query, check, and
manage event signal files (see nagwad(8)).
Addons for various monitoring systems consist of the following components:
configuration templates for the Icinga (2) monitoring agent;
configuration templates for the Nagios monitoring agent;
some actions for the Nagstamon monitoring frontend;
the transcript-enabled nsca-shell shell wrapper that
can be used for remote access.
When one of the configured filters matches a message in the system
journal, nagwad generates an event that is then passed to the
scripts in the filter-event.d directory. The scripts can
1) signal back to nagwad whether the given event should be passed or
ignored and 2) append arbitrary text to the event’s message. If the
event isn’t ignored, the corresponding signal file is written to
/var/log/nagwad/<boot_id>/<filter>/. The name of a signal file has
the form <filter>.<HASH>.<LEVEL>.
An NRPE-compatible monitoring agent can then check for signal files
using the check_nagwad script (the configuration templates for Icinga
and Nagios can be found in the corresponding packages).
When checking for signal files, the script ignores files with the
special suffix .FIXED. This provides a way to mark registered events
as resolved (either with the nagwad fix command or simply by renaming
the file).
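The signal file lookup described above can be sketched in Python. This is an illustration of the directory layout and the .FIXED rule only, not the actual check_nagwad implementation. The function names are hypothetical, and mapping an empty result to "OK" is an assumption borrowed from common Nagios plugin conventions.

```python
from pathlib import Path


def read_boot_id(path="/proc/sys/kernel/random/boot_id"):
    """Return the current boot ID as a string without the trailing newline."""
    return Path(path).read_text().strip()


def unresolved_signals(logdir, boot_id, filter_name):
    """List signal files of one filter that are not marked with .FIXED."""
    directory = Path(logdir) / boot_id / filter_name
    if not directory.is_dir():
        return []
    return sorted(
        entry.name
        for entry in directory.iterdir()
        if entry.is_file() and not entry.name.endswith(".FIXED")
    )


def worst_level(names):
    """Pick the most severe level among names of the form <filter>.<HASH>.<LEVEL>."""
    levels = {name.rsplit(".", 1)[-1] for name in names}
    if "CRITICAL" in levels:
        return "CRITICAL"
    if "WARNING" in levels:
        return "WARNING"
    return "OK"  # assumption: no unresolved signal files means nothing to report
```

On a real system the call would look like unresolved_signals("/var/log/nagwad", read_boot_id(), "<filter>"), with the filter name taken from the local configuration.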
In addition, the data of each event is passed to the scripts in the
process-event.d directory.
A pre-configured 10-eperm filtering script is used to
additionally filter file access audit events based on the actual paths
of the files. It uses the .regexp files from the
filter-event.d/eperm-skip.d directory to determine
which events to skip. Events that pass the 10-eperm
post-filter get additional message text: PATH=<path>,
where <path> is the actual path of the file for which access
was denied.
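The skip decision made by a path-based post-filter of this kind can be illustrated with a short Python sketch. It is not the 10-eperm script itself: the function names are hypothetical, and Python's re syntax is used here, which may differ from the regular expression dialect that the real .regexp files expect.

```python
import re


def load_patterns(lines):
    """Compile every non-empty line as one regular expression."""
    stripped = (line.strip() for line in lines)
    return [re.compile(line) for line in stripped if line]


def eperm_decision(path, patterns):
    """Return None to skip the event, or the text to append to its message."""
    if any(pattern.search(path) for pattern in patterns):
        return None
    return f"PATH={path}"
```

For example, with the illustrative patterns ^/tmp/ and \.cache/, an access denial for /tmp/foo would be skipped, while one for /etc/shadow would be passed on with PATH=/etc/shadow appended.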
A special 10-push-icinga script can push registered events
to an Icinga 2 instance via its REST API. One of the advantages of
using this script is that it can clear the acknowledgement flag
for the service each time a new event of the same type is registered.
The script reads its configuration from
process-event.d/push-icinga.conf and should run out of the box if
the node is properly set up as an Icinga agent/satellite (it uses the
SSL certificates in the /var/lib/icinga2/certs directory). However,
to use the Icinga REST API, an ApiUser with a proper set
of permissions must be configured on the target node. The
icinga2-usersyncd daemon is handy for that purpose.
If MAXAGE is set to a value greater than 0 in
/etc/nagwad/nagwad.conf, then, on start, nagwad deletes all
/var/log/nagwad/* directories older than MAXAGE days.
Note that the time intervals are truncated to whole
days, so a log directory is deleted only when it is at least
MAXAGE + 1 days old.
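The effect of the truncation can be worked through in a few lines of Python. The rule below is an assumption inferred from the note above (the age is truncated to whole days and must be strictly greater than MAXAGE); the function name is hypothetical and the actual cleanup code in nagwad may be written differently.

```python
SECONDS_PER_DAY = 86400


def is_expired(age_seconds, maxage_days):
    """Assumed rule: whole-day age must be strictly greater than MAXAGE."""
    if maxage_days <= 0:
        return False  # cleanup is disabled unless MAXAGE is greater than 0
    return age_seconds // SECONDS_PER_DAY > maxage_days
```

With MAXAGE=30, a directory that is 30 days and 23 hours old still counts as 30 whole days and is kept; it becomes eligible for deletion once it reaches 31 full days.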
The configuration options in /etc/nagwad/nagwad.conf are:
PIDFILE — path to a PID-file written when nagwad service
is started (the default value is /run/nagwad.pid);
LOG_USER — name of the system user that owns signal files
(the default is root);
LOG_GROUP — name of the system user group that has read access
to signal files (the default is nagwad);
CONFDIR — path to the configuration directory (the default is
/etc/nagwad);
POSTFILTERS — path to the post-filter script directory (the
default is $CONFDIR/filter-event.d);
POSTPROCESS — path to the post-processing script directory
(the default is $CONFDIR/process-event.d);
LOGDIR — path to the root of log directory (the default is
/var/log/nagwad);
MAXAGE — number of days after which old signal files may be
deleted (the default is 30);
JOURNAL_TAIL — the number of messages that are pre-read from
the system journal on nagwad start and restart (the default is
5000).
First, determine the set of distinct events you want to monitor in the system journal. Then decide, for each event, whether it is a two-level event (WARNING and CRITICAL) or a one-level event. Each event is a “filter” in the Nagwad suite, and the number of desired event levels determines how the regular expressions should be written.
With a plan in hand, define a set of regular expressions for each new
filter in the /etc/nagwad/ directory. For a two-level event, it is
necessary to define a <filter>.sed script that a) filters out
(doesn’t echo) all irrelevant messages and b) transforms relevant
journal messages into the form <filter>:<level>:<message>, where
<level> is either WARNING or CRITICAL. For one-level
events, a simple <filter>.regexp file with a set of regular
expressions (one per line) can be enough. However, using a sed script
to extract and format event data might also be convenient.
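The transformation that a two-level <filter>.sed script is expected to perform can be modelled in Python to make the output format concrete. This is only a model of the input/output contract, not a sed script and not something nagwad loads. The filter name "login" and both message patterns are hypothetical.

```python
import re

FILTER = "login"  # hypothetical filter name

# Hypothetical patterns; each one maps matching journal messages to a level.
RULES = [
    (re.compile(r"maximum authentication attempts exceeded"), "CRITICAL"),
    (re.compile(r"authentication failure"), "WARNING"),
]


def transform(journal_line):
    """Return '<filter>:<level>:<message>' or None for irrelevant lines."""
    for pattern, level in RULES:
        if pattern.search(journal_line):
            return f"{FILTER}:{level}:{journal_line}"
    return None  # irrelevant message: nothing is echoed
```

A real sed script would achieve the same by suppressing default output and printing only the lines it has rewritten into this form.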
When a single journal message doesn’t contain enough data
to constitute a complete event, place additional post-filter
scripts into the /etc/nagwad/filter-event.d/ directory. The file(s)
must be marked as executable. Each post-filter receives 3 arguments:
<filter> <status> <message>, where <filter> is the name of the
filter that has fished out the event, <status> is either the string
"WARNING" or "CRITICAL", and <message> is the filtered journal
message (possibly transformed by a <filter>.sed script, if any).
A post-filter script should return exit status 0 if the event should
be passed further, or the value of the NAGWAD_SKIP_EVENT environment
variable if it should be skipped. Any other exit code
signals an internal post-filter error. Any data written
to standard output is appended to the <message>. If a post-filter
needs some parameters to work properly, it is
okay to place its configuration file into
/etc/nagwad/filter-event.d/ (without giving it execute
permission).
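The post-filter contract described above (three arguments, exit status 0 or the value of NAGWAD_SKIP_EVENT, standard output appended to the message) can be sketched as a small executable Python script. The filter name, the ignored user list, the user=<name> message token and the appended text are all hypothetical and serve only as an illustration.

```python
import os
import sys

# Hypothetical values, chosen only for this example.
FILTER_NAME = "login"
IGNORED_USERS = {"backup", "monitor"}


def post_filter(filter_name, status, message, env):
    """Return (exit_code, extra_text) for one event."""
    skip_code = int(env["NAGWAD_SKIP_EVENT"])
    if filter_name != FILTER_NAME:
        return 0, ""  # not our filter: pass the event through untouched
    tokens = message.split()
    for user in sorted(IGNORED_USERS):
        if f"user={user}" in tokens:
            return skip_code, ""  # tell nagwad to skip this event
    return 0, "CHECKED_BY=example-post-filter"


# nagwad passes exactly three arguments: <filter> <status> <message>.
if __name__ == "__main__" and len(sys.argv) == 4:
    code, extra = post_filter(*sys.argv[1:4], os.environ)
    if extra:
        print(extra)  # text on standard output is appended to the message
    sys.exit(code)
```

Keeping the decision in a function that takes the environment as a parameter makes it possible to exercise the logic without running nagwad.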
Executable scripts placed in the /etc/nagwad/process-event.d/ directory
are run the same way as post-filter scripts (receiving the
<filter> <status> <message> arguments). However, the exit status
of a post-processing script doesn’t affect the invocation of other
post-processing scripts. The rest is up to the script. Non-executable
configuration files in /etc/nagwad/process-event.d/ are okay too.
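As an illustration of a post-processing script, the following Python sketch appends each event as one JSON line to a file. The output path and the record layout are hypothetical; nothing in the suite prescribes them.

```python
import json
import sys

# Hypothetical output location, chosen only for this example.
EVENT_LOG = "/var/tmp/nagwad-events.jsonl"


def write_record(stream, filter_name, status, message):
    """Write one event as a single JSON line to the given stream."""
    record = {"filter": filter_name, "status": status, "message": message}
    stream.write(json.dumps(record, sort_keys=True) + "\n")


# Post-processing scripts receive <filter> <status> <message> as arguments.
if __name__ == "__main__" and len(sys.argv) == 4:
    with open(EVENT_LOG, "a", encoding="utf-8") as log:
        write_record(log, *sys.argv[1:4])
```

Since the exit status of one post-processing script doesn’t affect the others, a failure to open the file here would not prevent the remaining scripts from running.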