Log files written by cyclog
and many of the alternatives one can use in place of it are sequences of zero or more variable-length records that end with linefeeds.
There are a number of tools that can be used to post-process the contents of log directories.
One can "follow" a log with the tail -F (not -f) command applied to its current file.
Note:
This does not "catch up" with old logs, and only processes the current file.
The -n +0 option can be added to follow from the beginning of the current file.
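A toy illustration of the -n +NUM addressing (the file name here is just a stand-in for a real current file):

```shell
# Create a stand-in "current" file with some illustrative content.
printf 'first\nsecond\nthird\n' > current
# -n +0 makes tail print from the very beginning of the file.
tail -n +0 current
# In real use one would also follow the live file:
#   tail -F -n +0 /var/log/sv/some-service/current
```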
Also note that tail has a race condition: it will miss data if (for whatever reason) rotation from one file to the next happens faster than tail can switch files to keep up, and tail then ends up skipping whole files.
(This is fairly easy to trigger in practice if the output of tail goes to something slow, such as a slowly scrolling remote terminal over a low-bandwidth or lossy network connection.)
GNU tail and BSD tail have a multiplicity of problems handling rotated log files.
One can convert the contents of a log file from external TAI64N form to human-readable timestamps in the current timezone with the tai64nlocal tool, which can be used both as a filter within a post-processing pipeline and as a straight file-reading utility.
With GNU awk, tai64nlocal can be used as an awk "co-process" to convert timestamps:
print |& "tai64nlocal"
"tai64nlocal" |& getline
Over the years, many log administration and analysis tools, from logstash to Sawmill (to name but two), have gained the ability to understand TAI64N directly, without needing tai64nlocal as an intermediary.
Log file directories are locked by the log writing programs with the conventional daemontools lockfile mechanism.
One can arrange to execute tasks, interlocked with the logging service not running, using the setlock
tool.
For example, this arranges to temporarily stop the log service connected to the local-syslog-read
service and archive a snapshot of its log directory:
setlock /var/log/sv/local-syslog-read/lock sh -c 'pax -w /var/log/sv/local-syslog-read/@*.[su] /var/log/sv/local-syslog-read/current > snapshot.tar' &
system-control condrestart cyclog@local-syslog-read
(Note the subtlety of wildcard expansion being deferred until setlock has acquired the lock.)
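The deferral comes from ordinary shell quoting rather than from setlock itself: the glob sits inside single quotes, so it is expanded by the inner sh -c only after the outer command has run. A stand-in demonstration, with illustrative file names and no setlock involved:

```shell
# Set up a couple of stand-in files.
mkdir -p demo
touch demo/a.log demo/b.log
# The quoted glob is expanded by the inner shell, not the outer one.
sh -c 'ls demo/*.log'
```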
All log lines begin with a timestamp; TAI64N timestamps sort lexically into chronological order; and each log file is by its nature already sorted into chronological order.
This means that log files are suitable for the -m option to the sort command, in order to merge sort multiple log files together into a single log.
The following example makes use of this to sort all of the last hour's logs from all (nosh managed, system-wide) services together:
find /var/log/sv/*/ -type f \( -name current -o -name '@*.[su]' \) -mmin -60 -print0 | xargs -0 sort -m -- | tai64nlocal | less -S +G
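A toy demonstration of why -m suffices here; the two files stand in for already-sorted log files, with made-up TAI64N-style labels:

```shell
# Two pre-sorted stand-in log files.
printf '@4000000000000001 one\n@4000000000000003 three\n' > a.log
printf '@4000000000000002 two\n@4000000000000004 four\n'  > b.log
# Each input is already sorted, so a pure merge yields chronological order.
sort -m a.log b.log
```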
The follow-log-directories and export-to-rsyslog tools understand the structure of log directories, and can thus do things that tail -F cannot:
They know the naming scheme for old rotated log files and know to scan through them, reading any that are newer than the point last recorded in their "log cursor", before reading the current file.
They know to skip through any given log file, to the next entry after the point last recorded in their "log cursor".
They know to read all of the way to the end of the current file before taking note of a rename notification triggered by a file rotation.
Their "log cursors" are persistent and tweakable, and not just transient state held within some kernel open file description that is lost when the process holding it open terminates. They will "remember" where they last left off if terminated and then re-invoked. An administrator can hand-edit the "cursor" file with the TAI64N timestamp of the desired place in the log to (re-)start from.
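A sketch of hand-setting such a cursor. The path here is purely hypothetical, since the real cursor location depends on how the service was set up; only the idea of writing a TAI64N timestamp into the cursor file comes from the description above:

```shell
# Hypothetical cursor location; adjust for the actual service setup.
cursor=./cursors/local-syslog-read
mkdir -p "$(dirname "$cursor")"
# An illustrative TAI64N label: @, 16 hex digits of seconds, 8 of nanoseconds.
printf '@4000000063b0cd0000000000\n' > "$cursor"
cat "$cursor"
```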
With export-to-rsyslog one can build a de-coupled log export service that exports a machine's logs over the network to an RSYSLOG server without skipping log entries. The log directory itself acts as a buffering mechanism, allowing the logged service to pull ahead of the export service, so that short-lived temporary network glitches do not block the logged service with a full output pipe.
With follow-log-directories one can build de-coupled log post-processing services that do things like pass logs through awk or perl scripts and perform tasks when log entries match particular patterns.
Again, the log directory itself acts as a buffer that allows the logged service to pull ahead if the tasks happen to take a long time.
There are many log post-processing tools for specific services, written by various people. Here are just a couple, in no particular order:
Skaarup's tools for analyzing the logs of various Bernstein software packages, from axfrdns to publicfile httpd.
Erwin Hoffmann's newanalyse that analyses qmail log files specifically.