Debian 10: System monitoring using e-mail (Exim as a smarthost)

Recently, IT infrastructure monitoring tools have been springing up like mushrooms after a rain. But let’s take a step back and look at a traditional and very basic way to monitor a system – using e-mail. Yes, you read that right; that internet application invented in the 1960s!

For some events and incidents on a Debian system, the superuser is still, by default, informed via e-mail. For example, mdadm shoots an e-mail to root when a software RAID array degrades. But by default, these e-mails are just dumped to a local mail spool file and never inspected. This is really bad when your hard disk has just died and you were never told that the other disk in the RAID array had already failed a couple of months earlier!

It has been 8 years since I last wrote about how to configure my favorite Mail Transport Agent, Exim. Back then, as an e-mail service provider, I wanted to run Exim as a production MTA: receiving, sending, and relaying e-mail on the internet. It was a daunting task, to say the least.

This time, however, I just want to run Exim locally, behind a firewall. It should receive locally submitted e-mail and, rather than dumping it to a local mail spool file, forward it to a remote e-mail address, where I can read it on my phone. This mode of an MTA is called smarthost.

A smarthost is a local e-mail MTA/server which accepts SMTP messages addressed to any e-mail address locally, and forwards them as an authenticated client to the next MTA (that is why it must be “smart”: it must have the right credentials). The submission must be authenticated (with a username and password), because your local IP address from your ISP is most likely on some e-mail blacklist, and internet-facing MTAs will refuse to accept e-mail from such addresses.

As always, accomplishing this was not entirely straightforward. But it is totally doable, and in a short time, too. This blog post documents the steps for Debian 10. They should continue to work in later Debian versions.

You will need:

  • A working e-mail account at an e-mail provider of your choice (SMTP server FQDN, port number, username and password). It is advisable to generate a dedicated username and password. Do not use your primary username and password! If the credentials leak or are stolen (they will be stored on the local hard drive, readable only by root), then you can simply disable the affected e-mail account without further impact. Furthermore, for security, we are going to require that the SMTP server offers a port featuring TLS-on-connect, as suggested by RFC-8314 (“cleartext considered obsolete”); see the check below this list.
  • Debian 10 (newer should work too, because the fundamentals rarely change)
  • Internet connection
  • 30 minutes of time
  • root access. Do all the following steps as root.
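
To verify that a given port really speaks TLS-on-connect (rather than STARTTLS), you can probe it with openssl; the host name and port below are placeholders for your provider’s values:

openssl s_client -connect mail.example.com:465

If the server immediately presents its certificate chain, the port speaks TLS-on-connect.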

First, make sure that the FQDN of the SMTP server of your e-mail provider is not a DNS alias, otherwise domain matching inside Exim will not work (Exim often uses reverse-DNS lookups to find actual FQDNs). Check this by entering:

host smtp.example.com

If this prints "smtp.example.com" is an alias for "mail.example.com", then resolve all aliases and choose the final FQDN. Another way to get the final FQDN is by doing a reverse DNS lookup on the server’s IP address, e.g. by running:

dig -x <ip address>

Use the printed FQDN to the right of “PTR”. In our example, this is mail.example.com.

Now, let’s actually start by installing Exim (exim4-daemon-light is enough):

apt install exim4

Next, configure exim4:

dpkg-reconfigure exim4-config

At the prompts, do the following:

  1. Select the option “mail sent by smarthost; no local mail”
  2. For “System mail name”, leave the pre-filled hostname or PQDN.
  3. “IP-addresses to listen on”: leave just “127.0.0.1”. We’re not going to use IPv6 right now.
  4. “Other destinations for which mail is accepted”: leave the pre-filled hostname or PQDN.
  5. “Visible domain name for local users”: leave the pre-filled hostname or PQDN.
  6. For “IP address or host name of the outgoing smarthost” enter the final FQDN which you have found previously, plus a double colon ::, plus the port number. For example: mail.example.com::465
  7. “Keep number of DNS-queries minimal”: leave the default “No”.
  8. “Split configuration into small files”: leave the default “No”.
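
These answers end up in the file /etc/exim4/update-exim4.conf.conf. For reference, after the dialog it should look roughly like this (an excerpt; the values match the example answers above, with “myhost” standing in for your mail name):

dc_eximconfig_configtype='smarthost'
dc_other_hostnames='myhost'
dc_local_interfaces='127.0.0.1'
dc_readhost=''
dc_smarthost='mail.example.com::465'
dc_use_split_config='false'
dc_minimaldns='false'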

Next, generate certificates for Exim by running the command below; these will be used for the TLS connections. For testing, you can accept the defaults for all the questions the command asks. Later, you could upgrade to proper certificates:

/usr/share/doc/exim4-base/examples/exim-gencert

Next, define the recipient e-mail address as an alias of the system user root. Add the following line to /etc/aliases:

root: recipient@example.com

The remote SMTP server may reject mail without a proper “From:” address. Usually, the server expects the address in the “From:” field to have the same domain name as the MTA itself. For this reason, add the following line to /etc/email-addresses:

root: sender@example.com

Next, add credentials for the SMTP submission to the file /etc/exim4/passwd.client in the format <FQDN>:<username>:<password>. For example:

mail.example.com:username:hackme
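
This file contains a secret, so make sure it is not readable by arbitrary users. On Debian, the Exim daemon runs in the group Debian-exim; the following (which should already be the packaging default) restricts access accordingly:

chown root:Debian-exim /etc/exim4/passwd.client
chmod 640 /etc/exim4/passwd.client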

The default configuration of Exim shipped in the Debian package is excellent and very flexible. But TLS is not enabled by default, and so we still need to make a few adaptations to the default config file.

Add the following lines to /etc/exim4/exim4.conf.localmacros. Create the file if it doesn’t exist:

# Enable TLS
MAIN_TLS_ENABLE = 1

# Require TLS for all remote hosts (STARTTLS or TLS-on-connect)
REMOTE_SMTP_SMARTHOST_HOSTS_REQUIRE_TLS = *

# Require TLS-on-connect for RFC-8314
REMOTE_SMTP_SMARTHOST_REQUIRE_PROTOCOL = smtps

Next, make the following adaptation to the file /etc/exim4/exim4.conf.template. After the block that starts with .ifdef REMOTE_SMTP_SMARTHOST_HOSTS_REQUIRE_TLS and ends with .endif, add the following:

.ifdef REMOTE_SMTP_SMARTHOST_REQUIRE_PROTOCOL
  protocol = REMOTE_SMTP_SMARTHOST_REQUIRE_PROTOCOL
.endif

Finally, regenerate the Exim configuration and restart the daemon; the final configuration file is written to /var/lib/exim4/config.autogenerated:
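
update-exim4.conf
systemctl restart exim4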

Testing with an Exim test instance

To test your setup, you can start a test instance of Exim in a local root console, listening on port 26. It will use the same configuration file, but run in parallel with, and independently of, the already running Exim daemon. Run as root:

exim -bd -d -oX 26

Then, you can use swaks (from the swaks Debian package) to send a test e-mail to the test instance. The output will be very verbose, so you can easily debug:

swaks --from root@localhost --to root@localhost --port 26

To clarify: This will send an e-mail to the address looked up under “root” in the file /etc/aliases (in our case, recipient@example.com). The “From:” header address will be looked up under “root” in the file /etc/email-addresses (in our case, sender@example.com).

See if you got the e-mail in the target inbox. If yes, then testing with the production Exim daemon should work too.

Testing with the Exim daemon

Observe the output of …

tail -f /var/log/exim4/mainlog

… and then again send a test e-mail using swaks, but this time to the default port 25:

swaks --from root@localhost --to root@localhost

If you got the e-mail, then congratulations! You will now receive e-mails directed at the local root user. You can easily extend this for other, unprivileged users of the system.

To test the entire chain and make sure that you will always be informed of important events, you could send a test e-mail at periodic intervals. But this is a topic for a future blog post!
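
In the meantime, here is a minimal sketch using cron; the file name, the schedule, and calling swaks directly are my assumptions, not a preview of that post:

# /etc/cron.d/mail-heartbeat (hypothetical): send a test e-mail every day at 08:00
0 8 * * * root swaks --from root@localhost --to root@localhost --header "Subject: [$(hostname)] heartbeat" >/dev/null 2>&1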

Scripting

You could wrap the above swaks command in a shell script to send e-mail from other scripts. For example:

#!/bin/sh
# Usage: <script> <subject> <body>
swaks --from root@localhost --to root@localhost --header "Subject: [$(hostname)] $1" --body "Body: $2"
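
It could then be called like this (the script name notify.sh is hypothetical):

./notify.sh "RAID event" "mdadm reports a degraded array on md0"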

Warning: You could also use the older tools sendmail or mail to write a similar script; both call exim4 directly under the hood. This is fine as long as you don’t call such a script from a systemd unit (e.g. a service or a timer). When Exim is called from the command line to submit e-mail, it forks and detaches a short-lived process in order to deliver the e-mail to the target MTA; but systemd kills all sub-processes as soon as the main process exits. The forked process lives just long enough for the e-mail to be put into Exim’s queue, but the delivery is never executed. swaks, on the other hand, delivers the mail to Exim via SMTP, so the delivery process is not subject to being killed by systemd.

You should also rate-limit Exim to protect against DoS attacks, but this is also a topic for a future blog post!

Save the (DOM) trees! Why direct DOM manipulation is not a bad idea

A generic, and so non-binary, unsorted, some labels duplicated, arbitrary diagram of a tree. CC BY-SA 4.0

Direct DOM manipulation has gotten a bad reputation in the last decade of web development. From Ruby on Rails to React, the DOM was seen as something to gloriously destroy and re-render from the server, or even from the browser. Never mind that the browser had already exerted a lot of effort parsing HTML and constructing this tree! Mind-numbingly complex regular-expression tests and manipulations of HTML strings had to deal with low-level details of HTML syntax to insert, delete, and change elements, sometimes on every keystroke! In contrast, functions like createElement, remove, and insertBefore from the DOM world were largely unknown and unused, except perhaps within jQuery.

Processing HTML this way is destructive: The original DOM is destroyed and garbage-collected with a certain time delay. Attached event handlers are detached and garbage-collected. A completely new DOM is created by parsing the new HTML set via .innerHTML =. Event listeners have to be re-attached from user-land (this is not an issue when using on* HTML attributes, but those have disadvantages as well).

It doesn’t have to be this way. Do not eliminate, but manipulate!

Save the (DOM) trees!

sanitize-dom crawls a DOM sub-tree (beginning from a given node, all the way down to its leaf descendants) and filters and manipulates it non-destructively. This is very efficient: The browser doesn’t have to re-render everything; it only re-renders what has changed (sound familiar from React?).

The benefits of direct DOM manipulation:

  • Nodes stay alive.
  • References to nodes (e.g. stored in a Map or WeakMap) stay alive.
  • Already attached event handlers stay alive.
  • The browser doesn’t have to re-render entire sections of a page; thus no flickering, no scroll jumping, no big CPU spikes.
  • CPU cycles for repeatedly parsing and dumping of HTML are eliminated.

sanitize-dom

I just released version 4 of my sanitize-dom library, which is my tool of choice for sanitizing user-generated content. As mentioned in this post, my project takes a different approach: It doesn’t work with HTML; it operates only on DOM nodes.

sanitize-dom’s further advantages:

  • No dependencies.
  • Small footprint (only about 7 kB minified).
  • Faster than other HTML sanitizers because there is no HTML parsing and serialization.

Check out sanitize-dom on GitHub!

Friction-less login to a tmux session on a suspended host

The Skating Minister by Henry Raeburn – Friction-less! (public domain)

I recently noticed that, more and more, I use technical solutions only when they are friction-less (I guess that is not unusual in general, but it is a bit unusual for me). This includes my work on a remote machine, which is suspended most of the time, and for which I’ve enabled WOL (Wake-on-LAN; see my related blog post on how I accomplished this).

I wanted a way to log into this machine regardless of whether it is suspended or not, and to continue working where I left off (i.e. not losing any state even when it goes to sleep or the network is interrupted). I usually do complex work in several shell sessions at once, with several files open and several processes running.

Below, you’ll find a shell script to accomplish this. You can customize the host name and the host’s MAC address directly in the script, and then call it without arguments.

What does the script do?

It first wakes the machine up using wakeonlan. If the machine is already awake, wakeonlan has no effect, and all is fine. If it is sleeping, it will come online in a few seconds. Using an exponential back-off strategy, the script polls SSH port 22; once the port is open, we know that SSH is ready to accept the login. It then tries to attach to an already existing tmux session, so I can continue working where I left off after the last SSH disconnection (simulate one by typing Enter ~ . in sequence) or tmux detachment (Ctrl+b d). If there is no running tmux session, it starts a new one, which also keeps me logged in with the shell configured for my user (bash or zsh).

And without further ado, here is the script:

https://gist.github.com/michaelfranzl/5a4dc6b6e450577c39522a7deca51358

#!/bin/bash

# Copyright 2020 Michael Karl Franzl

# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
# associated documentation files (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge, publish, distribute,
# sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all copies or
# substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT
# NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT
# OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

set -e

host_name="example.com"
host_mac="00:00:00:00:00:00"

echoerr() { printf "%s\n" "$*" >&2; }

wakeonlan ${host_mac}

# Poll SSH (port 22) with exponentially growing wait times (1 + 2 + 4 + 8 seconds).
# The last value, 16, acts as a sentinel meaning "give up".
for i in 1 2 4 8 16; do
  if [ $i -ge 16 ]; then
    echoerr "Host ${host_name} did not wake up. Giving up."
    exit 1
  fi

  echo "Waiting until host ${host_name} is awake..."
  nc -z "${host_name}" 22 && break
  sleep $i
done

echo "Host ${host_name} is now awake. Logging in..."
ssh -t ${host_name} "tmux a || tmux"

Negative niceness is supanice!

Did you ever want to effortlessly start a process with the highest (most negative) priority?

Like here, where I don’t want the high CPU load of other processes on my system to degrade my development experience (i.e. a zippy nvim):

supanice nvim

Here is the shell function accomplishing this:

function supanice {
  # nice -n -20 requests the highest scheduling priority (lowest niceness).
  # "$*" joins all arguments into the single command string expected by `su -c`.
  sudo --preserve-env=PATH nice -n -20 su -c "$*" "$(whoami)"
}

Vagrant hangs with message “Waiting for domain to get an IP address…”

The problem may be (it was in my case) that existing firewall rules prevent the NAT rules for the virbr0 network device, created by Vagrant via libvirt, from taking effect. The libvirt rules may look like the following (excerpt of iptables -L -n):

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            192.168.121.0/24     ctstate RELATED,ESTABLISHED
ACCEPT     all  --  192.168.121.0/24     0.0.0.0/0

A possible fix is to disable those restrictive rules (for example, clear all iptables rules before starting the Vagrant machine).
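
A blunt sketch of the latter; note that this removes your entire firewall, so only do it on a trusted network and restore your rules afterwards:

iptables -F
iptables -t nat -F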


Wake-on-LAN (WoL) in Linux is disabled by default. How to enable it without ethtool.

Accomplishing the simple often takes a day or two.

According to countless tutorials on the internet, a working WoL setup requires the following preconditions (point 4 below is rarely mentioned; that is why this article exists):

  1. An ethernet card supporting WoL
  2. Enabling WoL in the BIOS
  3. Enabling the WoL feature on the ethernet card (usually using the user space tool ethtool)
  4. Enabling the WoL feature in the Linux kernel (and this was the major pitfall for me)

Support of WoL can be checked for a given ethernet card in the following way:

ethtool <card_label> | grep Wake-on

According to my own experiments on my local hardware, WoL seems to always be enabled by default:

Supports Wake-on: pumbg
Wake-on: g

Because it already prints Wake-on: g, signalling that WoL is enabled (for Magic Packets), one is tempted to skip running ethtool -s <card_label> wol g, which would seem to achieve nothing new.

However, this ethtool command does at least one other thing; it is equivalent to also running the following:

echo enabled > /sys/class/net/<card_label>/device/power/wakeup

And the main pitfall is: After a reboot, cat /sys/class/net/<card_label>/device/power/wakeup will return disabled.

That is why attempts to make WoL work without ethtool may fail. The feature stays disabled in the Linux kernel, which will completely shut off the ethernet card in most powersave and power-off states. (I verified this by observing the LEDs of a connected ethernet switch.)

Conclusion

On modern Linux systems, given supporting hardware, it seems to be enough to write the string enabled to /sys/class/net/<card_label>/device/power/wakeup to make WoL work. ethtool doesn’t seem to be required any more for that, since Ethernet cards seem to already be configured properly by default. The Ethernet card will stay powered in most power-save and power-off states, so that it can receive Wake-on-LAN packets.
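
Because the wakeup flag resets to disabled after every reboot, it has to be re-applied at boot. Here is a minimal sketch using a systemd oneshot unit; the unit name and the interface name eth0 are assumptions for illustration:

# /etc/systemd/system/wol-enable.service (hypothetical)
[Unit]
Description=Enable the Wake-on-LAN wakeup flag

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'echo enabled > /sys/class/net/eth0/device/power/wakeup'

[Install]
WantedBy=multi-user.target

Enable it once with systemctl enable wol-enable.service.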

Resources

https://www.kernel.org/doc/html/v4.14/driver-api/pm/devices.html
https://www.kernel.org/doc/Documentation/power/states.txt

How to mount Google Drive in KDE’s Dolphin file manager

Note: This tutorial is based on Debian 10. It may work in similar environments.

While this is not a filesystem mount via the Linux kernel (such as the one I described in a previous blog post), KIO GDrive (part of KDE) enables KIO-aware applications (such as the Dolphin file manager, the Kate editor, or the Gwenview image viewer) to access, navigate, and edit Google Drive files.

kio-gdrive is available as a package in several Linux distributions. If it is installed, the Dolphin file manager gets an entry “Google Drive” under “Network”. There, an unprivileged desktop user can ‘mount’ a Google Drive account via a guided graphical configuration (during which the default browser is opened, where one needs to give KDE KAccounts permission to access the Google Drive account).

This method doesn’t provide access to Google Drive via a terminal, but it integrates nicely with a graphical desktop. The best part is that you don’t have to be root/superuser to do this, nor do you have to use the command line or write configuration files!

The steps are as follows:

  1. Install kio-gdrive (on Debian: apt install kio-gdrive). After installation, the Dolphin file manager should immediately get an entry called “Google Drive” under “Network”.
  2. Click on “New account” under “Google Drive”. A dialog window opens.
  3. Click on “+ Create” and then on “Google”. You will be asked to allow access to your Google Drive, and a small web frame should open.
  4. Enter your Google credentials and proceed until you are asked to give access to KDE KAccounts Provider.
  5. Give permission, and you will get back to the former dialog.
  6. Exit the dialog, and you will already be able to browse your files!

How to mount Google Drive as a file system in Linux

This was surprisingly simple thanks to the excellent google-drive-ocamlfuse project!

For Debian 10 “Buster”, the steps are as follows:

As root:

apt install opam

apt install libcurl4-gnutls-dev libfuse-dev libgmp-dev libsqlite3-dev m4 zlib1g-dev # dependencies for google-drive-ocamlfuse

Then, as unprivileged user, you can install google-drive-ocamlfuse into ~/.opam:

opam init
eval `opam config env` # set needed environment variables
opam update
opam install depext
opam depext google-drive-ocamlfuse
opam install google-drive-ocamlfuse

This compiles a native binary at ~/.opam/system/bin/google-drive-ocamlfuse.

The first time, simply run this binary without arguments:

google-drive-ocamlfuse

This will start your default browser where you have to authorize gdfuse to access your Google Drive.

Then, mounting your actual Google Drive is as simple as running:

google-drive-ocamlfuse ~/path/to/where/to/mount
ls -l ~/path/to/where/to/mount
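
To unmount later, use the standard FUSE command:

fusermount -u ~/path/to/where/to/mount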

Voila! Problem solved in 10 minutes!

Running a graphical window program via SSH on a remote machine (with GPU hardware acceleration)

Note 1: Even though it’s mid-2018, this post is still about the X Window System. Things still are in the transition phase towards Wayland, and things might get better or different over time.

Note 2: This post is not about displaying a graphical window of a program running on a remote machine on the local machine (like VNC or X forwarding). It is about running a remote program and displaying its graphical window on the remote machine itself, as if it had been directly started by a user sitting in front of the remote display. One obvious use case for the solution to this problem would be a remote graphics rendering farm, where programs must make use of the GPU hardware acceleration of the machine they’re running on.

Note that graphical programs started via Xvfb or via X login sessions on fake/software displays (started by some VNC servers) will not use GPU hardware acceleration. The project VirtualGL might be a viable solution too, but I haven’t looked into that yet.

Some experiments on localhost

I’m going to explore the behavior on localhost relative to our problem first. You’ll need to be logged in to an X graphical environment with a monitor attached.

The trivial case: No SSH login session

Running a local program with a graphical window from a local terminal on a local machine is trivial when you are logged into the graphical environment: For example, in a terminal, simply type glxgears and it will run and display with GPU hardware acceleration.

With SSH login session to the same user

Things become a bit more interesting when you use SSH to connect to your current user on localhost. Let’s say your local username is “me”. Try

ssh me@localhost
glxgears

It will output:

Error: couldn't open display (null)

This can be fixed by setting the DISPLAY variable to the same value that is set for the non-SSH session:

DISPLAY=:0 glxgears

Glxgears will run at this point.

With SSH login session to another user

Things become even more interesting when you SSH into some other local user on localhost, called “other” below.

ssh other@localhost
glxgears

You will get the message:

Error: couldn't open display (null)

Trying to export DISPLAY as before won’t help us now:

DISPLAY=:0 glxgears

You will receive the message:

No protocol specified 
Error: couldn't open display :0

This is now a permission problem. There are two solutions for it:

Solution 1: Relax permissions via the xhost program

To allow non-networked connections to the X server, you can run (as user “me” which is currently using the X environment):

xhost +local:

Then DISPLAY=:0 glxgears will start working as user “other”.

For security reasons, you should undo what you just did:

xhost -local:

Settings via xhost are not permanent across reboots.

Solution 2: Via the Xauthority file

If you don’t want or can’t use the xhost program, there is a second way (which I like better because it only involves files and file permissions):

User “me” has an environment variable XAUTHORITY; check it with env | grep XAUTHORITY:

XAUTHORITY=/run/user/1000/gdm/Xauthority

(I’m using the gdm display manager. The path could be different in your case.)

This file contains a secret which, for security reasons, is readable only by user “me”. As a quick test, make a world-readable copy of it in /tmp:

cp /run/user/1000/gdm/Xauthority /tmp/xauthority_me
chmod a+r /tmp/xauthority_me

Then, as user “other”:

DISPLAY=:0 XAUTHORITY=/tmp/xauthority_me glxgears

Glxgears will run again.

To make sure that we are using hardware acceleration, run glxinfo:

XAUTHORITY=/tmp/xauthority_me DISPLAY=:0 glxinfo | grep Device

This prints for me:

Device: Mesa DRI Intel(R) HD Graphics 630 (Kaby Lake GT2)  (0x5912)

Make sure you remove /tmp/xauthority_me after this test.

Note that the Xauthority file is different after each reboot. But it is trivial to make it available to other users in a secure way.
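
For example, here is a sketch that uses xauth to hand the magic cookie for display :0 to user “other”; the group name “xusers” is an assumption for illustration:

# As user "me": export the cookie for display :0
xauth -f /run/user/1000/gdm/Xauthority extract /tmp/xauth_me :0
chgrp xusers /tmp/xauth_me && chmod 640 /tmp/xauth_me

# As user "other" (a member of group "xusers"): import the cookie and test
xauth merge /tmp/xauth_me
DISPLAY=:0 glxgears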

Application on remote machine

If you were able to make things work on the local machine, the same steps should work on a remote machine, too. To clarify, the remote machine needs:

  • A real X login session active (you will likely need to set up auto-login in your display manager if the machine is not physically accessible).
  • A real monitor attached. Modern graphics cards and/or BIOSes simply shut down the GPU to save power when no real device is attached to the HDMI port. This is not Linux- or driver-specific. Instead of real monitors, you probably want to use “HDMI emulator” hardware plugs (they are cheap-ish and small). Otherwise, the graphical window might not even get painted into graphics memory; the usual symptom is a black screen when using VNC.

Summary

If you SSH-login into the remote machine, as the user that is currently logged in to the X graphical environment, you can just set the DISPLAY environment variable when running a program, and the program should show on the screen.

If you SSH-login into the remote machine, as a user that is not currently logged in to the X graphical environment, but some other user is, you can set both DISPLAY and XAUTHORITY environment variables as explained further above, and the program should show up on the screen.

Related Links

https://serverfault.com/questions/186805/remote-offscreen-rendering

https://stackoverflow.com/questions/6281998/can-i-run-glu-opengl-on-a-headless-server#8961649

https://superuser.com/questions/305220/issue-with-vnc-when-there-is-no-monitor

https://askubuntu.com/questions/453109/add-fake-display-when-no-monitor-is-plugged-in

https://software.intel.com/en-us/forums/intel-business-client-software-development/topic/279956

“Green Energy” complication: gdm3 suspends machine after 20 minutes: A solution

I just posted a bug report on https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=896083

The solution to the problem is further down in the bug report.

Dear Maintainer,

gdm3’s default dconf energy settings suspend the machine after 20 minutes.

This is independent of the power settings made by an unprivileged user within a Gnome login session.

While this could be forgiven on a locally accessible desktop machine, it also suspends remote/headless machines (e.g. in a data center). Activity on an SSH terminal or VNC connection does not prevent this issue. Because administrators have no easy way to re-wake remote machines, this may create highly inconvenient situations. In addition, unexpected suspension may have disastrous consequences, depending on the use of the machine.

To reproduce, install task-gnome-desktop and wait for 20 minutes on a machine which supports power management.

The offending settings can be printed to the console. As superuser:

su -s /bin/bash Debian-gdm

unset XDG_RUNTIME_DIR

dbus-launch gsettings get org.gnome.settings-daemon.plugins.power sleep-inactive-ac-type

dbus-launch gsettings get org.gnome.settings-daemon.plugins.power sleep-inactive-ac-timeout

This prints ‘suspend’ and ‘1200’, respectively.

For quicker reproduction of the problem, reduce the timeout to 2 minutes:

dbus-launch gsettings set org.gnome.settings-daemon.plugins.power sleep-inactive-ac-timeout 120

Then reboot and wait 2 minutes.

To turn off suspension, set:

dbus-launch gsettings set org.gnome.settings-daemon.plugins.power sleep-inactive-ac-type nothing

Regards,
Michael Franzl