Repairing a corrupt ext4 root partition

I ran into filesystem corruption (ext4) on the root partition of my backup server which caused it to go into read-only mode. Since it's the root partition, it's not possible to unmount it and repair it while it's running. Normally I would boot from an Ubuntu live CD / USB stick, but in this case the machine is using the mipsel architecture and so that's not an option.

Repair using a USB enclosure

I had to shut down the server and pull the SSD out. I then moved it to an external USB enclosure and connected it to my laptop.

I started with an automatic filesystem repair:

fsck.ext4 -pf /dev/sde2

which failed for some reason and so I moved to an interactive repair:

fsck.ext4 -f /dev/sde2
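When the automatic pass fails like this, the exit status of fsck.ext4 gives a hint as to why. The following is a minimal sketch (the describe_fsck_status helper is a hypothetical name, not part of e2fsprogs) that decodes the status bitmask documented in the e2fsck man page. In preen mode (-p), problems that need manual intervention set the "errors left uncorrected" bit.

```shell
#!/bin/sh
# Hypothetical helper: decode the bitmask that fsck.ext4 returns as its exit status.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    # Each entry is "bit:meaning", as listed in the e2fsck man page.
    for entry in \
        "1:errors corrected" \
        "2:system should be rebooted" \
        "4:errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:canceled by user request" \
        "128:shared-library error"
    do
        bit=${entry%%:*}
        if [ $(( status & bit )) -ne 0 ]; then
            echo "${entry#*:}"
        fi
    done
}

# Example: a status of 4 is what a failed automatic repair typically reports.
describe_fsck_status 4
```

In practice this would be called right after the repair attempt, for example with `fsck.ext4 -pf /dev/sde2; describe_fsck_status $?`.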

Once all of the errors were fixed, I ran a full surface scan to update the list of bad blocks:

fsck.ext4 -c /dev/sde2

Finally, I forced another check to make sure that everything was fixed at the filesystem level:

fsck.ext4 -f /dev/sde2

Fix invalid alternate GPT

The other thing I noticed is this message in my dmesg log:

scsi 8:0:0:0: Direct-Access     KINGSTON  SA400S37120     SBFK PQ: 0 ANSI: 6
sd 8:0:0:0: Attached scsi generic sg4 type 0
sd 8:0:0:0: [sde] 234441644 512-byte logical blocks: (120 GB/112 GiB)
sd 8:0:0:0: [sde] Write Protect is off
sd 8:0:0:0: [sde] Mode Sense: 31 00 00 00
sd 8:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 8:0:0:0: [sde] Optimal transfer size 33553920 bytes
Alternate GPT is invalid, using primary GPT.
 sde: sde1 sde2

I therefore checked to see if the partition table looked fine and got the following:

$ fdisk -l /dev/sde
GPT PMBR size mismatch (234441643 != 234441647) will be corrected by write.
The backup GPT table is not on the end of the device. This problem will be corrected by write.
Disk /dev/sde: 111.8 GiB, 120034123776 bytes, 234441648 sectors
Disk model: KINGSTON SA400S3
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 799CD830-526B-42CE-8EE7-8C94EF098D46

Device       Start       End   Sectors   Size Type
/dev/sde1     2048   8390655   8388608     4G Linux swap
/dev/sde2  8390656 234441614 226050959 107.8G Linux filesystem

It turns out that all I had to do, since only the backup / alternate GPT partition table was corrupt and the primary one was fine, was to re-write the partition table:

$ fdisk /dev/sde

Welcome to fdisk (util-linux 2.33.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

GPT PMBR size mismatch (234441643 != 234441647) will be corrected by write.
The backup GPT table is not on the end of the device. This problem will be corrected by write.

Command (m for help): w

The partition table has been altered.
Syncing disks.
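The numbers in the fdisk warning can be checked by hand: the backup GPT header is expected on the last sector of the disk, i.e. the sector count minus one. Here is a small sketch of that arithmetic (backup_gpt_lba is a made-up helper name), using the values from the output above.

```shell
#!/bin/sh
# Hypothetical helper: LBA where the backup GPT header is expected,
# which is the last sector of the disk.
backup_gpt_lba() {
    total_sectors=$1
    echo $(( total_sectors - 1 ))
}

# fdisk reported 234441648 sectors for /dev/sde.
expected=$(backup_gpt_lba 234441648)
echo "expected backup header LBA: $expected"

# The protective MBR recorded 234441643 instead; print the difference in sectors.
echo "mismatch in sectors: $(( expected - 234441643 ))"
```

The recorded value of 234441643 is one less than the 234441644 blocks shown in the dmesg output, which would be consistent with the table having been written when the disk appeared 4 sectors smaller, though that is only a guess.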

Run SMART checks

Since I still didn't know what caused the filesystem corruption in the first place, I decided to do one last check: SMART errors.

I couldn't do this via the USB enclosure since the SMART commands aren't forwarded to the drive and so I popped the drive back into the backup server and booted it up.

First, I checked whether any SMART errors had been reported using smartmontools:

smartctl -a /dev/sda

That didn't show any errors and so I kicked off an extended test:

smartctl -t long /dev/sda

which ran for 30 minutes and then passed without any errors.

The mystery remains unsolved.

Setting up and testing an NPR modem on Linux

After acquiring a pair of New Packet Radio modems on behalf of VECTOR, I set them up on my Linux machine and ran some basic tests to check whether they could achieve the advertised 500 kbps transfer rates, which are much higher than AX.25 packet radio.

The exact equipment I used was:

Radio setup

After connecting the modems to the power supply and their respective antennas, I connected both modems to my laptop via micro-USB cables and used minicom to connect to their console on /dev/ttyACM[01]:

minicom -8 -b 921600 -D /dev/ttyACM0
minicom -8 -b 921600 -D /dev/ttyACM1

To confirm that the firmware was the latest one, I used the following command:

ready> version
firmware: 2020_02_23
freq band: 70cm

then I immediately turned off the radio:

radio off

which can be verified with:

status

Following the British Columbia 70 cm band plan, I picked the following frequency, modulation (bandwidth of 360 kHz), and power (0.05 W):

set frequency 433.500
set modulation 22
set RF_power 7

and then did the rest of the configuration for the master:

set callsign VA7GPL_0
set is_master yes
set DHCP_active no
set telnet_active no

and the client:

set callsign VA7GPL_1
set is_master no
set DHCP_active yes
set telnet_active no

and that was enough to get the two modems to talk to one another.

On both of them, I ran the following:

save
reboot

and confirmed that they were able to successfully connect to each other:

who

Monitoring RF

To monitor what is happening on the air and quickly determine whether or not the modems are chatting, you can use a software-defined radio along with gqrx with the following settings:

frequency: 433.500 MHz
filter width: user (80k)
filter shape: normal
mode: Raw I/Q

I found it quite helpful to keep this running the whole time I was working with these modems. The background "keep alive" sounds are quite distinct from the heavy traffic sounds.

IP setup

The radio bits out of the way, I turned to the networking configuration.

On the master, I set the following so that I could connect the master to my home network (192.168.1.0/24) without conflicts:

set def_route_active yes
set DNS_active no
set modem_IP 192.168.1.254
set IP_begin 192.168.1.225
set master_IP_size 29
set netmask 255.255.255.0

(My router's DHCP server is configured to allocate dynamic IP addresses from 192.168.1.100 to 192.168.1.224.)
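To double-check that the modem's address pool does not collide with anything, here is a quick sketch of the arithmetic. It assumes that master_IP_size is the number of consecutive addresses handed out starting at IP_begin, and that the pool stays within a single /24. The pool_last_ip helper is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: last address of a pool of `size` addresses starting
# at `begin`, assuming only the last octet changes (same /24).
pool_last_ip() {
    begin=$1
    size=$2
    prefix=${begin%.*}        # e.g. 192.168.1
    first_octet=${begin##*.}  # e.g. 225
    echo "$prefix.$(( first_octet + size - 1 ))"
}

# Values from the master configuration above.
pool_last_ip 192.168.1.225 29
```

Under those assumptions the pool ends at 192.168.1.253, which sits between the end of the router's DHCP range (192.168.1.224) and the modem's own address (192.168.1.254).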

At this point, I connected my laptop to the client using a CAT-5 network cable and the master to the ethernet switch, essentially following Annex 5 of the Advanced User Guide.

My laptop got assigned IP address 192.168.1.225 and so I used another computer on the same network to ping my laptop via the NPR modems:

ping 192.168.1.225

This gave me a round-trip time of around 150-250 ms.

Performance test

Having successfully established an IP connection between the two machines, I decided to run a quick test to measure the available bandwidth in an ideal setting (i.e. the two antennas very close to each other).

On both computers, I installed iperf:

apt install iperf

and then set up the iperf server on my desktop computer:

sudo iptables -A INPUT -s 192.168.1.0/24 -p TCP --dport 5001 -j ACCEPT
sudo iptables -A INPUT -s 192.168.1.0/24 -p UDP --dport 5001 -j ACCEPT
iperf --server

On the laptop, I set the MTU to 750 in NetworkManager and restarted the network.

Then I created a new user account (npr with a uid of 1001):

sudo adduser npr

and made sure that only that account could access the network by running the following as root:

# Flush all chains.
iptables -F

# Set default policies.
iptables -P INPUT DROP
iptables -P OUTPUT DROP
iptables -P FORWARD DROP

# Don't block localhost and ICMP traffic.
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -p icmp -j ACCEPT
iptables -A OUTPUT -o lo -j ACCEPT

# Don't re-evaluate already accepted connections.
iptables -A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT

# Allow connections to/from the test user.
iptables -A OUTPUT -m owner --uid-owner 1001 -m conntrack --ctstate NEW -j ACCEPT

# Log anything that gets blocked.
iptables -A INPUT -j LOG
iptables -A OUTPUT -j LOG
iptables -A FORWARD -j LOG

then I started the test as the npr user:

sudo -i -u npr
iperf --client 192.168.1.8

Results

The results were as good as advertised both with modulation 22 (360 kHz bandwidth):

$ iperf --client 192.168.1.8 --time 30
------------------------------------------------------------
Client connecting to 192.168.1.8, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.225 port 58462 connected with 192.168.1.8 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-34.5 sec  1.12 MBytes   274 Kbits/sec

------------------------------------------------------------
Client connecting to 192.168.1.8, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.225 port 58468 connected with 192.168.1.8 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-42.5 sec  1.12 MBytes   222 Kbits/sec

------------------------------------------------------------
Client connecting to 192.168.1.8, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.225 port 58484 connected with 192.168.1.8 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-38.5 sec  1.12 MBytes   245 Kbits/sec

and modulation 24 (1 MHz bandwidth):

$ iperf --client 192.168.1.8 --time 30
------------------------------------------------------------
Client connecting to 192.168.1.8, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.225 port 58148 connected with 192.168.1.8 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-31.1 sec  1.88 MBytes   506 Kbits/sec

------------------------------------------------------------
Client connecting to 192.168.1.8, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.225 port 58246 connected with 192.168.1.8 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-30.5 sec  2.00 MBytes   550 Kbits/sec

------------------------------------------------------------
Client connecting to 192.168.1.8, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.225 port 58292 connected with 192.168.1.8 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-30.0 sec  2.00 MBytes   559 Kbits/sec
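To compare the two modulations at a glance, the three runs of each can be averaged. This is only a sketch (mean_kbits is a hypothetical helper), fed with the Bandwidth figures from the iperf output above.

```shell
#!/bin/sh
# Hypothetical helper: arithmetic mean of the given values, rounded to an integer.
mean_kbits() {
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.0f\n", sum / NR }'
}

# Modulation 22 (360 kHz bandwidth)
mean_kbits 274 222 245

# Modulation 24 (1 MHz bandwidth)
mean_kbits 506 550 559
```

That works out to roughly 247 Kbits/sec for modulation 22 and 538 Kbits/sec for modulation 24.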

Setting the default web browser on Debian and Ubuntu

If you are wondering what your default web browser is set to on a Debian-based system, there are several things to look at:

$ xdg-settings get default-web-browser
brave-browser.desktop

$ xdg-mime query default x-scheme-handler/http
brave-browser.desktop

$ xdg-mime query default x-scheme-handler/https
brave-browser.desktop

$ ls -l /etc/alternatives/x-www-browser
lrwxrwxrwx 1 root root 29 Jul  5  2019 /etc/alternatives/x-www-browser -> /usr/bin/brave-browser-stable*

$ ls -l /etc/alternatives/gnome-www-browser
lrwxrwxrwx 1 root root 29 Jul  5  2019 /etc/alternatives/gnome-www-browser -> /usr/bin/brave-browser-stable*

Debian-specific tools

The contents of /etc/alternatives/ are system-wide defaults and must therefore be set as root:

sudo update-alternatives --config x-www-browser
sudo update-alternatives --config gnome-www-browser

The sensible-browser tool (from the sensible-utils package) will use these to automatically launch the most appropriate web browser depending on the desktop environment.

Standard MIME tools

The others can be changed as a normal user. Using xdg-settings:

xdg-settings set default-web-browser brave-browser-beta.desktop

will also change what the two xdg-mime commands return:

$ xdg-mime query default x-scheme-handler/http
brave-browser-beta.desktop

$ xdg-mime query default x-scheme-handler/https
brave-browser-beta.desktop

since it puts the following in ~/.config/mimeapps.list:

[Default Applications]
text/html=brave-browser-beta.desktop
x-scheme-handler/http=brave-browser-beta.desktop
x-scheme-handler/https=brave-browser-beta.desktop
x-scheme-handler/about=brave-browser-beta.desktop
x-scheme-handler/unknown=brave-browser-beta.desktop
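If you want to inspect that file directly rather than go through xdg-mime, a short awk-based lookup is enough. This is a rough sketch only: default_app_for is a made-up name, and it only looks at the [Default Applications] section of a single file, whereas xdg-mime consults several locations.

```shell
#!/bin/sh
# Hypothetical helper: print the default application for a MIME type,
# reading only the [Default Applications] section of one mimeapps.list file.
default_app_for() {
    mime_type=$1
    list_file=${2:-$HOME/.config/mimeapps.list}
    awk -F= -v key="$mime_type" '
        # Track which section we are in.
        /^\[/ { in_defaults = ($0 == "[Default Applications]"); next }
        # First matching key inside the section wins.
        in_defaults && $1 == key { print $2; exit }
    ' "$list_file"
}

# Usage (reads ~/.config/mimeapps.list by default):
#   default_app_for x-scheme-handler/https
```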

Note that if you delete these entries, then the system-wide defaults, defined in /etc/mailcap, will be used, as provided by the mime-support package.

Changing the x-scheme-handler/http (or x-scheme-handler/https) association directly using:

xdg-mime default brave-browser-nightly.desktop x-scheme-handler/http

will only change that particular one. I suppose this means you could have one browser for insecure HTTP sites (hopefully with HTTPS Everywhere installed) and one for HTTPS sites though I'm not sure why anybody would want that.

Summary

In short, if you want to set your default browser everywhere (using Brave in this example), do the following:

sudo update-alternatives --config x-www-browser
sudo update-alternatives --config gnome-www-browser
xdg-settings set default-web-browser brave-browser.desktop

Extending GPG key expiry

Extending the expiry on a GPG key is not very hard, but it's easy to forget a step. Here's how I did my last expiry bump.

Update the expiry on the main key and the subkey:

gpg --edit-key KEYID
> expire
> key 1
> expire
> save

Upload the updated key to the keyservers:

gpg --export KEYID | curl -T - https://keys.openpgp.org
gpg --keyserver keyring.debian.org --send-keys KEYID

Automated MythTV-related maintenance tasks

Here is the daily/weekly cronjob I put together over the years to perform MythTV-related maintenance tasks on my backend server.

The first part performs a database backup:

5 1 * * *  mythtv  /usr/share/mythtv/mythconverg_backup.pl

which I previously configured by putting the following in /home/mythtv/.mythtv/backuprc:

DBBackupDirectory=/var/backups/mythtv

and creating a new directory for it:

mkdir /var/backups/mythtv
chown mythtv:mythtv /var/backups/mythtv

The second part of /etc/cron.d/mythtv-maintenance runs a contrib script to optimize the database tables:

10 1 * * *  mythtv  /usr/bin/chronic /usr/share/doc/mythtv-backend/contrib/maintenance/optimize_mythdb.pl

once a day. It requires the libmythtv-perl and libxml-simple-perl packages to be installed on Debian-based systems.

It is quickly followed by a check of the recordings and automatic repair of the seektable (when possible):

20 1 * * *  mythtv  /usr/bin/chronic /usr/bin/mythutil --checkrecordings --fixseektable

Next, I force a scan of the music and video databases to pick up anything new that may have been added externally via NFS mounts:

30 1 * * *  mythtv  /usr/bin/mythutil --quiet --scanvideos
31 1 * * *  mythtv  /usr/bin/mythutil --quiet --scanmusic

Finally, I defragment the XFS partition for two hours every day except Friday:

45 1 * * 1-4,6-7  root  /usr/sbin/xfs_fsr

and resync the RAID-1 arrays once a week to ensure that they stay consistent and error-free:

15 3 * * 2  root  /usr/local/sbin/raid_parity_check md0
15 3 * * 4  root  /usr/local/sbin/raid_parity_check md2

using a trivial script.

In addition to that cronjob, I also have smartmontools run daily short and weekly long SMART tests via this blurb in /etc/smartd.conf:

/dev/sda -a -d ata -o on -S on -s (S/../.././04|L/../../6/05)
/dev/sdb -a -d ata -o on -S on -s (S/../.././04|L/../../6/05)
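The -s argument is a regular expression matched against strings of the form T/MM/DD/d/HH (test type, month, day of month, day of week from 1 for Monday to 7 for Sunday, and hour). As a sketch, the schedule above can be tried against a few sample timestamps with grep. The smartd_schedule_matches helper is hypothetical, and anchoring the expression to the whole string is an assumption made here for clarity.

```shell
#!/bin/sh
# Hypothetical helper: does a T/MM/DD/d/HH string match a smartd -s schedule?
# Anchoring the whole string is an assumption of this sketch.
smartd_schedule_matches() {
    printf '%s\n' "$2" | grep -Eq "^($1)\$"
}

schedule='S/../.././04|L/../../6/05'

# Short test, any day, 04:00
smartd_schedule_matches "$schedule" 'S/03/15/2/04' && echo "short test runs"

# Long test, Saturday (day 6), 05:00
smartd_schedule_matches "$schedule" 'L/03/20/6/05' && echo "long test runs"

# Long test on a Tuesday: no match
smartd_schedule_matches "$schedule" 'L/03/16/2/05' || echo "no long test on Tuesday"
```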

If there are any other automated maintenance tasks you do on your MythTV server, please leave a comment!

Fixing locale problem in MythTV 30

After upgrading to MythTV 30, I noticed that the interface of mythfrontend switched from the French language to English, despite having the following in my ~/.xsession for the mythtv user:

export LANG=fr_CA.UTF-8
exec ~/bin/start_mythtv

I noticed a few related error messages in /var/log/syslog:

mythbackend[6606]: I CoreContext mythcorecontext.cpp:272 (Init) Assumed character encoding: fr_CA.UTF-8
mythbackend[6606]: N CoreContext mythcorecontext.cpp:1780 (InitLocale) Setting QT default locale to FR_US
mythbackend[6606]: I CoreContext mythcorecontext.cpp:1813 (SaveLocaleDefaults) Current locale FR_US
mythbackend[6606]: E CoreContext mythlocale.cpp:110 (LoadDefaultsFromXML) No locale defaults file for FR_US, skipping
mythpreviewgen[9371]: N CoreContext mythcorecontext.cpp:1780 (InitLocale) Setting QT default locale to FR_US
mythpreviewgen[9371]: I CoreContext mythcorecontext.cpp:1813 (SaveLocaleDefaults) Current locale FR_US
mythpreviewgen[9371]: E CoreContext mythlocale.cpp:110 (LoadDefaultsFromXML) No locale defaults file for FR_US, skipping

Searching for that non-existent fr_US locale, I found that others have this in their logs and that it's apparently set by Qt as a combination of the language and country codes.

I therefore looked in the database and found the following:

MariaDB [mythconverg]> SELECT value, data FROM settings WHERE value = 'Language';
+----------+------+
| value    | data |
+----------+------+
| Language | FR   |
+----------+------+
1 row in set (0.000 sec)

MariaDB [mythconverg]> SELECT value, data FROM settings WHERE value = 'Country';
+---------+------+
| value   | data |
+---------+------+
| Country | US   |
+---------+------+
1 row in set (0.000 sec)

which explains the nonsensical FR_US locale.

I fixed the country setting like this:

MariaDB [mythconverg]> UPDATE settings SET data = 'CA' WHERE value = 'Country';
Query OK, 1 row affected (0.093 sec)
Rows matched: 1  Changed: 1  Warnings: 0

After logging out and logging back in, the user interface of the frontend is now using the fr_CA locale again and the database setting looks good:

MariaDB [mythconverg]> SELECT value, data FROM settings WHERE value = 'Country';
+---------+------+
| value   | data |
+---------+------+
| Country | CA   |
+---------+------+
1 row in set (0.000 sec)

Printing hard-to-print PDFs on Linux

I recently found a few PDFs which I was unable to print because they caused insufficient printer memory errors.

I found a detailed explanation of what might be causing this which pointed the finger at transparent images, a PDF 1.4 feature which apparently requires a more recent version of PostScript than what my printer supports.

Using Okular's Force rasterization option (accessible via the print dialog) does work by essentially rendering everything ahead of time and outputting a big image to be sent to the printer. The quality is not very good, however.

Converting a PDF to DjVu

The best solution I found makes use of a different file format: .djvu

Such files are not PDFs, but can still be opened in Evince and Okular, as well as in the dedicated DjVuLibre application.

As an example, I was unable to print page 11 of this paper. Using pdfinfo, I found that it is in PDF 1.5 format and so the transparency effects could be the cause of the out-of-memory printer error.

Here's how I converted it to a high-quality DjVu file I could print without problems using Evince:

pdf2djvu -d 1200 2002.04049.pdf > 2002.04049-1200dpi.djvu

Converting a PDF to PDF 1.3

I also tried the DjVu trick on a different unprintable PDF, but it failed to print, even after lowering the resolution to 600dpi:

pdf2djvu -d 600 dow-faq_v1.1.pdf > dow-faq_v1.1-600dpi.djvu

In this case, I used a different technique and simply converted the PDF to version 1.3 (from version 1.6 according to pdfinfo):

ps2pdf13 -r1200x1200 dow-faq_v1.1.pdf dow-faq_v1.1-1200dpi.pdf

This eliminates the problematic transparency and rasterizes the elements that version 1.3 doesn't support.

Displaying client IP address using Apache Server-Side Includes

If you use a Dynamic DNS setup to reach machines which are not behind a stable IP address, you will likely have a need to probe these machines' public IP addresses. One option is to use an insecure service like Oracle's http://checkip.dyndns.com/ which echoes back your client IP, but you can also do this on your own server if you have one.

There are multiple options to do this, like writing a CGI or PHP script, but those are fairly heavyweight if that's all you need mod_cgi or PHP for. Instead, I decided to use Apache's built-in Server-Side Includes.

Apache configuration

Start by turning on the include filter by adding the following in /etc/apache2/conf-available/ssi.conf:

AddType text/html .shtml
AddOutputFilter INCLUDES .shtml

and making that configuration file active:

a2enconf ssi

Then, find the vhost file where you want to enable SSI and add the following options to a Location or Directory section:

<Location /ssi_files>
    Options +IncludesNOEXEC
    SSLRequireSSL
    Header set Content-Security-Policy: "default-src 'none'"
    Header set X-Content-Type-Options: "nosniff"
    Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
</Location>

before adding the necessary modules:

a2enmod headers
a2enmod include

and restarting Apache:

apache2ctl configtest && systemctl restart apache2.service

Create an shtml page

With the web server ready to process SSI instructions, the following HTML blurb can be used to display the client IP address:

<!--#echo var="REMOTE_ADDR" -->

or any other built-in variable.

Note that you don't need to write valid HTML for the variable to be substituted and so the above one-liner is all I use on my server.

Security concerns

The first thing to note is that the configuration section uses the IncludesNOEXEC option in order to disable arbitrary command execution via SSI. In addition, you can also make sure that the cgi module is disabled since that's a dependency of the more dangerous side of SSI:

a2dismod cgi

Of course, if you rely on this IP address to be accurate, for example because you'll be putting it in your DNS, then you should make sure that you only serve this page over HTTPS, which can be enforced via the SSLRequireSSL directive.
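On the client side, it is also worth checking that the response really looks like an IP address before feeding it to a DNS update. Below is a minimal sketch that only handles IPv4. The is_ipv4 helper and the URL in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the argument is a dotted-quad IPv4
# address with every field between 0 and 255.
is_ipv4() {
    printf '%s\n' "$1" | awk -F. '
        NF != 4 { exit 1 }
        {
            for (i = 1; i <= 4; i++)
                if ($i !~ /^[0-9]+$/ || $i > 255)
                    exit 1
        }
    '
}

# Usage (the URL is a placeholder for wherever the .shtml page is served):
#   ip=$(curl -s https://example.com/ssi_files/whatsmyip.shtml)
#   is_ipv4 "$ip" && echo "updating DNS with $ip"
```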

I included two other headers in the above vhost config (Content-Security-Policy and X-Content-Type-Options) in order to limit the damage that could be done in case a malicious file was accidentally dropped in that directory.

Finally, I suggest making sure that only the root user has write access to the directory which has server-side includes enabled:

$ ls -la /var/www/ssi_includes/
total 12
drwxr-xr-x  2 root     root     4096 May 18 15:58 .
drwxr-xr-x 16 root     root     4096 May 18 15:40 ..
-rw-r--r--  1 root     root        0 May 18 15:46 index.html
-rw-r--r--  1 root     root       32 May 18 15:58 whatsmyip.shtml

Backing up to a GnuBee PC 2

After installing Debian buster on my GnuBee, I set it up for receiving backups from my other computers.

Software setup

I started by configuring it like a typical server but without a few packages that take a lot of either memory or CPU.

I changed the default hostname:

  • /etc/hostname: foobar
  • /etc/mailname: foobar.example.com
  • /etc/hosts: 127.0.0.1 foobar.example.com foobar localhost

and then installed the avahi-daemon package to be able to reach this box using foobar.local.

I noticed the presence of a world-writable directory and so I tightened the security of some of the default mount points by putting the following in /etc/rc.local:

chmod 755 /etc/network
exit 0

Hardware setup

My OS drive (/dev/sda) is a small SSD so that the GnuBee can run silently when the spinning disks aren't needed. To hold the backup data on the other hand, I got three 4-TB drives which I set up in a RAID-5 array. If the data were valuable, I'd use RAID-6 instead since it can survive two drives failing at the same time, but in this case since it's only holding backups, I'd have to lose the original machine at the same time as two of the 3 drives, a very unlikely scenario.

I created new GPT partition tables on /dev/sdb, /dev/sdc and /dev/sdd, and used fdisk to create a single partition of type 29 (Linux RAID) on each of them.

Then I created the RAID array:

mdadm /dev/md127 --create -n 3 --level=raid5 -a /dev/sdb1 /dev/sdc1 /dev/sdd1

and waited more than 24 hours for that operation to finish. Next, I formatted the array:

mkfs.ext4 -m 0 /dev/md127

and added the following to /etc/fstab:

/dev/md127 /mnt/data/ ext4 noatime,nodiratime 0 2
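For reference, the usable capacity of such an array is easy to estimate: RAID-5 gives up one drive's worth of space to parity and RAID-6 gives up two. The sketch below (raid_usable_tb is a made-up helper) ignores filesystem and metadata overhead.

```shell
#!/bin/sh
# Hypothetical helper: approximate usable capacity (in TB) of a RAID-5 or
# RAID-6 array built from identical drives, ignoring metadata overhead.
raid_usable_tb() {
    level=$1
    drives=$2
    size_tb=$3
    case "$level" in
        5) echo $(( (drives - 1) * size_tb )) ;;
        6) echo $(( (drives - 2) * size_tb )) ;;
        *) echo "unsupported RAID level: $level" >&2; return 1 ;;
    esac
}

# Three 4-TB drives in RAID-5, as used here.
raid_usable_tb 5 3 4

# RAID-6 needs at least four drives; four 4-TB drives give the same capacity.
raid_usable_tb 6 4 4
```

Both cases come out at 8 TB, so the RAID-6 alternative would have cost one extra drive for the same amount of space.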

Keeping a copy of the root partition

In order to survive a failing SSD drive, I could have bought a second SSD and gone for a RAID-1 setup. Instead, I went for a cheaper option, a poor man's RAID-1, where I will have to reinstall the machine but it will be very quick and I won't lose any of my configuration.

The way that it works is that I periodically sync the contents of the root partition onto the RAID-5 array using a cronjob in /etc/cron.d/hdd-sync:

0 10 * * *     root    /usr/local/sbin/ssd_root_backup

which runs the /usr/local/sbin/ssd_root_backup script:

#!/bin/sh
nocache nice ionice -c3 rsync -aHx --delete --exclude=/dev/* --exclude=/proc/* --exclude=/sys/* --exclude=/tmp/* --exclude=/mnt/* --exclude=/lost+found/* --exclude=/media/* --exclude=/var/tmp/* /* /mnt/data/root/

Drive spin down

To reduce unnecessary noise and power consumption, I also installed hdparm:

apt install hdparm

and configured all spinning drives to spin down after being idle for 10 minutes by putting the following in /etc/hdparm.conf:

/dev/sdb {
       spindown_time = 120
}

/dev/sdc {
       spindown_time = 120
}

/dev/sdd {
       spindown_time = 120
}

and then reloaded the configuration:

 /usr/lib/pm-utils/power.d/95hdparm-apm resume
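The spindown_time value is not expressed in minutes: according to the hdparm man page, values from 1 to 240 are multiples of 5 seconds and values from 241 to 251 are multiples of 30 minutes. Here is a small sketch of that conversion (spindown_seconds is a hypothetical helper, and the special values outside those two ranges are deliberately left out).

```shell
#!/bin/sh
# Hypothetical helper: convert an hdparm -S / spindown_time value to seconds.
# Only the two regular ranges are handled; 0 and 252-255 are special cases.
spindown_seconds() {
    value=$1
    if [ "$value" -ge 1 ] && [ "$value" -le 240 ]; then
        echo $(( value * 5 ))
    elif [ "$value" -ge 241 ] && [ "$value" -le 251 ]; then
        echo $(( (value - 240) * 30 * 60 ))
    else
        echo "unhandled value: $value" >&2
        return 1
    fi
}

# The value used above: 120 * 5 s = 600 s, i.e. 10 minutes.
spindown_seconds 120
```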

Monitoring drive health

Finally, I set up smartmontools by putting the following in /etc/smartd.conf:

/dev/sda -a -o on -S on -s (S/../.././02|L/../../6/03)
/dev/sdb -a -o on -S on -s (S/../.././02|L/../../6/03)
/dev/sdc -a -o on -S on -s (S/../.././02|L/../../6/03)
/dev/sdd -a -o on -S on -s (S/../.././02|L/../../6/03)

and restarting the daemon:

systemctl restart smartd.service

Backup setup

I started by using duplicity since I have been using that tool for many years, but a 190GB backup took around 15 hours on the GnuBee with gigabit ethernet.

After a friend suggested it, I took a look at restic and I have to say that I am impressed. The same backup finished in about half the time.
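To put those durations in perspective, here is a back-of-the-envelope throughput calculation. It is a sketch only: throughput_mb_per_s is a hypothetical helper, it assumes decimal units (1 GB = 1000 MB), and the 7.5 hours is simply half of the duplicity figure rather than a measured value.

```shell
#!/bin/sh
# Hypothetical helper: average throughput in MB/s for a backup of `gb`
# gigabytes taking `hours` hours (decimal units assumed).
throughput_mb_per_s() {
    awk -v gb="$1" -v hours="$2" \
        'BEGIN { printf "%.1f\n", gb * 1000 / (hours * 3600) }'
}

# duplicity: 190 GB in around 15 hours
throughput_mb_per_s 190 15

# restic: about half the time
throughput_mb_per_s 190 7.5
```

That is roughly 3.5 MB/s versus 7.0 MB/s, both far below what gigabit ethernet can carry, which suggests the GnuBee itself is the limiting factor.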

User and ssh setup

After hardening the ssh setup as I usually do, I created a user account for each machine needing to back up onto the GnuBee:

adduser machine1
adduser machine1 sshuser
adduser machine1 sftponly
chsh machine1 -s /bin/false

and then matching directories under /mnt/data/home/:

mkdir /mnt/data/home/machine1
chown machine1:machine1 /mnt/data/home/machine1
chmod 700 /mnt/data/home/machine1

Then I created a custom ssh key for each machine:

ssh-keygen -f /root/.ssh/foobar_backups -t ed25519

and placed the public key in /home/machine1/.ssh/authorized_keys on the GnuBee.

On each machine, I added the following to /root/.ssh/config:

Host foobar.local
    User machine1
    Compression no
    Ciphers aes128-ctr
    IdentityFile /root/backup/foobar_backups
    IdentitiesOnly yes
    ServerAliveInterval 60
    ServerAliveCountMax 240

The reason for setting the ssh cipher and disabling compression is to speed up the ssh connection as much as possible given that the GnuBee has a very small RAM bandwidth.

Another performance-related change I made on the GnuBee was switching to the internal sftp server by putting the following in /etc/ssh/sshd_config:

Subsystem      sftp    internal-sftp

Restic script

After reading through the excellent restic documentation, I wrote the following backup script, based on my old duplicity script, to reuse on all of my computers:

# Configure for each host
PASSWORD="XXXX"  # use `pwgen -s 64` to generate a good random password
BACKUP_HOME="/root/backup"
REMOTE_URL="sftp:foobar.local:"
RETENTION_POLICY="--keep-daily 7 --keep-weekly 4 --keep-monthly 12 --keep-yearly 2"

# Internal variables
SSH_IDENTITY="IdentityFile=$BACKUP_HOME/foobar_backups"
EXCLUDE_FILE="$BACKUP_HOME/exclude"
PKG_FILE="$BACKUP_HOME/dpkg-selections"
PARTITION_FILE="$BACKUP_HOME/partitions"

# If the list of files has been requested, only do that
if [ "$1" = "--list-current-files" ]; then
    RESTIC_PASSWORD=$PASSWORD restic --quiet -r $REMOTE_URL ls latest
    exit 0

# Show list of available snapshots
elif [ "$1" = "--list-snapshots" ]; then
    RESTIC_PASSWORD=$PASSWORD restic --quiet -r $REMOTE_URL snapshots
    exit 0

# Restore the given file
elif [ "$1" = "--file-to-restore" ]; then
    if [ "$2" = "" ]; then
        echo "You must specify a file to restore"
        exit 2
    fi
    RESTORE_DIR="$(mktemp -d ./restored_XXXXXXXX)"
    RESTIC_PASSWORD=$PASSWORD restic --quiet -r $REMOTE_URL restore latest --target "$RESTORE_DIR" --include "$2" || exit 1
    echo "$2 was restored to $RESTORE_DIR"
    exit 0

# Delete old backups
elif [ "$1" = "--prune" ]; then
    # Expire old backups
    RESTIC_PASSWORD=$PASSWORD restic --quiet -r $REMOTE_URL forget $RETENTION_POLICY

    # Delete files which are no longer necessary (slow)
    RESTIC_PASSWORD=$PASSWORD restic --quiet -r $REMOTE_URL prune
    exit 0

# Catch invalid arguments
elif [ "$1" != "" ]; then
    echo "Invalid argument: $1"
    exit 1
fi

# Check the integrity of existing backups
RESTIC_PASSWORD=$PASSWORD restic --quiet -r $REMOTE_URL check || exit 1

# Dump list of Debian packages
dpkg --get-selections > $PKG_FILE

# Dump partition tables from harddrives
/sbin/fdisk -l /dev/sda > $PARTITION_FILE
/sbin/fdisk -l /dev/sdb >> $PARTITION_FILE

# Do the actual backup
RESTIC_PASSWORD=$PASSWORD restic --quiet --cleanup-cache -r $REMOTE_URL backup / --exclude-file $EXCLUDE_FILE

I run it with the following cronjob in /etc/cron.d/backups:

30 8 * * *    root  ionice nice nocache /root/backup/backup-machine1-to-foobar
30 2 * * Sun  root  ionice nice nocache /root/backup/backup-machine1-to-foobar --prune

in a way that doesn't impact the rest of the system too much.

Finally, I printed a copy of each of my backup scripts, using enscript, to stash in a safe place:

enscript --highlight=bash --style=emacs --output=- backup-machine1-to-foobar | ps2pdf - > foobar.pdf

This is actually a pretty important step since without the password, you won't be able to decrypt and restore what's on the GnuBee.

Disabling mail sending from your domain

I noticed that I was receiving some bounced email notifications for a domain (cloud.geek.nz) that I own to host my blog. These notifications were all for spam messages spoofing the From address since I do not use that domain for email.

I decided to try setting a strict DMARC policy to see if DMARC-using mail servers (e.g. Gmail) would then drop these spoofed emails without notifying me about it.

I started by setting this initial DMARC policy in DNS in order to monitor the change:

@ TXT v=spf1 -all
_dmarc TXT v=DMARC1; p=none; ruf=mailto:dmarc@fmarier.org; sp=none; aspf=s; fo=0:1:d:s;

Then I waited three weeks without receiving anything before updating the relevant DNS records to this final DMARC policy:

@ TXT v=spf1 -all
_dmarc TXT v=DMARC1; p=reject; sp=reject; aspf=s;

This policy states that nobody is allowed to send emails for this domain and that any incoming email claiming to be from this domain should be silently rejected.
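When comparing the monitoring and final records, it can help to pull out individual tags. The following is a rough sketch (dmarc_tag is a hypothetical helper) that splits a DMARC record on semicolons and prints the value of one tag. It assumes the record is passed in as a plain string without surrounding quotes.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one tag from a DMARC TXT record.
dmarc_tag() {
    tag=$1
    record=$2
    # One tag per line, then strip the leading spaces and "tag=" prefix.
    printf '%s\n' "$record" | tr ';' '\n' | sed -n "s/^ *$tag=//p"
}

initial='v=DMARC1; p=none; ruf=mailto:dmarc@fmarier.org; sp=none; aspf=s; fo=0:1:d:s;'
final='v=DMARC1; p=reject; sp=reject; aspf=s;'

dmarc_tag p "$initial"
dmarc_tag p "$final"
```

The first call prints none and the second prints reject, which is the difference between merely monitoring and actually asking receivers to refuse the spoofed messages.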

I haven't noticed any bounce notifications for messages spoofing this domain in a while, so maybe it's working?