I recently upgraded from Ubuntu 18.04.5 (bionic) to 20.04.1 (focal) and it was one of the roughest Ubuntu upgrades I've gone through in a while. Here are the notes I took on avoiding or fixing the problems I ran into.
Preparation
Before going through the upgrade, I disabled the configurations which I know interfere with the process:
Enable etckeeper auto-commits before install by putting the following in /etc/etckeeper/etckeeper.conf:
AVOID_COMMIT_BEFORE_INSTALL=0
Remount /tmp as executable:
mount -o remount,exec /tmp
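Before remounting, it can help to check whether /tmp is mounted noexec at all. This is a minimal sketch, assuming findmnt (from util-linux) is installed; the has_noexec helper is a hypothetical name introduced for illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the comma-separated mount options
# passed as $1 contain "noexec".
has_noexec() {
    case ",$1," in
        *,noexec,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Look up the options of the filesystem holding /tmp
# (left empty if findmnt is missing or fails).
opts="$(findmnt -n -o OPTIONS --target /tmp 2>/dev/null || true)"
if has_noexec "$opts"; then
    echo "/tmp is mounted noexec: remount it with exec before upgrading"
fi
```

If nothing is printed, /tmp is probably already executable and the remount step can be skipped.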
Another step I should have taken, but didn't, was to temporarily remove safe-rm, since it caused some problems related to a Perl upgrade happening at the same time:
apt remove safe-rm
Network problems
After the upgrade, my network settings weren't really working properly and so I started by switching from ifupdown to netplan.io which seems to be the preferred way of configuring the network on Ubuntu now.
Then I found out that netplan.io does not automatically enable the systemd-resolved handling of .local hostnames.
I would be able to resolve a hostname using avahi:
$ avahi-resolve --name machine.local
machine.local 192.168.1.5
but not with systemd:
$ systemd-resolve machine.local
machine.local: resolve call failed: 'machine.local' not found
$ resolvectl mdns
Global: no
Link 2 (enp4s0): no
The best solution I found involves keeping systemd-resolved and its /etc/resolv.conf symlink to /run/systemd/resolve/stub-resolv.conf.
I added the following in a new /etc/NetworkManager/conf.d/mdns.conf file:
[connection]
connection.mdns=1
which instructs NetworkManager to resolve mDNS on all network interfaces it manages but not register a hostname since that's done by avahi-daemon.
Then I enabled mDNS globally in systemd-resolved by setting the following in /etc/systemd/resolved.conf:
MulticastDNS=yes
before restarting both services:
systemctl restart NetworkManager.service systemd-resolved.service
With that in place, .local hostnames are resolved properly and I can see that mDNS is fully enabled:
$ resolvectl mdns
Global: yes
Link 2 (enp4s0): yes
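To spot quickly which scopes still have mDNS turned off, the resolvectl output can be filtered. This is a rough sketch, assuming the output keeps the "Scope: value" layout shown above; mdns_disabled is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read `resolvectl mdns` output on stdin and print
# every scope (Global or a Link) whose mDNS setting is "no".
mdns_disabled() {
    awk -F': *' '{ sub(/[ \t]+$/, "", $2) } $2 == "no" { print $1 }'
}

# Only query the live system when resolvectl is present.
if command -v resolvectl >/dev/null 2>&1; then
    resolvectl mdns 2>/dev/null | mdns_disabled
fi
```

An empty result means that no scope reports mDNS as disabled.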
Boot problems
For some reason I was able to boot with the kernel I got as part of the focal update, but a later kernel update rendered my machine unbootable.
Adding some missing RAID-related modules to /etc/initramfs-tools/modules:
raid1
dmraid
md-raid1
and then re-creating all initramfs:
update-initramfs -u -k all
seemed to do the trick.
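If this step ends up in a provisioning script, it is convenient to make it repeatable. This is a small sketch of one way to do that; add_module is a hypothetical helper, not part of initramfs-tools:

```shell
#!/bin/sh
# Hypothetical helper: append module $2 to file $1 unless an identical
# line is already there, so the step can be run more than once.
add_module() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Example usage (as root), commented out because it edits system files:
# for m in raid1 dmraid md-raid1; do
#     add_module /etc/initramfs-tools/modules "$m"
# done
# update-initramfs -u -k all
```

Afterwards, listing the image with lsinitramfs and searching for "raid" is one way to confirm that the modules were included.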
As part of the #MoreOnionsPorFavor campaign, I decided to follow brave.com's lead and make my homepage available as a Tor onion service.
Tor daemon setup
I started by installing the Tor daemon locally:
apt install tor
and then setting the following in /etc/tor/torrc:
SocksPort 0
SocksPolicy reject *
HiddenServiceDir /var/lib/tor/hidden_service/
HiddenServicePort 80 [2600:3c04::f03c:91ff:fe8c:61ac]:80
HiddenServicePort 443 [2600:3c04::f03c:91ff:fe8c:61ac]:443
HiddenServiceVersion 3
HiddenServiceNonAnonymousMode 1
HiddenServiceSingleHopMode 1
in order to create a version 3 onion service without actually running a Tor relay.
Note that since I am making a public website available over Tor, I do not need the location of the website to be hidden and so I used the same settings as Cloudflare in their public Tor proxy.
Also, I explicitly used the external IPv6 address of my server in the configuration in order to prevent localhost bypasses.
After restarting the Tor daemon to reload the configuration file:
systemctl restart tor.service
I looked for the address of my onion service:
$ cat /var/lib/tor/hidden_service/hostname
ixrdj3iwwhkuau5tby5jh3a536a2rdhpbdbu6ldhng43r47kim7a3lid.onion
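A version 3 onion hostname is 56 base32 characters followed by ".onion", so a quick shape check can confirm that the daemon generated a v3 address rather than an old short one. This is a minimal sketch; is_v3_onion is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if $1 looks like a version 3 onion
# hostname, i.e. 56 base32 characters (a-z, 2-7) followed by ".onion".
is_v3_onion() {
    printf '%s\n' "$1" | grep -Eq '^[a-z2-7]{56}\.onion$'
}

# Example usage (as root, since the directory is private to the Tor user):
# is_v3_onion "$(cat /var/lib/tor/hidden_service/hostname)" && echo "v3 address"
```

This only validates the format of the name, not that the service is reachable.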
Apache configuration
Next, I enabled a few required Apache modules:
a2enmod mpm_event
a2enmod http2
a2enmod headers
and configured my Apache vhosts in /etc/apache2/sites-enabled/www.conf:
<VirtualHost *:443>
ServerName fmarier.org
ServerAlias ixrdj3iwwhkuau5tby5jh3a536a2rdhpbdbu6ldhng43r47kim7a3lid.onion
Protocols h2, http/1.1
Header set Onion-Location "http://ixrdj3iwwhkuau5tby5jh3a536a2rdhpbdbu6ldhng43r47kim7a3lid.onion%{REQUEST_URI}s"
Header set alt-svc 'h2="ixrdj3iwwhkuau5tby5jh3a536a2rdhpbdbu6ldhng43r47kim7a3lid.onion:443"; ma=315360000; persist=1'
Header add Strict-Transport-Security: "max-age=63072000"
Include /etc/fmarier-org/www-common.include
SSLEngine On
SSLCertificateFile /etc/letsencrypt/live/fmarier.org/fullchain.pem
SSLCertificateKeyFile /etc/letsencrypt/live/fmarier.org/privkey.pem
</VirtualHost>
<VirtualHost *:80>
ServerName fmarier.org
Redirect permanent / https://fmarier.org/
</VirtualHost>
<VirtualHost *:80>
ServerName ixrdj3iwwhkuau5tby5jh3a536a2rdhpbdbu6ldhng43r47kim7a3lid.onion
Include /etc/fmarier-org/www-common.include
</VirtualHost>
Note that /etc/fmarier-org/www-common.include contains all of the configuration options that are common to both the HTTP and the HTTPS sites (e.g. document root, caching headers, aliases, etc.).
Finally, I restarted Apache:
apache2ctl configtest
systemctl restart apache2.service
Testing
In order to test that my website is correctly available at its .onion address, I opened the following URLs in a Brave Tor window:
- http://ixrdj3iwwhkuau5tby5jh3a536a2rdhpbdbu6ldhng43r47kim7a3lid.onion/
- https://ixrdj3iwwhkuau5tby5jh3a536a2rdhpbdbu6ldhng43r47kim7a3lid.onion/ (a TLS certificate error is expected)
I also checked that the main URL (https://fmarier.org/) exposes a working Onion-Location header, which triggers the display of a button in the URL bar (recently merged and available in Brave Nightly).
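The header can also be checked from the command line without a browser. This is a sketch that assumes curl is available for fetching the response headers; onion_location is a hypothetical helper that extracts the value:

```shell
#!/bin/sh
# Hypothetical helper: read HTTP response headers on stdin and print the
# value of the first Onion-Location header (matched case-insensitively).
onion_location() {
    awk '{ sub(/\r$/, "") }
         !seen && tolower($0) ~ /^onion-location:/ {
             sub(/^[^:]*:[ \t]*/, "")
             print
             seen = 1
         }'
}

# Example usage (needs network access):
# curl -sI https://fmarier.org/ | onion_location
```

The printed value should be the http:// .onion URL followed by the path that was requested.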
Testing that the Alt-Svc header is working required using the Tor Browser, since that's not yet supported in Brave:
- Open https://fmarier.org.
- Wait 30 seconds.
- Reload the page.
On the server side, I saw the following:
2a0b:f4c2:2::1 - - [14/Oct/2020:02:42:20 +0000] "GET / HTTP/2.0" 200 2696 "-" "Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Firefox/78.0"
2600:3c04::f03c:91ff:fe8c:61ac - - [14/Oct/2020:02:42:53 +0000] "GET / HTTP/2.0" 200 2696 "-" "Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Firefox/78.0"
That first IP address is from a Tor exit node:
$ whois 2a0b:f4c2:2::1
...
inet6num: 2a0b:f4c2::/40
netname: MK-TOR-EXIT
remarks: -----------------------------------
remarks: This network is used for Tor Exits.
remarks: We do not have any logs at all.
remarks: For more information please visit:
remarks: https://www.torproject.org
which indicates that the first request was not using the .onion address.
The second IP address is the one for my server:
$ dig +short -x 2600:3c04::f03c:91ff:fe8c:61ac
hafnarfjordur.fmarier.org.
which indicates that the second request to Apache came from the Tor daemon running on my server, hence using the .onion address.
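Since requests arriving through the onion service show up with the server's own address, counting them in the access log gives a rough idea of how much traffic uses the .onion route. This is a sketch with a hypothetical count_from helper; the log path in the usage comment is an assumption:

```shell
#!/bin/sh
# Hypothetical helper: count access-log lines on stdin whose client
# address (the first field) equals $1.
count_from() {
    awk -v ip="$1" '$1 == ip { n++ } END { print n + 0 }'
}

# Example usage (the log location is an assumption):
# count_from 2600:3c04::f03c:91ff:fe8c:61ac < /var/log/apache2/access.log
```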
In order to fix the following error after setting up SIP TLS in Asterisk 16.2:
asterisk[8691]: ERROR[8691]: tcptls.c:966 in __ssl_setup: TLS/SSL error loading cert file. <asterisk.pem>
I created a Let's Encrypt certificate using certbot:
apt install certbot
certbot certonly --standalone -d hostname.example.com
To enable the asterisk user to load the certificate successfully (it doesn't have permission to access the certificates under /etc/letsencrypt/), I copied it to the right directory:
cp /etc/letsencrypt/live/hostname.example.com/privkey.pem /etc/asterisk/asterisk.key
cp /etc/letsencrypt/live/hostname.example.com/fullchain.pem /etc/asterisk/asterisk.cert
chown asterisk:asterisk /etc/asterisk/asterisk.cert /etc/asterisk/asterisk.key
chmod go-rwx /etc/asterisk/asterisk.cert /etc/asterisk/asterisk.key
Then I set the following variables in /etc/asterisk/sip.conf:
tlscertfile=/etc/asterisk/asterisk.cert
tlsprivatekey=/etc/asterisk/asterisk.key
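Because the private key now lives outside /etc/letsencrypt/, it is worth confirming that the chmod actually left it unreadable by group and others. This is a small sketch; private_to_owner is a hypothetical helper that inspects the mode string printed by ls:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if file $1 grants no permissions to
# group or others (characters 5-10 of the ls mode string are all "-").
private_to_owner() {
    [ "$(ls -ld -- "$1" | cut -c5-10)" = "------" ]
}

# Example usage:
# private_to_owner /etc/asterisk/asterisk.key || echo "key is too permissive" >&2
```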
Automatic renewal
The machine on which I run asterisk has a tricky Apache setup:
- a webserver is running on port 80
- port 80 is restricted to the local network
This meant that the certbot domain ownership checks would get blocked by the firewall, and I couldn't open that port without exposing the private webserver to the Internet.
So I ended up disabling the built-in certbot renewal mechanism:
systemctl disable certbot.timer certbot.service
systemctl stop certbot.timer certbot.service
and then writing my own script in /etc/cron.daily/certbot-francois:
#!/bin/bash
TEMPFILE="$(mktemp)"
# Stop Apache and backup firewall.
/bin/systemctl stop apache2.service
/usr/sbin/iptables-save > "$TEMPFILE"
# Open up port 80 to the whole world.
/usr/sbin/iptables -D INPUT -j LOGDROP
/usr/sbin/iptables -A INPUT -p tcp --dport 80 -j ACCEPT
/usr/sbin/iptables -A INPUT -j LOGDROP
# Renew all certs.
/usr/bin/certbot renew --quiet
# Restore firewall and restart Apache.
/usr/sbin/iptables -D INPUT -p tcp --dport 80 -j ACCEPT
/usr/sbin/iptables-restore < "$TEMPFILE"
rm -f "$TEMPFILE"
/bin/systemctl start apache2.service
# Copy certificate into asterisk.
cp /etc/letsencrypt/live/hostname.example.com/privkey.pem /etc/asterisk/asterisk.key
cp /etc/letsencrypt/live/hostname.example.com/fullchain.pem /etc/asterisk/asterisk.cert
chown asterisk:asterisk /etc/asterisk/asterisk.cert /etc/asterisk/asterisk.key
chmod go-rwx /etc/asterisk/asterisk.cert /etc/asterisk/asterisk.key
# Commit changes to etckeeper and restart asterisk.
pushd /etc/ > /dev/null
/usr/bin/git add letsencrypt asterisk
DIFFSTAT="$(/usr/bin/git diff --cached --stat)"
if [ -n "$DIFFSTAT" ] ; then
    /usr/bin/git commit --quiet -m "Renewed letsencrypt certs." letsencrypt asterisk
    echo "$DIFFSTAT"
    /bin/systemctl restart asterisk.service
fi
popd > /dev/null
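The script above relies on the etckeeper diff to notice that a certificate was renewed. On a machine without etckeeper, one possible alternative is to compare the files directly and copy only when they differ. This is a sketch of that swapped-in approach, not what the script above does; copy_if_changed is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: copy $1 over $2 only when the contents differ
# (or $2 does not exist yet). Succeeds when a copy was made and fails
# when $2 was already up to date.
copy_if_changed() {
    if cmp -s -- "$1" "$2"; then
        return 1
    fi
    cp -- "$1" "$2"
}

# Example usage (as root), commented out because it touches system files:
# if copy_if_changed /etc/letsencrypt/live/hostname.example.com/fullchain.pem \
#                    /etc/asterisk/asterisk.cert; then
#     /bin/systemctl restart asterisk.service
# fi
```

The ownership and permission commands from the original script would still need to run after a copy.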
Here is the process I followed when I moved my GnuBee's root partition from one flaky Kingston SSD drive to a brand new Samsung SSD.
It was relatively straightforward, but there are two key points:
- Make sure you label the root partition GNUBEE-ROOT.
- Make sure you copy the network configuration from the SSD, not the tmpfs mount.
Copy the partition table
First, with both drives plugged in, I replicated the partition table of the first drive (/dev/sda):
# fdisk -l /dev/sda
Disk /dev/sda: 111.8 GiB, 120034123776 bytes, 234441648 sectors
Disk model: KINGSTON SA400S3
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 799CD830-526B-42CE-8EE7-8C94EF098D46
Device Start End Sectors Size Type
/dev/sda1 2048 8390655 8388608 4G Linux swap
/dev/sda2 8390656 234441614 226050959 107.8G Linux filesystem
onto the second drive (/dev/sde):
# fdisk /dev/sde
Welcome to fdisk (util-linux 2.33.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0xd011eaba.
Command (m for help): g
Created a new GPT disklabel (GUID: 83F70325-5BE0-034E-A9E1-1965FEFD8E9F).
Command (m for help): n
Partition number (1-128, default 1):
First sector (2048-488397134, default 2048):
Last sector, +/-sectors or +/-size{K,M,G,T,P} (2048-488397134, default 488397134): +4G
Created a new partition 1 of type 'Linux filesystem' and of size 4 GiB.
Command (m for help): t
Selected partition 1
Partition type (type L to list all types): 19
Changed type of partition 'Linux filesystem' to 'Linux swap'.
Command (m for help): n
Partition number (2-128, default 2):
First sector (8390656-488397134, default 8390656):
Last sector, +/-sectors or +/-size{K,M,G,T,P} (8390656-488397134, default 488397134): 234441614
Created a new partition 2 of type 'Linux filesystem' and of size 107.8 GiB.
Command (m for help): p
Disk /dev/sde: 232.9 GiB, 250059350016 bytes, 488397168 sectors
Disk model: Samsung SSD 860
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 83F70325-5BE0-034E-A9E1-1965FEFD8E9F
Device Start End Sectors Size Type
/dev/sde1 2048 8390655 8388608 4G Linux swap
/dev/sde2 8390656 234441614 226050959 107.8G Linux filesystem
Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
I wasted a large amount of space on the second drive, but that was on purpose in case I decide to later on move to a RAID-1 root partition with the Kingston SSD.
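The sector counts in the two fdisk listings can be cross-checked with a bit of arithmetic, which is also how the "Last sector" value of 234441614 reproduces the original root partition size. This is a small sketch with hypothetical helper names, assuming the 512-byte sectors reported above:

```shell
#!/bin/sh
# Hypothetical helper: number of sectors in a partition covering the
# inclusive sector range [$1, $2].
part_sectors() {
    echo $(( $2 - $1 + 1 ))
}

# Hypothetical helper: size in whole MiB of $1 sectors of 512 bytes each
# (2048 such sectors make up one MiB).
sectors_to_mib() {
    echo $(( $1 / 2048 ))
}

part_sectors 2048 8390655        # swap partition: 8388608 sectors
part_sectors 8390656 234441614   # root partition: 226050959 sectors
sectors_to_mib 8388608           # 4096 MiB, the 4G shown by fdisk
```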
Format the partitions
Second, I formatted the new partitions:
# mkswap /dev/sde1
Setting up swapspace version 1, size = 4 GiB (4294963200 bytes)
no label, UUID=7a85fbce-2493-45c1-a548-4ec6e827ec29
# mkfs.ext4 /dev/sde2
mke2fs 1.44.5 (15-Dec-2018)
Discarding device blocks: done
Creating filesystem with 28256369 4k blocks and 7069696 inodes
Filesystem UUID: 732a76df-d369-4e7b-857a-dd55fd461bbc
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872
Allocating group tables: done
Writing inode tables: done
Creating journal (131072 blocks): done
Writing superblocks and filesystem accounting information: done
and labeled the root partition so that the GnuBee can pick it up as it boots up:
e2label /dev/sde2 GNUBEE-ROOT
since GNUBEE-ROOT is what uboot will be looking for.
Copy the data over
Finally, I copied the data over from the original drive to the new one:
# umount /etc/network
# mkdir /mnt/root
# mount /dev/sde2 /mnt/root
# rsync -aHx --delete --exclude=/dev/* --exclude=/proc/* --exclude=/sys/* --exclude=/tmp/* --exclude=/mnt/* --exclude=/lost+found/* --exclude=/media/* --exclude=/rom/* /* /mnt/root/
# sync
Note that if you don't unmount /etc/network/, you'll be copying the override provided at boot time instead of the underlying config that's on the root partition. This matters because the script that renames the network interfaces to ethblack and ethblue expects specific files in order to produce a working network configuration. If you copy the final modified config files, then you end up with a bind-mounted empty directory as /etc/network, and the network interfaces can't be brought up successfully.
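Since forgetting the umount is easy, a guard before the rsync can catch it. This is a minimal sketch that reads a mount table in the /proc/mounts format; is_mount_target is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read a mount table in /proc/mounts format on stdin
# and succeed if something is mounted on the path given as $1.
is_mount_target() {
    awk -v p="$1" '$2 == p { found = 1 } END { exit !found }'
}

# Example usage before running rsync:
# if is_mount_target /etc/network < /proc/mounts; then
#     echo "unmount /etc/network first" >&2
# fi
```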