Opting your domain out of programmatic advertising

A few years ago, the advertising industry introduced the ads.txt project in order to defend against widespread domain spoofing vulnerabilities in programmatic advertising.

I decided to use this technology to opt out of having ads sold for my domains, at least through ad exchanges which perform this check, by hosting a text file containing this:

contact=ads@fmarier.org

at the following locations:

(To get this to work on my blog, which runs Ikiwiki on Branchable, I had to disable the txt plugin so that ads.txt is served as a plain text file instead of being automatically rendered as HTML.)
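A quick way to confirm that the file really is served as plain text rather than rendered HTML is to look at the Content-Type header of the response. The following is only a rough sketch: the function names and the example.com domain are placeholders of mine, and it assumes the file is reachable over HTTPS.

```python
import urllib.error
import urllib.request


def looks_like_plain_text(content_type: str) -> bool:
    """Return True if a Content-Type header value denotes text/plain."""
    # Ignore parameters such as "; charset=utf-8".
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/plain"


def check_ads_txt(domain: str, timeout: float = 10.0) -> str:
    """Fetch https://<domain>/ads.txt and describe how it is served."""
    url = f"https://{domain}/ads.txt"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            content_type = response.headers.get("Content-Type", "")
    except urllib.error.HTTPError as err:
        # Raised for 4xx/5xx responses, e.g. a 404 when the file is missing.
        return f"{url}: HTTP {err.code}"
    except OSError as err:
        # Covers DNS failures, refused connections and timeouts.
        return f"{url}: {err}"
    if looks_like_plain_text(content_type):
        return f"{url}: served as text/plain"
    return f"{url}: unexpected Content-Type {content_type!r}"


if __name__ == "__main__":
    # example.com is a placeholder; substitute your own domain.
    print(check_ads_txt("example.com"))
```

If the txt plugin (or anything similar) is still rendering the file, this should report text/html instead of text/plain.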

Specification

The key parts of the specification for our purposes are:

[3.1] If the server response indicates the resource does not exist (HTTP Status Code 404), the advertising system can assume no declarations exist and that no advertising system is unauthorized to buy and sell ads on the website.

[3.2.1] Some publishers may choose to not authorize any advertising system by publishing an empty ads.txt file, indicating that no advertising system is authorized to buy and sell ads on the website. So that consuming systems properly read and interpret the empty file (differentiating between web servers returning error pages for the /ads.txt URL), at least one properly formatted line must be included which adheres to the format specification described above.

As you can see, the specification sadly ignores RFC 8615 (which defines the /.well-known/ path prefix) and requires that the ads.txt file be present directly in the root of your web server, like the venerable robots.txt file, but unlike the newer security.txt standard.

If you don't want to provide an email address in your ads.txt file, the specification recommends using the following line verbatim:

placeholder.example.com, placeholder, DIRECT, placeholder
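To illustrate what "at least one properly formatted line" might mean in practice, here is a simplified sketch that classifies each line of an ads.txt file. It is not a full implementation of the specification: the function names are mine, and the rules (a `name=value` variable, or three to four comma-separated fields whose third field is DIRECT or RESELLER, compared case-insensitively) are a deliberately loose reading of the format.

```python
# Relationship values accepted in the third field of a data record.
RELATIONSHIPS = {"DIRECT", "RESELLER"}


def classify_line(line: str) -> str:
    """Classify a line as 'blank', 'variable', 'record' or 'invalid'."""
    # Strip comments (after '#') and extension fields (after ';').
    content = line.split("#", 1)[0]
    content = content.split(";", 1)[0].strip()
    if not content:
        return "blank"

    eq, comma = content.find("="), content.find(",")
    if eq != -1 and (comma == -1 or eq < comma):
        # Looks like a variable declaration such as contact=...
        name, value = content.split("=", 1)
        if name.strip().isalnum() and value.strip():
            return "variable"
        return "invalid"

    # Otherwise treat it as a data record:
    # domain, account ID, relationship[, certification authority ID]
    fields = [field.strip() for field in content.split(",")]
    if len(fields) in (3, 4) and all(fields) and fields[2].upper() in RELATIONSHIPS:
        return "record"
    return "invalid"


def has_valid_line(text: str) -> bool:
    """Return True if at least one line is a well-formed variable or record."""
    return any(classify_line(line) in ("variable", "record")
               for line in text.splitlines())
```

Under these assumptions, both the `contact=` line used above and the placeholder line count as well-formed, while an HTML error page served at /ads.txt would not.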

Validation

A number of online validators exist, but I used the following to double-check my setup:

List of Planet Linux Australia blogs

I've been following Planet Linux Australia for many years and discovered many interesting FOSS blogs through it. I was sad to see that it got shut down a few weeks ago and so I decided to manually add all of the feeds to my RSS reader to avoid missing posts from people I have been indirectly following for years.

Since all feeds have been removed from the site, I recovered the list of blogs available from an old copy of the site preserved by the Internet Archive.

Here is the resulting .opml file if you'd like to subscribe.
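If you would rather inspect the list than import it directly, an OPML file is plain XML in which each feed is an `outline` element carrying an `xmlUrl` attribute. Here is a minimal sketch of pulling out the feed URLs with Python's standard library; the function name and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET


def feed_urls(opml_text: str) -> list[tuple[str, str]]:
    """Return (title, feed URL) pairs for every feed in an OPML document."""
    root = ET.fromstring(opml_text)
    feeds = []
    # iter() walks nested outlines too, so feeds inside folders are found.
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:
            title = outline.get("title") or outline.get("text") or ""
            feeds.append((title, url))
    return feeds


# A made-up sample, not the actual list.
SAMPLE = """<opml version="2.0">
  <head><title>Example</title></head>
  <body>
    <outline text="Folder">
      <outline type="rss" text="Example Blog"
               xmlUrl="https://blog.example.org/feed.xml"/>
    </outline>
  </body>
</opml>"""

if __name__ == "__main__":
    for title, url in feed_urls(SAMPLE):
        print(f"{title}: {url}")
```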

Changes

Once I had the full list, I removed all blogs that are gone, empty or broken (e.g. domain not resolving, returning a 404, various database or server errors).
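I did this triage by hand, but it could be roughly automated. The sketch below is an assumption of mine rather than the process I followed: it fetches a feed, reports HTTP and network errors as "broken", and treats a well-formed feed with no RSS `item` or Atom `entry` elements as "empty". The `opener` parameter is a hypothetical hook that lets a fake fetcher be swapped in for testing.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET


def feed_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return 'ok', 'empty' or 'broken: <reason>' for a feed URL."""
    try:
        with opener(url, timeout=timeout) as response:
            body = response.read()
    except urllib.error.HTTPError as err:
        # e.g. a 404 or a 5xx server error
        return f"broken: HTTP {err.code}"
    except OSError as err:
        # DNS failures, refused connections, timeouts
        return f"broken: {err}"

    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Typically an HTML error page served in place of the feed.
        return "broken: not well-formed XML"

    # RSS posts are <item> elements; Atom posts are namespaced <entry>
    # elements, so compare on the tag name with any namespace removed.
    has_posts = any(element.tag.split("}")[-1] in ("item", "entry")
                    for element in root.iter())
    return "ok" if has_posts else "empty"
```

A feed that last updated many years ago would still come back as "ok" here, so checking dates remains a manual step.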

I updated the URLs of a few blogs which had moved but hadn't updated their feeds on the planet. I also updated the name of a blogger who was still listed under a previous last name.

Finally, I removed LA-specific tags from feeds since these are unlikely to be used again.

Work-arounds

The following LiveJournal feeds didn't work in my RSS reader but opened fine in a browser:

However, since none of them have been updated in the last 7 years, I just left them out.

A couple appear to be impossible to fetch over Tor, presumably due to a Cloudflare setting:

Since only the last two have been updated in the last 9 years, I added these to Feedburner and added the following "proxied" URLs to my reader:

Similarly, I couldn't fetch the following over Tor for some other reason:

I excluded the first two, which haven't been updated in 6 years, and proxied the other ones: