
Sunday, December 30, 2018

Making a Mailserver - Spam Blocking, Revisited

In an earlier post I described implementing spamassassin with exim4. The information there still holds true, but simply running "spamd" has not been enough to hold back spammers who have my email address. My email address was harvested in both the LinkedIn and Last.fm hacks. In the last few years the targeted spam has increased noticeably.

I started to train my desktop email client to pick out spam and it does a decent job, so I weathered the deluge for some time. However, plenty of spam still gets through and when my desktop email client is not open I have plenty of junk to pick through on my mobile devices.

Finally, I've taken the time to sharpen up my exim4 defenses.

Challenges


In the rest of this post, I'll be answering these questions:
  1. Does spamassassin support Domain Name System Blacklists (DNSBL)?
  2. How do I integrate blocklist (DNSBL) checks in exim4?
  3. How do I block hosts that are not really mailservers?
  4. How do I block on reverse DNS failures?
  5. How do I allow specific hosts to skip checking by exim4 and spamd?
  6. How do I verify these measures are working properly?
The answers to these questions are straightforward, but took quite a bit of research time and verification. Point 6 isn't separately addressed; each section that follows will talk about the ways that I verified the spam mitigations were working.

spamassassin and exim4


It turns out that spamassassin (spamd) supports DNSBL by default. I actually discovered this after going through the process of integrating zen.spamhaus.org checking in exim4. The difference is that you can kill spam outright with exim4 integration, but spamd will use it as part of the point calculation when determining how 'spammy' an email is.

There's a downside then to spamd: while the blocklist from spamhaus has a very high accuracy, being on the blocklist doesn't guarantee that spamd will calculate enough points to junk the email.

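It's possible to change the point value of a blocklist rule by adding a "score <RULE_NAME> <value>" line to the spamassassin config file, for example "score URIBL_BLACK <value>". Note that URIBL_BLACK fires when a URL inside the message is on a blocklist; the rules that check the IP addresses of the relaying hosts have names starting with RCVD_IN_. You also need to ensure that perl's Net::DNS is installed. To check, run "perl -MNet::DNS -e 1"; the command should execute with no errors.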

One unanswered question I have is whether the perl module takes care of the DNS lookup and server to use, or whether your server needs to have a spamhaus-friendly DNS server in /etc/resolv.conf - see the spamassassin DNSBL discussion below.

Verify spamassassin is using DNSBL


To verify whether DNSBL is being used by spamassassin, check the log for the presence of URIBL_BLACK. Depending on the system, this may be the syslog logfile rather than the exim4 logs:

Dec 29 16:22:02 mail spamd[2739]: spamd: result: Y 12 - AXB_XMAILER_MIMEOLE_OL_024C2,BAYES_00,FORGED_MUA_OUTLOOK,FORGED_OUTLOOK_HTML,FORGED_OUTLOOK_TAGS,FREEMAIL_FROM,FROM_MISSP_EH_MATCH,FROM_MISSP_FREEMAIL,FROM_MISSP_MSFT,FROM_MISSP_REPLYTO,FROM_MISSP_XPRIO,FSL_CTYPE_WIN1251,FSL_NEW_HELO_USER,HTML_MESSAGE,LOTS_OF_MONEY,MIME_HTML_ONLY,MISSING_HEADERS,MISSING_MID,MONEY_FROM_MISSP,NSL_RCVD_HELO_USER,RCVD_IN_DNSWL_NONE,RCVD_IN_SORBS_WEB,REPLYTO_WITHOUT_TO_CC,SPF_SOFTFAIL,TO_NO_BRKTS_FROM_MSSP,TO_NO_BRKTS_MSFT,T_COMPENSATION,URIBL_BLACK scantime=1.3,size=4689,user=Debian-exim,uid=104,required_score=5.0,rhost=127.0.0.1,raddr=127.0.0.1,rport=41221,mid=(unknown),bayes=0.000005,autolearn=no autolearn_force=no 

This means that a blocklist lookup matched for this particular message. Of course you are not going to see that on every email that is checked.
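If you'd rather not eyeball long spamd lines, a small Python sketch like the following can pull out the verdict, the score and the rule names. It assumes the "spamd: result: Y 12 - RULE,RULE ..." layout shown in the log line above; the function names are made up for illustration and are not part of spamassassin.

```python
import re

# Matches the result part of a spamd log line, for example:
#   spamd: result: Y 12 - RULE_A,RULE_B scantime=1.3,size=4689,...
RESULT_RE = re.compile(
    r"spamd: result: (?P<flag>[Y.]) (?P<score>-?\d+) - (?P<rules>\S*)"
)


def parse_spamd_result(line):
    """Return (is_spam, score, rules) for a spamd result line, or None."""
    match = RESULT_RE.search(line)
    if match is None:
        return None
    # Skip empty tokens and anything that looks like key=value metadata.
    rules = [r for r in match.group("rules").split(",") if r and "=" not in r]
    return match.group("flag") == "Y", int(match.group("score")), rules


def dns_list_rules(rules):
    """Pick out rules whose names suggest a DNS list lookup was involved."""
    return [r for r in rules if r.startswith(("RCVD_IN_", "URIBL_"))]
```

Feed it lines from whichever logfile spamd writes to on your system; if dns_list_rules() never returns anything over a few days of mail, the DNS lookups are probably not working.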

Integrate the spamhaus blocklist into exim4


The instructions that I'll provide here are not specific to spamhaus; it's just the service I decided to try. It's free up to a point: someone like me with relatively low volumes of email will be able to use the service unimpeded. There is a performance hit on your own server while doing the DNS lookup on spamhaus though. Each lookup will slow the delivery of legitimate email by milliseconds.

It's supposedly possible to download a blocklist and do local lookups on that, but setting that up is more complex and requires frequent downloads of large lists, so it seems of small reward for a lot of work if you are not handling much email.

My exim4 config is broken into separate config elements, which is a fairly normal thing to do, but the files where this config belongs may differ depending on your system.

Enable "DNSBLS" as exim4 refers to it, in your custom macro file (/etc/exim4/conf.d/main/00-custom_macros for example):

CHECK_RCPT_IP_DNSBLS = zen.spamhaus.org

Configure the deny option or leave it at a warning level (/etc/exim4/conf.d/acl/30_exim4-config_check_rcpt).

# Check against classic DNS "black" lists (DNSBLs) which list
# sender IP addresses
.ifdef CHECK_RCPT_IP_DNSBLS
#warn
# message = X-Warning: $sender_host_address is listed at $dnslist_domain ($dnslist_value: $dnslist_text)
deny
  message = Failed sender validation
  log_message = michael DENY - $sender_host_address is listed at $dnslist_domain ($dnslist_value: $dnslist_text)
  dnslists = CHECK_RCPT_IP_DNSBLS
.endif

Notice above that I've commented out the default "warn" and "message". I've also added a custom log_message so the "michael DENY" sticks out. I know when I see that log that it's taking action on something I did. Also I dumbed down the message to not be too helpful to spammers, not that I think they're reading the SMTP rejection reasons!

When I initially implemented this, I never saw the rule being triggered. The reason was that the spamhaus lookup always failed to return any record. If your mailserver is configured to do lookups via a major public DNS server like 8.8.8.8, 1.1.1.1 or 9.9.9.9, the spamhaus lookups don't work. I'm not going to go into why they don't work here, but suffice to say that queries relayed through the big public resolvers don't get useful answers.

Unfortunately, that means finding a DNS server that will help you with your inquiries, or running a DNS server on the local box (or in your local network). If you refer back to my previous post on setting up a nameserver, then you can simply add the following snippets to the existing setup (/etc/bind/named.conf.options):


acl "trusted" {
        localhost;
        <other trusted IPs>;
};
options {
        ...

        allow-query { any; };
        allow-recursion { trusted; };
        allow-query-cache { trusted; };
        ...
};


I sincerely encourage you to check the bind documentation yourself. Don't go adding random config from the internet to highly sensitive services without understanding what each setting means. As a brief explanation: the acl restricts recursive DNS queries (those for domains outside the local zones) to a list of permitted hosts, including localhost, so that exim4 and other local services can use this server for their lookups.

Naturally, you need to ensure that /etc/resolv.conf has "nameserver 127.0.0.1" or another DNSBL friendly server configured.
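To check from the mailserver itself whether the configured resolver gets answers from a blocklist, a minimal Python sketch along these lines may help. A DNSBL query is just an ordinary DNS lookup on the IP address with its octets reversed and the list's zone appended. The function names are hypothetical, and the sketch only handles IPv4.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the hostname that a DNSBL lookup for an IPv4 address resolves."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def dnsbl_lookup(ip, zone="zen.spamhaus.org"):
    """Return the answer addresses from the list.

    An empty result means either "not listed" or "this resolver gets no
    answer from the list"; the two cases look the same from here.
    """
    try:
        _name, _aliases, addresses = socket.gethostbyname_ex(
            dnsbl_query_name(ip, zone)
        )
    except OSError:
        return []
    return addresses
```

By convention (RFC 5782) the address 127.0.0.2 is listed on every DNSBL as a test entry, so dnsbl_lookup("127.0.0.2") returning an empty list is a strong hint that the resolver in /etc/resolv.conf is the problem.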

Back to the exim config. Without customising, i.e. using the default settings, you'll get a new header on the email message. The email won't be dropped by this rule, but you will see an X-Warning header in the email (or in the exim4 rejectlog if the email is dropped elsewhere). Note it below:

Envelope-from: <ass3@binkmail.com>
Envelope-to: <michael@moff.tech>
P Received: from 79-103-16-190.fibertel.com.ar ([190.16.103.79])
by mail.moff.tech with esmtp (Exim 4.84_2)
(envelope-from <ass3@binkmail.com>)
id 1gdFOD-00008q-90
for michael@moff.tech; Sat, 29 Dec 2018 15:14:57 +0100
I Message-ID: <4EAF76134D97CA28C910752BF1AC4EAF@KSP94W150>
F From: "michael@moff.tech" <ass3@binkmail.com>
T To: <michael@moff.tech>
Subject: ***wonderful spam***
Date: 29 Dec 2018 07:01:51 -0400
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.5512
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5512
X-Warning: 190.16.103.79 is listed at zen.spamhaus.org (127.0.0.11, 127.0.0.4: https://www.spamhaus.org/query/ip/190.16.103.79)
X-Spam-Score: 20.0 (++++++++++++++++++++)

You'll likely see the same warning in the exim4 mainlog:

2018-12-29 15:14:57 H=79-103-16-190.fibertel.com.ar [190.16.103.79] Warning: 190.16.103.79 is listed at zen.spamhaus.org (127.0.0.11, 127.0.0.4: https://www.spamhaus.org/query/ip/190.16.103.79)

Once you see this log or the custom deny log I suggested, you know that the lookup is working.
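The addresses in brackets (127.0.0.11 and 127.0.0.4 above) are return codes that say which part of the zen list matched. Here is a rough Python sketch for pulling them out of a log line. The code-to-list table is an assumption based on Spamhaus's published return codes and is not taken from the exim4 config, so check it against their documentation before relying on it. The function names are hypothetical.

```python
import re

# Assumed meanings of zen.spamhaus.org return codes; verify with Spamhaus.
ZEN_CODES = {
    "127.0.0.2": "SBL",
    "127.0.0.3": "SBL CSS",
    "127.0.0.4": "XBL",
    "127.0.0.10": "PBL (ISP maintained)",
    "127.0.0.11": "PBL (Spamhaus maintained)",
}

# Matches "<ip> is listed at <zone> (<code>, <code>: <text>)".
LISTED_RE = re.compile(
    r"(?P<ip>\S+) is listed at (?P<zone>\S+) \((?P<codes>[\d., ]+):"
)


def parse_listing(line):
    """Return (ip, zone, codes) from an 'is listed at' log line, or None."""
    match = LISTED_RE.search(line)
    if match is None:
        return None
    codes = [code.strip() for code in match.group("codes").split(",")]
    return match.group("ip"), match.group("zone"), codes


def describe_codes(codes):
    """Translate return codes into list names, keeping unknown codes as-is."""
    return [ZEN_CODES.get(code, code) for code in codes]
```

The same pattern matches both the default Warning line and the custom "michael DENY" log_message, since both use the "is listed at" wording.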

In the end, I ditched doing the spamhaus lookup when I realised that spamassassin was also doing it. I commented out the "CHECK_RCPT_IP_DNSBLS = zen.spamhaus.org" entry and left the 30_exim4-config_check_rcpt config just in case I wanted to switch it on again.

Blocking mailservers that aren't


Mail delivery is entirely dependent on DNS records. A mailserver without forward and reverse DNS entries is of doubtful reputation. It's a safe bet that any host sending you email that doesn't have complete DNS records is not a legitimate mailserver and should be ignored.

My default exim4 install does not hold mailservers or the sender addresses to strict standards. Organisations that deal with large volumes of emails, for large numbers of users, will receive legitimate email from such badly configured mail clients and mailservers.

I can say with a high degree of certainty that I should not be receiving email from weird hosts or senders with domains that don't exist or accept email. If I discard emails from such servers and senders, I might have a handful of people over some number of years that have a problem emailing me. The upsides to configuring my mailserver to be strict on these points outweigh the potential downsides.

There are three useful options. In the custom macros file (/etc/exim4/conf.d/main/00-custom_macros) you may elect to enable the following ACLs.
# Denied in /etc/exim4/conf.d/acl/30_exim4-config_check_rcpt
CHECK_RCPT_VERIFY_SENDER = yes
#
# Denied in /etc/exim4/conf.d/acl/40_exim4-config_check_data
CHECK_DATA_VERIFY_HEADER_SENDER = yes
#
# Denied in /etc/exim4/conf.d/acl/30_exim4-config_check_rcpt
CHECK_RCPT_REVERSE_DNS = yes

In the configuration files, you'll see that two of the checks already use deny once the option is enabled (as above).

There's a helpful overview of many exim4 ACL options here and here. My descriptions below paraphrase them.

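CHECK_RCPT_VERIFY_SENDER verifies that the envelope sender of the message (the address given in MAIL FROM) is routable, which in practice means its domain has a usable DNS entry. The check runs at RCPT time. It is disabled by default, but when enabled will deny by default. Note that I added a custom log message: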

/etc/exim4/conf.d/acl/30_exim4-config_check_rcpt

# Deny unless the sender address can be verified.
#
# This is disabled by default so that DNSless systems don't break. If
# your system can do DNS lookups without delay or cost, you might want
# to enable this feature.
#
# This feature does not work in smarthost and satellite setups as
# with these setups all domains pass verification. See spec.txt chapter
# 39.31 with the added information that a smarthost/satellite setup
# routes all non-local e-mail to the smarthost.
.ifdef CHECK_RCPT_VERIFY_SENDER
deny
  message = Sender verification failed
  log_message = michael DENY - Sender verification failed
  !acl = acl_local_deny_exceptions
  !verify = sender
.endif

To date I have not seen this logged, so I can't verify that it's doing anything. It is possible that one of the other ACLs denies the email first.

CHECK_RCPT_REVERSE_DNS is the ACL that actually checks whether the mailserver has a reverse DNS entry.

/etc/exim4/conf.d/acl/30_exim4-config_check_rcpt

# Warn if the sender host does not have valid reverse DNS.
#
# If your system can do DNS lookups without delay or cost, you might want
# to enable this.
# If sender_host_address is defined, it's a remote call. If
# sender_host_name is not defined, then reverse lookup failed. Use
# this instead of !verify = reverse_host_lookup to catch deferrals
# as well as outright failures.
.ifdef CHECK_RCPT_REVERSE_DNS
#warn
# message = X-Host-Lookup-Failed: Reverse DNS lookup failed for $sender_host_address (${if eq{$host_lookup_failed}{1}{failed}{deferred}})
deny
  message = Sender validation failure
  log_message = michael DENY - Reverse DNS check failed
  condition = ${if and{{def:sender_host_address}{!def:sender_host_name}}\
                {yes}{no}}
.endif

Reverse check logs will now look like this:

2018-12-30 07:17:37 H=([182.177.52.180]) [182.177.52.180] F=<ezambrano@maecabogados.com> rejected RCPT <michael@moff.tech>: michael DENY - Reverse DNS check failed
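To try the same kind of check by hand against an IP address from your logs, here is a minimal Python sketch. It is not how exim4 implements the ACL; it only mimics the idea of a reverse lookup followed by a forward lookup that must lead back to the same address. The function name and the injectable lookup parameters are made up for illustration, and the sketch is IPv4 only.

```python
import socket


def forward_confirmed_rdns(ip,
                           reverse=socket.gethostbyaddr,
                           forward=socket.gethostbyname_ex):
    """Return True if ip has a PTR record whose name resolves back to ip."""
    try:
        # Reverse lookup: IP address -> hostname.
        hostname, _aliases, _addresses = reverse(ip)
        # Forward lookup: hostname -> list of IPv4 addresses.
        _name, _aliases, addresses = forward(hostname)
    except OSError:
        # Covers a missing PTR record, a missing A record and resolver errors.
        return False
    return ip in addresses
```

Calling forward_confirmed_rdns("182.177.52.180") for the host in the log line above would be one way to confirm why it was rejected. Unlike the ACL condition, this sketch cannot tell a temporary DNS failure from a permanent one.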


CHECK_DATA_VERIFY_HEADER_SENDER verifies that the sender is valid in at least one of the "Sender:", "Reply-To:", or "From:" header lines.

/etc/exim4/conf.d/acl/40_exim4-config_check_data


# require that there is a verifiable sender address in at least
# one of the "Sender:", "Reply-To:", or "From:" header lines.
.ifdef CHECK_DATA_VERIFY_HEADER_SENDER
deny
  message = No verifiable sender address in message headers
  log_message = michael DENY - No verifiable sender address in message headers
  !acl = acl_local_deny_exceptions
  !verify = header_sender
.endif


This condition is rarely seen in the logs, but will look like this:

2018-12-30 09:00:31 1gdW1P-0003gC-Fx H=somelinuxhost.net (gentoo.somelinuxhost.net) [x.x.x.x] X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 F=<noreply@somelinuxhost.net> rejected after DATA: michael DENY - No verifiable sender address in message headers: syntax error in 'From:' header when scanning for sender: malformed address: <noreply@somelinuxhost.net> may not follow noreply@somelinuxhost.net  in "noreply@somelinuxhost.net <noreply@somelinuxhost.net>"


After a day of monitoring logs I found that the only instance was against a message that I wanted to receive. A friend was sending me emails directly from a host that sent an automated daily digest. I decided to whitelist the domain:

# cat /etc/exim4/sender_local_deny_exceptions
somelinuxhost.net

Skip checking friendly hosts


My mailserver relays email for a couple of other servers I have on the internet. These host websites with contact forms that can change the "From" header to use the email address of the person who filled out the form.

In this case there will be various validation checks that fail.

2018-12-29 17:51:59 no IP address found for host ip-x-x-x-x.eu-west-1.compute.internal (during SMTP connection from ec2-x-x-x-x.eu-west-1.compute.amazonaws.com (ip-x-x-x-x.eu-west-1.compute.internal) [x.x.x.x])
2018-12-29 17:51:59 H=ec2-x-x-x-x.eu-west-1.compute.amazonaws.com (ip-x-x-x-x.eu-west-1.compute.internal) [x.x.x.x] sender verify fail for <www-data@ip-x.x.x.x.eu-west-1.compute.internal>: Unrouteable address
2018-12-29 17:51:59 H=ec2-x-x-x-x.eu-west-1.compute.amazonaws.com (ip-x-x-x-x.eu-west-1.compute.internal) [x.x.x.x] X=TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128 F=<www-data@ip-x-x-x-x.eu-west-1.compute.internal> rejected RCPT <info-today@moff.tech>: michael-today UNKNOWN - Sender verification failed: Sender verify failed

You also need to give local server applications the ability to send email through exim4. For example, if you had a python script that generated an email at the end of the script, you might see something like this in mainlog:

2018-12-30 23:44:56 1gdjpI-0005VU-Dg H=localhost (mail.moff.tech) [::1] F=<michael@moff.tech> rejected after DATA: michael DENY - No verifiable sender address in message headers: there is no valid sender in any header line

To add trusted hosts, including localhost, the solution is simple:

# cat /etc/exim4/host_local_deny_exceptions
127.0.0.1
x.x.x.x
y.y.y.y
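The "no valid sender in any header line" rejection above suggests the script's message had no usable From: header. Independent of the exceptions file, a local script can at least send well-formed mail. Here is a minimal sketch using Python's standard library; the function names, the addresses in the usage note and the localhost:25 defaults are assumptions for illustration, not details of my setup.

```python
import smtplib
from email.message import EmailMessage


def build_report(sender, recipient, subject, body):
    """Build a message with the headers a header_sender check looks at."""
    msg = EmailMessage()
    msg["From"] = sender        # must be an address exim4 can verify
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_local_exim(msg, host="localhost", port=25):
    """Hand the message to the SMTP server listening on host:port."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)
```

A script would then end with something like send_via_local_exim(build_report("reports@example.org", "you@example.org", "Nightly job", "All done")), with real addresses substituted.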

In Conclusion


Don't waste your time integrating DNSBL checks into exim4 unless you don't want spamassassin to do the checking for you.

So far so good. After a day of monitoring, not one piece of spam slipped through. I did discover that one email that I wanted was denied, but that was because the sender was firing email out from a machine not correctly set up to be a mailserver (I don't believe the sender cares enough to set that up).

This result was stunning considering I only expected to reduce spam, not totally eliminate it, as appears to be the case after just 24 hours. One should expect to lose the odd email, but careful review of the exim4 mainlog and rejectlog will help identify and whitelist desired 'special case' email senders.
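For that log review, a short Python sketch like this one can count how often each custom deny rule fired. It assumes the "michael DENY - <reason>" log_message format used in the ACLs above; with a different marker the regular expression needs adjusting. The function name is hypothetical.

```python
import re
from collections import Counter

# Matches exim rejections carrying the custom log_message marker, e.g.
#   ... rejected RCPT <user@example.org>: michael DENY - Reverse DNS check failed
#   ... rejected after DATA: michael DENY - No verifiable sender ...: details
DENY_RE = re.compile(
    r"rejected (?:RCPT <[^>]*>|after DATA): michael DENY - (?P<reason>[^:]+)"
)


def tally_denials(lines):
    """Count rejections per reason (the text up to the first colon)."""
    counts = Counter()
    for line in lines:
        match = DENY_RE.search(line)
        if match:
            counts[match.group("reason").strip()] += 1
    return counts
```

Run against the mainlog, for example with open("/var/log/exim4/mainlog") as log: print(tally_denials(log)), it gives a quick picture of which rule is doing the work and which one never fires.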

Wednesday, July 6, 2016

Making a Mailserver (Part 5) - Wonderful Spam

This is an instalment in my series on setting up a Linux based mailserver. See these posts:

In this post we set up Spamassassin with exim4 to stomp on as much spam as possible. We won't give anything that looks like spam a chance to be delivered; we'll dump such messages before they even complete the delivery process.


And yet it spams

Why Spamassassin?

There are alternatives; a good friend of mine recently recommended dspam to me, so that's on my list to investigate. Spamassassin doesn't catch everything, mostly because some spammers are pretty damn good at their job. It does not do a great job of spotting messages delivering Malware or the tediously regular emails from bearing manufacturers.

If there was such a thing as a perfect solution, it wouldn't be by implementing just one technology. Nothing's perfect. I chose Spamassassin because it is mature, well understood, backed by Apache and easy to set up.

This is Kinda Interesting

For a good number of years I went without receiving a lot of spam. I didn't intentionally publish my email address on the web and many websites take care to obscure your email address, for example, when publishing mailing list archives.

I was intrigued by Steve Gibson's assertion in Security Now #557 that it takes multiple years before your email address really gets on Spammers' radar. He changes his email address once a year:
And something as simple as changing your email address loses spam. That is, it's just gone. And you might think that, oh, it's going to find you again within a week or two. No. It takes, I can attest to this, years, multiple years.
I'd naively held the belief that Spammers found your domain name and then worked through a list of common names to mail to (michael@, john@, chris@, etc). Perhaps they do do that, but it seems that scraping websites and database dumps is the most common and less time-wasting way of building a list of recipients.

In 2015 the level of spam quickly started to get out of control for me. Malware especially was really flooding in and even though I am largely Linux and Android focussed, constantly deleting spam made checking my email a tedious, rather than fun, task.

Plus one for Thunderbird however. I took the time to train Thunderbird's junk mail handling and it is really good. However, as I check email on my phone (most of the time), Thunderbird's junk handling wasn't going to help unless I always had it running in the background on some workstation, somewhere.

The spam was pouring in and since I was redirecting (aliasing via /etc/aliases) some mailboxes to Gmail accounts, I was ending up with a large queue of frozen messages because Gmail was not happy to handle redirected Spam. If you want a Spam free mailbox, there's arguably nothing better than Gmail. I was worried about the potential damage I was doing to my mail server's reputation with Gmail by reflecting Spam straight to Gmail.

Enter Spamassassin.

Exim and Spamassassin Integration

I recommend you first review the debian exim wiki. I did find it to be wrong in places when I referred to it. There was some Exim documentation about acl actions that also informed my config. I'll be providing a multi mail-domain example because I do handle mail for multiple domains. The following example will work for one mail domain or many.

I use exim4 split config files because ... that's what everyone else does.

In brief, the debian exim wiki tells you to do the following:
#apt-get install spamassassin
If you are using Debian Jessie or later (with systemd enabled by default), enable and start the service using systemctl:
#systemctl enable spamassassin.service
#systemctl start spamassassin.service
On earlier Debian releases, edit /etc/default/spamassassin ...
ENABLED=1 
...and then start the daemon.
#/etc/init.d/spamassassin start
At this point I found divergences between what the documentation tells you to do and what works in reality. The "add_header" did not work for me, following the wiki instructions. Here's how I set it up:

/etc/exim4/conf.d/acl/40_exim4-config_check_data
# warn
#   spam = Debian-exim:true
#   message = X-Spam_score: $spam_score\n\
#             X-Spam_score_int: $spam_score_int\n\
#             X-Spam_bar: $spam_bar\n\
#             X-Spam_report: $spam_report
#
# put headers in all messages (no matter if spam or not)
  warn  spam = Debian-exim:true
      add_header = X-Spam-Score: $spam_score ($spam_bar)
# add second subject line with *SPAM* marker when message
# is over threshold
  drop  spam = Debian-exim
#      add_header = Subject: ***SPAM (score:$spam_score)*** $h_Subject:
Important Points:
  • The debian docs "Subject" manipulation simply did not work for me. Refer to the "Rewriting Subject" section further below.
  • The debian docs used "nobody" as the user, I changed this to Debian-exim. Using "nobody" gets you all kinds of painful log messages.
  • The debian docs used "add_header = X-Spam-Report: $spam_report" on all messages. This resulted in a message in the header of every email saying that the email had been detected as spam, regardless of the score.
  • I do still insert the X-Spam-Score header in every message.
  • I'm dropping anything over the Spamassassin threshold (required_score). The incoming message will be "rejected after DATA".
You will want to tinker with the threshold on dropped messages. 8 is too high, but it's better to start high and then inspect the score on the spam that makes it through. The bulk of spam appears to get very high scores, but between 4 and 5 there is a crossover between legitimate email and spam.

You also receive spam that gets scores as low as 1 and it's impossible to filter at that level without losing a lot of legitimate email.

You can tinker with the required score in /etc/spamassassin/local.cf. The default at the time of writing is 5, which I think is about right. To start high, as suggested above, set it like this:
required_score 8.0
It's important to remember that the delivery agent is going to get a hard fail when a message scores over the required_score. It probably won't come back for a second try. A slighted mailing list server, for example, may mark your address as a hard fail and remove your subscription.

The rewrite_header in the Spamassassin config is meaningless because Exim is handling the mail and just asking Spamassassin for its opinion on the spam score. Other elements in the Spamassassin file are relevant to scoring the message.

That's it! That's all you need to do.

Rewriting Subject

I didn't implement this because I elected to dump the high scoring messages and write (for every message) the X-Spam-Score to the headers.

Reviewing the Efficacy

You really must spend days or weeks checking in with your Exim logs in addition to reviewing the Spam messages that slip through.
  • When looking at the spam that hits your inbox, take a look at the X-Spam-Score header that was written in by Spamassassin. View the message source to see the headers.
  • Don't be confused by fake headers added by the spammer, such as fake Spam Score information.
  • There is often false information in the headers about being checked by this or that antivirus software.
  • Message headers should be read from bottom to top. Each mail agent prepends its headers to the top of the message as the message bounces around mailserver to mailserver.
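To read the hops in the order they actually happened, a few lines of Python with the standard email module will do. This is only a sketch: the function name is made up, and it trusts whatever Received headers are present, including any forged ones at the bottom of the stack.

```python
from email import message_from_string


def received_chain(raw_message):
    """Return the Received headers oldest hop first.

    Each server prepends its own Received header, so the list as stored
    in the message is newest first; reversing it gives the travel order.
    """
    msg = message_from_string(raw_message)
    hops = msg.get_all("Received", [])
    # Collapse the whitespace of folded (multi-line) headers.
    return [" ".join(hop.split()) for hop in reversed(hops)]
```

The last entry in the returned list is the header written by your own mailserver, which is the only one you can fully trust.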
Review the Exim reject log. Here's what you should see when things are working:
# tail -vf /var/log/exim4/rejectlog 
2016-07-06 18:41:41 1bKptM-0006PS-7x H=208-180-142-165.chstcmtk01.com.sta.suddenlink.net [208.180.142.165] F=<xxx@swisslens.com> rejected after DATA
Envelope-from: <xxx@swisslens.com>
Envelope-to: <xxx@moff.tech>
P Received: from 208-180-142-165.chstcmtk01.com.sta.suddenlink.net ([208.180.142.165])
        by moff.tech with smtp (Exim 4.84_2)
        (envelope-from <xxx@swisslens.com>)
        id 1bKptM-0006PS-7x
        for xxx@moff.tech; Wed, 06 Jul 2016 18:41:40 +0200
  Date: Wed, 06 Jul 2016 14:35:35 -0300
F From: "CamilleHot" <xxx@mndistaog.org>
R Reply-To: "CamilleHot" <xxx@mndistaog.org>
  X-Priority: 3 (Normal)
I Message-ID: <75461.14671320@mndistaog.org>
T To: xxx@moff.tech
  Subject: Come here! I want to make love to you
  MIME-Version: 1.0
  Content-Type: multipart/alternative;
        boundary="605311998876970"
  X-Spam-Score: 18.3 (++++++++++++++++++)

Notice the X-Spam-Score of 18.3 - high but not off the charts. Let's review the kind of scores we've recently seen:
# grep X-Spam-Score /var/log/exim4/rejectlog
  X-Spam-Score: 7.5 (+++++++)
  X-Spam-Score: 7.2 (+++++++)
  X-Spam-Score: 11.8 (+++++++++++)
  X-Spam-Score: 8.2 (++++++++)
  X-Spam-Score: 20.0 (++++++++++++++++++++)
  X-Spam-Score: 14.2 (++++++++++++++)
  X-Spam-Score: 18.4 (++++++++++++++++++)
  X-Spam-Score: 15.4 (+++++++++++++++)
  X-Spam-Score: 5.1 (+++++)
  X-Spam-Score: 16.8 (++++++++++++++++)
  X-Spam-Score: 13.6 (+++++++++++++)
  X-Spam-Score: 6.5 (++++++)
  X-Spam-Score: 14.2 (++++++++++++++)
  X-Spam-Score: 9.5 (+++++++++)
  X-Spam-Score: 18.5 (++++++++++++++++++)
  X-Spam-Score: 18.3 (++++++++++++++++++)
  X-Spam-Score: 6.1 (++++++)
In the period where you're still finding the right score level, I would recommend going back and looking at the headers on the 5.1 message: use "less" to simply view the file and search for the score string.

In fact I discovered, as I wrote this, that the 5.1 in the example above was a legitimate email about a delivery I was waiting on. Oh dear, perhaps I'll nudge the required_score up to 5.1. In my experience, 5.5 is too high.
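Finding those borderline rejections by eye gets tedious. A small Python sketch such as this can list the rejected scores that sit just above the threshold, which are the ones most worth checking for legitimate mail. The function names and the default margin of 1.5 points are arbitrary choices for illustration.

```python
import re

# Matches the header exim4 writes, e.g. "X-Spam-Score: 5.1 (+++++)".
SCORE_RE = re.compile(r"X-Spam-Score:\s*(-?\d+(?:\.\d+)?)")


def extract_scores(lines):
    """Return every X-Spam-Score value found in the given lines."""
    scores = []
    for line in lines:
        match = SCORE_RE.search(line)
        if match:
            scores.append(float(match.group(1)))
    return scores


def near_threshold(scores, threshold=5.0, margin=1.5):
    """Return the scores within `margin` points above the threshold, sorted."""
    return sorted(s for s in scores if threshold <= s <= threshold + margin)
```

Pass it the lines of /var/log/exim4/rejectlog, then search the log for each returned score with "less" to inspect the full headers.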

It's easy to review the scores on the messages in your mailboxes. Scores can actually be negative; the lowest I've noticed is -11.89:

Your current inbox:
# grep X-Spam-Score /var/mail/you
Other folders:
# grep X-Spam-Score /home/you/mail/Trash

Keep looking at the Exim logs. Be curious. Learn what the headers mean. Tinker. It's fascinating stuff.

References

http://michaelfranzl.com/2015/02/11/exim-spamassassin-rewriting-subject-lines-adding-spam-score
https://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Conf.html
http://www.exim.org/exim-html-current/doc/html/spec_html/ch-access_control_lists.html
http://www.exim.org/exim-html-current/doc/html/spec_html/ch-content_scanning_at_acl_time.html
https://www.maretmanu.org/homepage/inform/exim-spam.php