In part 1 of this series, we explained what DMARC is and why it is part of the solution to leveling up your email security architecture against emerging threats. In part 2, we explored the security challenges and relevant threat vectors around using multi-tenant cloud email senders and how they overlap with the protections offered by implementing the DMARC protocol.
In this final part, we will look at the challenges of deploying and maintaining a compliant policy over the lifetime of your business. For simplicity, we will focus on the SPF protocol, which DMARC relies on in part to ensure the authenticity of your outbound communication.
What is SPF?
SPF as a standard predates DMARC and is relied upon to provide one of the authenticity signals for DMARC compliant messages. On the surface, it is a simple standard that ultimately boils down to a whitelist of IP addresses, served over DNS, that are allowed to originate email for your domain. As a result, it is usually everyone’s first port of call when implementing email authentication [1].
An SPF record is advertised on your domain. For example, let’s use correcthorsebatterystaple.com.
The owner of this domain demonstrates to the world that they have control of the domain by advertising a special type of DNS record called a TXT record on the domain. This is a commonly used verification mechanism. If you have ever set up Google Analytics, you may have done this before [2].
Similarly, the SPF record is advertised on correcthorsebatterystaple.com as a TXT record. It tells the world which mail servers, identified by their IP addresses, the owner or an authorised person for this domain allows to send email that comes from correcthorsebatterystaple.com. At its core, the TXT record is a list of these IP addresses in a machine-readable format.
For example, v=spf1 ip4:35.190.247.3 ~all is a simple instruction to allow all email originated by the server that has the IP address 35.190.247.3. You can tack on additional IP addresses and build a static list of every server you want to authorize. So far, so straightforward.
However, the real world gets in the way. As we discussed in part 2 of this series, most people no longer authorize servers in their own data centers. Most actual mail sending is handled by servers on multi-tenant providers that usually provide a service further up the messaging value stack. While it is possible to ask each supplier for a list of IPs so they can be specified in the customer’s domain, this workflow is fragile and impractical. To improve on this, SPF has an include mechanism that allows the owner of the domain to delegate the authorization of specific IP addresses to someone else. In this way, the owner of correcthorsebatterystaple.com can say they want Google to specify the IP addresses by adding an include mechanism, e.g. include:_spf.google.com. In principle, after setting this the owner never has to worry about the specifics, as they have handed off the job of setting and maintaining this to Google, who is really best placed to supply and maintain this detail.
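To make the record format concrete, here is a minimal Python sketch that splits an SPF record into its terms and checks a sending IP against the static ip4 entries. The function names parse_spf and ip4_allowed are our own, chosen for illustration. It deliberately ignores include and every other mechanism, so it is not a full SPF evaluator.

```python
import ipaddress

def parse_spf(record):
    """Split an SPF record into (qualifier, mechanism, value) tuples."""
    version, *terms = record.split()
    if version.lower() != "v=spf1":
        raise ValueError("not an SPF record")
    parsed = []
    for term in terms:
        qualifier = "+"  # "+" (pass) is the default when none is written
        if term[0] in "+-~?":
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        parsed.append((qualifier, name.lower(), value))
    return parsed

def ip4_allowed(record, sender_ip):
    """True if sender_ip matches an ip4 entry that carries a pass qualifier.

    Simplification: only ip4 terms are inspected; include, a, mx and the
    rest are skipped, and left-to-right ordering is not modelled.
    """
    ip = ipaddress.ip_address(sender_ip)
    for qualifier, name, value in parse_spf(record):
        if name == "ip4" and qualifier == "+":
            if ip in ipaddress.ip_network(value, strict=False):
                return True
    return False

record = "v=spf1 ip4:35.190.247.3 ~all"
terms = parse_spf(record)
listed = ip4_allowed(record, "35.190.247.3")
unlisted = ip4_allowed(record, "35.190.237.3")
```

Here terms holds the two parsed entries (the ip4 address and the trailing ~all), listed is True and unlisted is False.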
Errors in SPF records are more common than you think
What if the provider makes a mistake in forming their record? Remember, this record is interpreted by machines and has to conform to a strict specification, most recently defined in RFC 7208. If the receiver encounters an error during evaluation, it is expected to stop evaluating SPF and return a specific type of error (RFC 7208 calls these permerror and temperror). Before DMARC, these errors were considered hints. In most cases, mail would get delivered, but the email might get a warning in the user’s UI or, in the worst case, be sent to spam. After you deploy a DMARC policy of ‘reject’, these errors mean email goes nowhere.
We ran an analysis of the primary domains of around 5,000 businesses and found SPF errors in around 10% of them, so this is not an uncommon occurrence. Some of the errors are easy to understand.
For example, one domain had removed a sending service but had forgotten to clean up the SPF record properly. It now reads v=spf1 ip4:35.190.247.3 include: ip4:35.190.237.3 ~all. That include: just dangles there and breaks everything after it, and the IP address that follows (35.190.237.3) will never be allowed to send email.
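A dangling term like this can be caught before the record is published. The following is a minimal sketch of such a check, assuming a hypothetical helper called lint_spf. It only flags mechanisms that are missing their value and does not attempt to validate the record against the whole of RFC 7208.

```python
# Mechanisms that are meaningless without a value after the colon.
NEEDS_VALUE = {"include", "ip4", "ip6", "exists"}

def lint_spf(record):
    """Return a list of human-readable problems found in one SPF record."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return ["record does not start with v=spf1"]
    problems = []
    for term in terms[1:]:
        # Drop a leading qualifier (+, -, ~ or ?) if one is present.
        body = term[1:] if term[0] in "+-~?" else term
        name, _, value = body.partition(":")
        if name.lower() in NEEDS_VALUE and not value:
            problems.append(f"'{term}' has no value after the mechanism name")
    return problems

broken = "v=spf1 ip4:35.190.247.3 include: ip4:35.190.237.3 ~all"
issues = lint_spf(broken)
clean = lint_spf("v=spf1 ip4:35.190.247.3 ~all")
```

For the broken record above, issues contains a single entry pointing at the empty include: term, while clean is an empty list.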
In another instance, a domain owner tries to authorize Office 365 with an entry that reads include:spf.prod.outlook.com. Unfortunately, the correct entry is include:spf.protection.outlook.com. There is no SPF entry at spf.prod.outlook.com (technically it is an NXDOMAIN), so everything after that incorrectly specified include is a failure.
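To see why one bad include poisons the rest of the record, it helps to sketch how a receiver walks the terms from left to right. The code below is a heavily reduced take on the check_host() function described in RFC 7208, supporting only ip4, include and all. The lookup parameter is a hypothetical callable that returns the TXT strings for a name, or None if the name does not exist. The zone contents, including the 192.0.2.0/24 range, are made up for illustration and are not Microsoft's real record.

```python
import ipaddress

RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

def check_host(ip, domain, lookup, depth=0):
    """Evaluate ip against domain's SPF record (ip4, include, all only)."""
    if depth > 10:  # crude loop guard, not the RFC's DNS lookup limit
        return "permerror"
    txts = lookup(domain) or []
    records = [t for t in txts if t.lower().split()[:1] == ["v=spf1"]]
    if not records:
        return "none"
    if len(records) > 1:
        return "permerror"
    for term in records[0].split()[1:]:
        qualifier = "+"
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name == "all":
            return RESULTS[qualifier]
        elif name == "ip4":
            try:
                network = ipaddress.IPv4Network(value, strict=False)
            except ValueError:
                return "permerror"
            if ipaddress.ip_address(ip) in network:
                return RESULTS[qualifier]
        elif name == "include":
            if not value:
                return "permerror"
            inner = check_host(ip, value, lookup, depth + 1)
            if inner == "pass":
                return RESULTS[qualifier]
            # A missing or broken included record aborts the whole evaluation.
            if inner in ("none", "permerror"):
                return "permerror"
            if inner == "temperror":
                return "temperror"
        else:
            raise NotImplementedError(f"'{name}' is outside this sketch")
    return "neutral"

owner = "correcthorsebatterystaple.com"
typo_zone = {owner: ["v=spf1 include:spf.prod.outlook.com ip4:35.190.247.3 ~all"]}
fixed_zone = {
    owner: ["v=spf1 include:spf.protection.outlook.com ip4:35.190.247.3 ~all"],
    "spf.protection.outlook.com": ["v=spf1 ip4:192.0.2.0/24 -all"],  # made up
}
with_typo = check_host("35.190.247.3", owner, typo_zone.get)
with_fix = check_host("35.190.247.3", owner, fixed_zone.get)
```

With the mistyped include, with_typo is permerror even though 35.190.247.3 is listed later in the same record, because evaluation stops at the include. With the corrected name, the include simply does not match and with_fix is pass.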
When we test these errors by emulating these broken records and sending emails to common cloud mailboxes such as outlook.com or gmail.com, we see them fail SPF and send mail to spam or outright refuse to accept it. It is no surprise that IT administrators and end users struggle to understand why email behaves the way it does.
How can you prevent these errors?
As you might expect, there are a number of tools that will check these SPF records and verify that a mistake has not been made. This gives the owner the ability to correct them. However, when the owner doesn’t have direct control but is simply including something where someone else has made a mistake, the remediation can be more problematic. Something that was correct once can also break in the future. One common example is including a service that goes away. While we have never spotted Google or Microsoft make a mistake, the same can’t be said about other providers. Anyone who included zohocrm.com as of 2nd Dec 2019 would have seen evaluation break, because it in turn included a service called localtransmail.net, which was no longer returning a valid SPF record [3].
So fixing your SPF at a specific point in time does not guarantee that it will continue working. You can monitor your SPF for correct evaluation through services, or through scripts if you want to roll your own, but when entries break, some percentage of your mail will stop getting delivered as intended while you remediate.
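If you do roll your own, the core of such a monitoring script is a walk over the whole include tree that reports every branch ending in a missing record. The sketch below assumes a hypothetical lookup callable that returns the TXT strings for a name, or None if the name does not exist. A real script would back it with a DNS resolver and run it on a schedule. The domain names in the example are placeholders, not the records from the incident described above.

```python
def find_broken_includes(domain, lookup, path=()):
    """Walk the include tree under domain and list broken branches.

    Returns (chain, reason) tuples, where chain is the sequence of
    domains that leads from the top-level record to the problem.
    """
    chain = path + (domain,)
    if domain in path:
        return [(chain, "include loop")]
    txts = lookup(domain) or []
    records = [t for t in txts if t.lower().split()[:1] == ["v=spf1"]]
    if len(records) != 1:
        reason = "no SPF record" if not records else "multiple SPF records"
        return [(chain, reason)]
    problems = []
    for term in records[0].split()[1:]:
        body = term[1:] if term[0] in "+-~?" else term
        name, _, value = body.partition(":")
        if name.lower() == "include":
            if value:
                problems += find_broken_includes(value, lookup, chain)
            else:
                problems.append((chain, "include with no target"))
    return problems

# Placeholder zone data: a provider's record includes a retired service.
zones = {
    "example.com": ["v=spf1 include:crm.example.net ~all"],
    "crm.example.net": ["v=spf1 include:gone.example.org ~all"],
}
report = find_broken_includes("example.com", zones.get)
```

The report pinpoints the third-party include two levels down, which the domain owner cannot see by looking at their own record alone.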
Managing all of this in a manner that is resilient to human error and technical failure is hard work. At Red Sift, we care deeply about automating the things we don’t believe you should be worrying about. Our OnDMARC compliance tool includes a hosted SPF solution that does all the work for you with every single request. If we detect errors anywhere in your authorization tree, we clean them up and always present an error-free and reliable record. We call it DynamicSPF. It is trusted to authenticate SPF by some of the largest B2B and B2C organisations in the world and does everything it can to keep mail flowing. We believe every organisation that is looking to achieve, and more importantly maintain, DMARC compliance needs a solution to the challenges underlying SPF. We feel so strongly about this that we include DynamicSPF in every OnDMARC account, even our free tier.
Footnotes
[1] Alternatively, one may choose to implement DKIM which uses private key/public key signing on every message to signal that the message was originated by a server that has access to the private key associated with the public key that is advertised on your domain. If it sounds a little complex, that is because it is. While it offers some benefits, managing keys, rotation and canonicalization of the message fields you choose to sign requires some expertise and is best handled by a high quality cloud email provider. For those who need to manage this themselves, we provide deep investigative tooling that can help with DKIM as we encourage everyone who is on a DMARC deployment project to provide the highest level of authentication for every message they can as it can improve the deliverability heuristics used by most recipient inboxes.
[2] Google tells you to put some magic values up on these domains so they can verify that the person who has created the Google Analytics account is also the person who has access to or is authorised by the correcthorsebatterystaple.com domain owner to receive reporting for this domain. This is important as it provides a layer of security and prevents some threat vectors that would allow a malicious actor to set up a bogus Google Analytics account and receive analytics data for a domain/company that they are not authorised for. If you think analytics data can’t be that important, imagine if this was a large ecommerce site that was being targeted for competitive intelligence or a publicly listed company that was being traded based on stolen realtime performance analytics. Practically all data in 2020 matters.
[3] This has since been fixed but was observed for a significant number of days.