September 2025 update: This was originally posted to LinkedIn in July 2025, but I'm reposting it here to have most of my technical writing in one place.
A couple years ago, I wrote about configuring a comprehensive email security stack for my personal domain, including SPF, DKIM, DMARC, and MTA-STS. I considered it a classic "set-and-forget" project.
But "set-and-forget" can easily become "complacency", and complacency is the enemy of security and reliability. A recent deep dive into my domain's configuration revealed a subtle but critical flaw, not in my setup, but in my provider's. This discovery kicked off a journey to migrate my entire DNS infrastructure with zero downtime, and in the process, build a more robust and modern architecture.
This project became a lesson in the importance of continuous verification and choosing transparent, standards-compliant tools.
The Trigger: When Secure Isn't Truly Secure
My journey started with a routine DNSSEC audit using the DNS visualization tool DNSViz. While most checks passed, I discovered a recurring failure in Authenticated Denial of Existence. In simple terms, my DNS provider's nameservers could prove when a record existed, but they couldn't provide a valid cryptographic proof when a record did not exist.
This is a serious flaw, as it undermines the trust model of DNSSEC and exposes the domain to potential cache poisoning and downgrade attacks.
When a support interaction confirmed a fix wasn't forthcoming, it highlighted the risk of relying on opaque services where deep diagnostics are unsupported. I decided to migrate my DNS hosting from Squarespace to Cloudflare for a robust, standards-compliant foundation.
The Blueprint: Engineering a Zero-Downtime Switch
A DNS migration is like performing surgery on a live system. The goal is to ensure that no service (web traffic, email delivery, or authentication) is disrupted. Because my previous provider also handled web and email forwarding, the migration required a detailed plan to replicate those services on the new platform before the cutover.
1. Mitigating DNS & Policy Caching Risks:
As a preparatory step, I lowered the DNS TTLs for my DMARC, MTA-STS, and SPF records ahead of time. Combined with reducing my MTA-STS policy's max_age, this kept the cutover agile: any correction would propagate quickly instead of waiting out long-lived cached data.
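How long to wait after lowering these values depends on the old values, not the new ones: a resolver that cached a record just before the change may keep it for the full old TTL. A minimal Python sketch of that arithmetic follows. The function name, the timestamp, and the TTL and max_age figures are illustrative assumptions, not values from my zone, and the result is a conservative upper bound (senders that notice a changed MTA-STS policy id refresh sooner).

```python
from datetime import datetime, timedelta, timezone


def earliest_safe_cutover(lowered_at, old_ttl_seconds, old_max_age_seconds=0):
    """Return the time after which data cached under the OLD values has expired.

    Conservative: waits for the longer of the old DNS TTL and the old
    MTA-STS max_age, counted from the moment the values were lowered.
    """
    wait = max(old_ttl_seconds, old_max_age_seconds)
    return lowered_at + timedelta(seconds=wait)


# Hypothetical numbers: TTL was one day, max_age was one week.
lowered_at = datetime(2025, 7, 1, 12, 0, tzinfo=timezone.utc)
print(earliest_safe_cutover(lowered_at, 86_400, 604_800))
# 2025-07-08 12:00:00+00:00
```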
2. Taming the Transitional State with MTA-STS:
My biggest concern was inbound email. My domain uses MTA-STS, which tells sending mail servers to deliver only to the MX hosts listed in my policy, and only over TLS with a valid certificate. Simply changing my MX records would violate my own policy during the DNS propagation window, because senders holding the cached policy would refuse the new, unlisted hosts. The solution was to implement a transitional MTA-STS policy that explicitly authorized both the old and new mail servers, making the migration invisible to compliant senders and preventing any rejected emails.
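A transitional policy of this kind can be sketched in Python as below. The policy file layout (version, mode, mx, max_age) follows RFC 8461, but the hostnames are placeholders rather than my providers' real MX names, and the helper names are hypothetical. The matcher applies the RFC's rule that a leading "*." covers exactly one left-most label.

```python
def build_mta_sts_policy(mx_patterns, max_age, mode="enforce"):
    """Render the text served at /.well-known/mta-sts.txt."""
    lines = ["version: STSv1", f"mode: {mode}"]
    lines += [f"mx: {pattern}" for pattern in mx_patterns]
    lines.append(f"max_age: {max_age}")
    return "\n".join(lines) + "\n"


def mx_allowed(mx_host, mx_patterns):
    """Check whether an MX hostname is covered by any policy pattern."""
    host = mx_host.rstrip(".").lower()
    for pattern in mx_patterns:
        pattern = pattern.lower()
        if pattern.startswith("*."):
            suffix = pattern[1:]  # e.g. ".mx.new-provider.example"
            label = host[: -len(suffix)] if host.endswith(suffix) else ""
            # The wildcard stands for one non-empty label with no dots in it.
            if label and "." not in label:
                return True
        elif host == pattern:
            return True
    return False


# Placeholder hostnames: one entry for the old provider, one for the new.
transitional = ["mx.old-provider.example", "*.mx.new-provider.example"]
print(build_mta_sts_policy(transitional, max_age=86_400))
```

Once the old MX records are gone and the transitional max_age has elapsed, the old entry can be dropped. Each policy change also needs a new id value in the _mta-sts TXT record so that senders fetch the updated file.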
3. Merging Authentication with a Unified SPF Record:
My new setup required SPF authorizing both SendGrid (for outbound) and Cloudflare (for inbound forwarding). This meant merging them into a single, unified SPF record:
v=spf1 include:sendgrid.net include:_spf.mx.cloudflare.net ~all
This is a small but critical detail that, if overlooked, would have broken DMARC alignment for one of my mail streams.
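A domain may publish only one SPF record, so the per-provider records have to be combined rather than added side by side. The merge can be sketched in Python as follows. The two input records are assumptions about what each provider would suggest on its own, and the helper names are hypothetical. The second helper counts only top-level terms against RFC 7208's limit of 10 DNS lookups; it does not follow nested includes.

```python
def merge_spf(records, final_all="~all"):
    """Combine several SPF records into one, keeping a single 'all' at the end."""
    mechanisms = []
    for record in records:
        terms = record.split()
        if not terms or terms[0].lower() != "v=spf1":
            raise ValueError(f"not an SPF record: {record!r}")
        for term in terms[1:]:
            if term.lstrip("+-~?").lower() == "all":
                continue  # re-added once, at the very end
            if term not in mechanisms:
                mechanisms.append(term)
    return " ".join(["v=spf1", *mechanisms, final_all])


def top_level_lookup_terms(record):
    """Count terms in this record that each trigger a DNS lookup."""
    count = 0
    for term in record.split()[1:]:
        name = term.lstrip("+-~?").lower()
        bare = name.split(":")[0].split("/")[0]
        if name.startswith(("include:", "exists:", "redirect=")) or bare in ("a", "mx", "ptr"):
            count += 1
    return count


merged = merge_spf([
    "v=spf1 include:sendgrid.net ~all",            # assumed outbound record
    "v=spf1 include:_spf.mx.cloudflare.net ~all",  # assumed forwarding record
])
print(merged)
# v=spf1 include:sendgrid.net include:_spf.mx.cloudflare.net ~all
print(top_level_lookup_terms(merged))
# 2
```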
4. Replicating Services with Explicit Rules:
- Web Forwarding: A Cloudflare Page Rule was configured to redirect web traffic to my LinkedIn profile.
- Email Forwarding: Cloudflare Email Routing was set up to manage all inbound mail, including a catch-all address.
Revisiting a Critical Detail: SPF, Forwarding, and ARC
In my previous article, I didn't address a critical detail: how email forwarding inherently breaks SPF alignment because the final recipient sees the forwarder's IP, not the original sender's.
Authenticated Received Chain (ARC) addresses this by providing a verifiable "chain of custody." An ARC-aware forwarding service validates the original SPF and DKIM results upon receipt. It then records those results and signs them by adding its own set of headers (ARC-Authentication-Results, ARC-Message-Signature, and ARC-Seal) to the message before forwarding it. The seal essentially tells the final recipient, "When this message arrived at my server, its original authentication was intact." A receiver such as Gmail that trusts the forwarder can then follow the audit trail and accept the message despite the final SPF alignment failure.
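To make the header structure concrete, here is a small Python sketch that checks whether a message carries a structurally complete ARC chain: every hop i from 1 upward must contribute all three ARC headers defined in RFC 8617. It does not verify any signature, so it is not a substitute for a real ARC validator. The sample message, its domains, and the function names are made up for illustration.

```python
import re
from email import message_from_string

ARC_HEADERS = ("ARC-Authentication-Results", "ARC-Message-Signature", "ARC-Seal")


def arc_instances(raw_message):
    """Map each ARC instance number (the i= tag) to the ARC headers found for it."""
    msg = message_from_string(raw_message)
    found = {}
    for name in ARC_HEADERS:
        for value in msg.get_all(name, []):
            match = re.search(r"\bi\s*=\s*(\d+)", value)
            if match:
                found.setdefault(int(match.group(1)), set()).add(name)
    return found


def arc_chain_is_structurally_complete(raw_message):
    """True if instances 1..N exist with no gaps and each has all three headers."""
    found = arc_instances(raw_message)
    if not found:
        return False
    expected = set(range(1, max(found) + 1))
    return set(found) == expected and all(found[i] == set(ARC_HEADERS) for i in expected)


# Fabricated single-hop example with placeholder signature values.
sample = (
    "ARC-Seal: i=1; a=rsa-sha256; cv=none; d=forwarder.example; s=sel; b=abc\n"
    "ARC-Message-Signature: i=1; a=rsa-sha256; d=forwarder.example; s=sel; h=from; bh=x; b=def\n"
    "ARC-Authentication-Results: i=1; forwarder.example; spf=pass; dkim=pass\n"
    "From: sender@origin.example\n"
    "\n"
    "body\n"
)
print(arc_chain_is_structurally_complete(sample))
# True
```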
My goal for this migration wasn't just to fix the immediate DNSSEC bug, but to build a more resilient and transparent architecture. Ensuring I have an explicit, standards-compliant implementation of ARC was a key part of achieving that.
The Lesson in Restraint: Why I Chose Not to Implement DANE
With a fully working DNSSEC setup, the next logical step seemed to be DANE, a standard that uses DNSSEC-signed TLSA records to bind a server's TLS certificate or public key to its DNS name. However, after researching it, I made a conscious decision not to implement it.
DANE requires a stable, predictable link between a DNS TLSA record and a server's certificate. This is incredibly fragile when using large, shared email providers who rotate their certificates on their own schedule. Implementing DANE in this environment would lead to inevitable, hard-to-debug delivery failures.
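The fragility is easy to see in a short Python sketch. A TLSA record with parameters "3 0 1" (DANE-EE, full certificate, SHA-256) publishes a digest of the exact certificate, so any reissued certificate stops matching until the DNS record is updated. The byte strings below are placeholders standing in for real DER-encoded certificates, and the function names are hypothetical.

```python
import hashlib


def tlsa_full_cert_sha256(cert_der):
    """Build TLSA record data for '3 0 1': SHA-256 over the whole certificate."""
    return "3 0 1 " + hashlib.sha256(cert_der).hexdigest()


def tlsa_matches(published_rdata, presented_cert_der):
    """Compare a published TLSA value with the certificate a server presents."""
    return published_rdata.lower() == tlsa_full_cert_sha256(presented_cert_der)


# Placeholder bytes, not real certificates.
old_cert = b"DER bytes of the certificate in use when the TLSA record was published"
new_cert = b"DER bytes after the provider rotated its certificate"

published = tlsa_full_cert_sha256(old_cert)
print(tlsa_matches(published, old_cert))  # True
print(tlsa_matches(published, new_cert))  # False
```

Pinning only the public key (selector 1) survives renewals that reuse the same key, but that still depends on the provider's key-rotation schedule, which a customer of a shared mail service does not control.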
This reinforced a lesson in technical maturity: the goal isn't to implement every standard, but to architect a solution using the most appropriate standards for the environment. For my architecture, DNSSEC + MTA-STS provides the highest level of practical and robust transport security.
DANE is ideally suited for vertically-integrated environments where an organization controls the entire stack. For architectures leveraging large-scale cloud providers, its operational fragility currently outweighs its benefits.
Conclusion: From Complacency to Confidence
This journey from diagnosing a subtle bug to executing a multi-faceted migration was a powerful reminder that in modern system administration, there is no finish line. Platforms evolve, standards improve, and our understanding deepens. By embracing continuous verification and choosing transparent, standards-compliant tools, we can move beyond "set-and-forget" to build infrastructure that is not just functional, but resilient.