The Config Report

Infrastructure as Code: Configuration Drift Is Killing Your Network

IaC Series – Issue 4 of 6: When Your Network Becomes Software

JJ – Chief Packet Pusher
Apr 27, 2026
∙ Paid

🧨 Everything Was Fine… Until It Wasn’t

Everything was working yesterday.

No alerts.
No complaints.
No tickets titled “URGENT: THE INTERNET IS BROKEN AGAIN.”

Then today?

  • One site can’t reach another

  • A firewall rule “definitely exists” (but somehow doesn’t work)

  • That VLAN you swear was there… is gone

  • And nobody touched anything… apparently

Yeah. Sure.

Welcome to configuration drift — the silent killer of stable networks.


🧠 What Configuration Drift Actually Is

Configuration drift is what happens when your intended configuration and your actual configuration slowly diverge over time.

Not because of one big change…

…but because of hundreds of tiny ones:

  • “Quick fix” CLI changes

  • Emergency firewall rules

  • Someone tweaking a port at 2 AM

  • A “temporary” change that became permanent

  • That one engineer who “just logs into the box real quick”

Multiply that across your environment and you get:

“Mostly consistent… except for all the parts that aren’t.”


🧟 The Real Problem: Drift Feels Invisible

Here’s why drift is dangerous:

It doesn’t break things immediately.

It builds slowly. Quietly.

Until one day something depends on a configuration that used to be true…

…and now it isn’t.

That’s when you get:

  • “It works in one site but not another”

  • “It worked last week”

  • “QA works, prod doesn’t”

  • “Same config… I think?”

Spoiler: it’s not the same config.


🔥 Real-World Drift Examples

Let’s be honest… you’ve seen at least one of these recently:

Firewall drift
That rule exists… just not on the device you need.

Switch drift
VLAN on the core? Yes.
On the access switch? Not even close.

Cloud drift
Terraform says one thing.
The cloud console says another.

Guess which one is actually running?

“Temporary fix” drift
“We’ll remove that later.”
We did not remove that later.
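
To make the switch example concrete, here is a minimal Python sketch. It assumes you can already collect the VLAN IDs configured on each device; how you collect them is up to you. The device names and VLAN numbers are made up for illustration. It reports which intended VLANs are missing and which unexpected ones have shown up.

```python
# Hypothetical data: in practice this would come from your own
# collection method (CLI scrape, API call, Ansible facts, etc.).
INTENDED_VLANS = {10, 20, 30}


def vlan_drift(intended, actual_by_device):
    """Return {device: {"missing": [...], "extra": [...]}} for drifted devices only."""
    report = {}
    for device, actual in actual_by_device.items():
        missing = sorted(intended - actual)  # should exist, but does not
        extra = sorted(actual - intended)    # exists, but nobody asked for it
        if missing or extra:
            report[device] = {"missing": missing, "extra": extra}
    return report


actual = {
    "core-sw1": {10, 20, 30},      # matches intent, so it is not reported
    "access-sw7": {10, 20, 999},   # VLAN 30 is gone, VLAN 999 appeared
}

print(vlan_drift(INTENDED_VLANS, actual))
# → {'access-sw7': {'missing': [30], 'extra': [999]}}
```

Only devices that differ from the intended set show up in the report, so an empty result means no VLAN drift.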


💀 Why Most Networks Are Drift Factories

Most environments are set up in a way that guarantees drift:

  • No true source of truth

  • Direct device changes

  • Multiple engineers doing their own thing

  • No enforced process

  • No validation after changes

It’s basically:

“Everyone just try your best and don’t break anything.”


🧩 How IaC Fixes Drift (If You Actually Use It)

Infrastructure as Code doesn’t magically fix drift…

…but it gives you control over it:

  • One source of truth (Git)

  • Repeatable deployments

  • Change visibility

  • The ability to compare intended vs actual

But none of that matters if people ignore it.
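
Comparing intended vs actual can start very small. The sketch below uses Python's standard `difflib` module to diff two config texts. It assumes the intended text comes from your Git repo and the actual text from the device; the interface config here is a made-up sample, and `config_diff` is just an illustrative helper name.

```python
import difflib


def config_diff(intended, actual):
    """Return unified-diff lines; an empty list means no drift was found."""
    return list(
        difflib.unified_diff(
            intended.splitlines(),
            actual.splitlines(),
            fromfile="intended (Git)",
            tofile="actual (device)",
            lineterm="",
        )
    )


# Made-up sample configs for illustration only.
intended = """interface Gi1/0/1
 description uplink
 switchport mode trunk"""

actual = """interface Gi1/0/1
 description uplink
 switchport mode access"""

for line in config_diff(intended, actual):
    print(line)
```

Lines starting with `-` exist only in the intended config and lines starting with `+` exist only on the device. This is a plain text comparison, so it will also flag harmless differences such as reordered lines.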


⚠️ The Part Nobody Likes Hearing

You don’t have a tooling problem.

You have a behavior problem.

You can have:

  • Ansible

  • Terraform

  • Pipelines

  • Clean repos

…and still have drift everywhere if people are:

  • Logging into devices

  • Making “just one quick change”

  • Skipping the process

IaC only works when:

Code becomes the ONLY way changes happen.

Not optional. Not “preferred.”

Required.


🧠 The Mindset Shift

Stop thinking:

“The device is the source of truth.”

Start thinking:

“The device is just a deployed artifact.”

If it drifts?

You don’t fix it manually.

You reapply the correct config from code.
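
Before you reapply, it helps to see what the reapply would actually change. This is a rough sketch that assumes flat, order-independent config lines (think global NTP or logging statements). Hierarchical blocks like interfaces need context-aware handling that it does not attempt. The function name and sample lines are hypothetical, and pushing the result to a device is left to whatever tool you already use.

```python
def remediation_plan(intended_lines, actual_lines):
    """Split drift into lines to restore and lines to remove.

    Assumption: flat, order-independent config lines. Hierarchical
    configs (interface blocks, etc.) need context-aware handling.
    """
    intended_set = set(intended_lines)
    actual_set = set(actual_lines)
    # Lines the code says should exist, but the device no longer has.
    to_restore = [line for line in intended_lines if line not in actual_set]
    # Lines on the device that the code never asked for.
    to_remove = [line for line in actual_lines if line not in intended_set]
    return to_restore, to_remove


# Made-up sample lines for illustration only.
intended = ["ntp server 10.0.0.1", "logging host 10.0.0.50", "snmp-server location DC1"]
actual = ["ntp server 10.0.0.1", "logging host 10.9.9.9", "snmp-server location DC1"]

restore, remove = remediation_plan(intended, actual)
print("restore:", restore)  # → restore: ['logging host 10.0.0.50']
print("remove:", remove)    # → remove: ['logging host 10.9.9.9']
```

The point is that the plan comes from the code in Git, not from someone's memory of what the box used to look like.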


🛠️ Quick Win You Can Do This Week

Pick ONE thing you change often:

  • Firewall rules

  • VLANs

  • Interface configs

Then:

  1. Export it

  2. Put it in Git

  3. Treat it as the source of truth

  4. Make changes there first

Even if you still apply changes manually…

you now have a baseline to diff against, and that alone makes drift visible.
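
One practical snag with step 1: raw exports often contain lines that change on every export, such as timestamps, and those create noisy Git diffs. A minimal sketch of normalizing an export before committing it is below. The patterns resemble lines that some Cisco IOS exports include, but treat them as assumptions and adjust them for your platform. The hostname and VLAN are made up.

```python
import re

# Assumed examples of volatile lines; adjust the patterns for your platform.
VOLATILE_PATTERNS = [
    re.compile(r"^! Last configuration change at "),
    re.compile(r"^! NVRAM config last updated at "),
    re.compile(r"^ntp clock-period "),
]


def normalize_config(raw):
    """Drop volatile lines, blank lines, and trailing whitespace from an export."""
    kept = []
    for line in raw.splitlines():
        line = line.rstrip()
        if not line:
            continue
        if any(pattern.search(line) for pattern in VOLATILE_PATTERNS):
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


# Two exports of the same (made-up) config taken at different times.
export_a = "! Last configuration change at 02:00:01 UTC Mon Apr 27 2026\nhostname access-sw7\n\nvlan 10\n"
export_b = "! Last configuration change at 09:15:44 UTC Tue Apr 28 2026\nhostname access-sw7\nvlan 10   \n"

print(normalize_config(export_a) == normalize_config(export_b))  # → True
```

Commit the normalized text rather than the raw export, so a diff in Git means the config really changed.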


🔒 Paid Section: How to Actually Kill Drift

Up to this point, you’ve probably realized something uncomfortable:

Your network isn’t broken… it’s just slowly drifting out of control.

And the worst part?

Most teams don’t notice until it turns into:

  • a production outage

  • a security gap

  • or a “this makes no sense” troubleshooting nightmare


💡 Here’s What We’re About to Fix

In the rest of this issue, I’m going to show you exactly how to start controlling drift using the tools you already have.

No enterprise platform required. No massive rebuild.

Just practical, real-world steps.


🔓 What You’ll Learn in the Full Issue

  • How to detect configuration drift automatically using Ansible + APIs

  • A simple drift audit workflow you can run this week

  • How to enforce “no manual changes” without slowing your team down

  • A clean, realistic Git + pipeline structure for network configs

  • A real-world drift outage scenario (and how it should have been prevented)


If you’ve ever said:

“That shouldn’t have changed…”

This is where you fix that problem for good.


