The Config Report

Infrastructure as Code: Why Most IaC Projects Fail (Even With the Right Tools)

IaC Series – Issue 6 of 6: When Your Network Becomes Software

JJ – Chief Packet Pusher
Jun 01, 2026

Every failed Infrastructure as Code project starts the same way:

“We bought the tools.”

Then six months later:

  • configs still live in spreadsheets

  • nobody trusts Git

  • production changes happen directly on the firewall anyway

  • and the “automation server” is now just a Linux VM running three broken Python scripts and a cron job named:

final_final_v2_REAL.py

Because here’s the uncomfortable truth nobody puts in vendor webinars:

Terraform didn’t fail your project.

Ansible didn’t fail your project.

Git didn’t fail your project.

Your processes failed your project.

Over the last five issues, we talked about:

  • declarative vs imperative automation

  • state files

  • configuration drift

  • immutable infrastructure

  • replacing systems instead of endlessly repairing them

But this final issue is the most important one in the entire series because it explains why so many Infrastructure as Code projects quietly die in conference rooms while everyone pretends “automation is the future.”

The tools are rarely the real problem.

The environment is.


Infrastructure as Code Doesn’t Fix Chaos

One of the biggest misconceptions in networking is this:

“If we automate it, things will get better.”

No.

If your environment is chaotic, automation just helps you break things faster and more consistently.

Infrastructure as Code is basically a magnifying glass for technical debt.

If your environment has:

  • inconsistent naming

  • undocumented VLANs

  • random firewall rule logic

  • duplicate address objects

  • mystery static routes

  • “temporary” NAT rules from 2019

  • switches configured differently for absolutely no reason

…then automation doesn’t magically clean that up.

It weaponizes it.

Because now your bad decisions deploy instantly at scale.
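
Before automating anything, it helps to measure the mess. Here is a minimal sketch of an audit for one item on that list, duplicate address objects. The object names and addresses are made up for illustration, and the dictionary stands in for whatever export your firewall or controller actually produces.

```python
from collections import defaultdict

# Hypothetical export of firewall address objects: name -> value.
# Real data would come from your own firewall or controller export.
address_objects = {
    "srv-web-01": "10.10.20.5/32",
    "WebServer1": "10.10.20.5/32",
    "hq-users": "10.10.30.0/24",
    "HQ_USERS_OLD": "10.10.30.0/24",
    "dns-primary": "10.10.1.53/32",
}


def find_duplicates(objects):
    """Group object names by value and keep only values defined more than once."""
    names_by_value = defaultdict(list)
    for name, value in objects.items():
        names_by_value[value].append(name)
    return {
        value: sorted(names)
        for value, names in names_by_value.items()
        if len(names) > 1
    }


duplicates = find_duplicates(address_objects)
for value, names in sorted(duplicates.items()):
    print(f"{value} is defined {len(names)} times: {', '.join(names)}")
```

A report like this is not automation yet. It is the cleanup list you work through before the automation has anything sane to deploy.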


The Most Dangerous Phrase in IT

“We’ll clean it up later.”

No, you won’t.

You’ll automate around it.

Then document around it.

Then build workarounds for the workaround.

Then eventually nobody remembers why VLAN 317 exists but everyone is too afraid to remove it because it might somehow control HVAC for an office in Nebraska.

Infrastructure as Code works best in environments that value:

  • standards

  • consistency

  • documentation

  • repeatability

Not:

  • vibes

  • tribal knowledge

  • and “Don’t touch that switch, Gary configured it.”


The Source of Truth Problem

Most IaC projects fail because nobody agrees on where truth actually lives.

Is it:

  • the firewall?

  • Panorama?

  • Aruba Central?

  • Git?

  • NetBox?

  • an Excel spreadsheet?

  • Steve’s OneNote notebook?

  • that Visio diagram from 2021?

Because if engineers can still make direct production changes outside the code workflow…

…your Infrastructure as Code project is already drifting.

That’s the hard truth.

You cannot have Git-based infrastructure AND random production edits at the same time.

Eventually one becomes fiction.

And unfortunately it’s usually the Git repo.
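
One way to keep the repo honest is to compare it against reality on a schedule. The sketch below assumes you can already get both sides as simple dictionaries: the intended state from Git and the running state from the device. How you collect them is up to your tooling, and the VLAN data here is invented for the example.

```python
def detect_drift(intended, running):
    """Return every key whose intended value differs from the running value."""
    drift = {}
    for key in sorted(set(intended) | set(running)):
        want = intended.get(key)  # None means "not defined in Git"
        have = running.get(key)   # None means "not present on the device"
        if want != have:
            drift[key] = {"intended": want, "running": have}
    return drift


# Hypothetical data: VLAN id -> name.
intended_vlans = {10: "USERS", 20: "VOICE", 30: "PRINTERS"}
running_vlans = {10: "USERS", 20: "VOIP", 30: "PRINTERS", 317: "TEMP"}

drift = detect_drift(intended_vlans, running_vlans)
for vlan_id, detail in drift.items():
    print(f"VLAN {vlan_id}: Git says {detail['intended']!r}, "
          f"device says {detail['running']!r}")
```

Anything that shows up with an intended value of None was created outside the workflow. That is the direct production edit the repo never heard about.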


The Controller Trap

This is where network engineers get stuck all the time.

They think:

“We use templates in our controller, so we’re doing IaC.”

Not exactly.

Controllers help with:

  • consistency

  • templating

  • standardization

  • centralized management

And those are GOOD things.

But true Infrastructure as Code means:

  • the code repository is the source of truth

  • changes are version controlled

  • deployments are repeatable

  • rollback is possible

  • history is auditable

If the real source of truth is still:

  • clicking in a GUI

  • changing templates manually

  • or editing objects directly in production

…you’re still operating manually.

Just with nicer dashboards.


“Can You Automate This Real Quick?”

This sentence has destroyed more automation projects than bad YAML ever will.

Because many companies want:

  • Netflix-level automation

  • Google-level reliability

  • Amazon-scale deployment pipelines

…but the actual environment looks like:

  • no lab

  • no testing

  • no Git standards

  • no code reviews

  • no rollback process

  • no change validation

  • and one overworked engineer trying to build automation between ticket escalations.

Infrastructure as Code is not a side quest.

It is an operational model.

That means:

  • process changes

  • team buy-in

  • maintenance standards

  • testing standards

  • documentation discipline

  • and leadership support

Without those things, the tooling eventually becomes shelfware.

Or worse: a collection of half-working scripts everyone is scared to run.


The “One Automation Person” Problem

Every company eventually creates:

“The Automation Guy.”

You know the one.

The engineer who:

  • writes all the scripts

  • understands the APIs

  • maintains the pipelines

  • fixes the broken jobs

  • knows where the tokens are stored

  • and becomes the only human capable of safely touching the automation platform.

Congratulations.

You accidentally created a new single point of failure.

Real Infrastructure as Code maturity happens when:

  • workflows are documented

  • repos are shared

  • standards are consistent

  • and automation survives employee turnover

Because if your automation platform dies the moment one engineer takes PTO…

…you didn’t build automation.

You built dependency.


Why Testing Gets Ignored

Testing sounds great until someone asks:

“Can we just push it directly to prod?”

And suddenly everybody becomes very confident.

This is where IaC projects start getting dangerous.

Because automation without testing is basically:

“What if outages happened faster?”

Real IaC environments need:

  • development environments

  • QA validation

  • CI/CD pipelines

  • syntax checks

  • peer review

  • deployment approval

  • rollback plans

Otherwise your deployment process becomes:

  1. Push code

  2. Pray aggressively

  3. Open incident bridge
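
Even a small validation step beats prayer. As a rough sketch, here is the kind of check a pipeline could run before anything reaches a device. The data layout (a list of dictionaries with "id" and "name") is an assumption for this example, not a standard format. The 1 to 4094 range is the usable VLAN ID range under IEEE 802.1Q.

```python
def validate_vlans(vlans):
    """Return a list of human-readable problems found in proposed VLAN definitions."""
    errors = []
    seen_ids = set()
    seen_names = set()
    for entry in vlans:
        vlan_id = entry.get("id")
        name = entry.get("name", "")
        # Check the id first: it must be an integer in the usable 802.1Q range.
        if not isinstance(vlan_id, int) or not 1 <= vlan_id <= 4094:
            errors.append(f"invalid VLAN id: {vlan_id!r}")
        elif vlan_id in seen_ids:
            errors.append(f"duplicate VLAN id: {vlan_id}")
        else:
            seen_ids.add(vlan_id)
        # Then check the name: it must exist and be unique.
        if not name:
            errors.append(f"VLAN {vlan_id!r} has no name")
        elif name in seen_names:
            errors.append(f"duplicate VLAN name: {name}")
        else:
            seen_names.add(name)
    return errors


# Hypothetical proposed change containing three mistakes.
proposed = [
    {"id": 10, "name": "USERS"},
    {"id": 10, "name": "VOICE"},
    {"id": 5000, "name": "GUEST"},
    {"id": 40, "name": ""},
]

errors = validate_vlans(proposed)
for problem in errors:
    print(problem)
```

In a CI job, a non-empty error list would fail the build and block the merge. The typo gets caught in a pull request instead of on an incident bridge.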


Nothing Humbles an Engineer Faster Than Automation

Manual mistakes usually affect:

  • one device

  • one site

  • one VLAN

  • one firewall rule

Automation mistakes affect:

  • ALL OF THEM

Immediately.

Simultaneously.

At machine speed.

Nothing builds character like accidentally disabling BGP on 42 sites because of one variable typo.

That’s why mature IaC teams become obsessed with:

  • validation

  • testing

  • approvals

  • guardrails

  • and staged deployments

Not because they’re slow.

Because they’ve suffered before.
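
Staged deployment is the guardrail that keeps one typo from reaching all 42 sites. The sketch below shows the idea only: apply_change and check_health are hypothetical callables you would supply, and the stage and site names are invented. Real rollouts would also need a rollback step for the failed stage, which is left out here.

```python
def staged_rollout(stages, apply_change, check_health):
    """Apply a change stage by stage and stop at the first failed health check."""
    completed = []
    for stage_name, sites in stages:
        for site in sites:
            apply_change(site)
        failed = [site for site in sites if not check_health(site)]
        if failed:
            # Halt here: later stages are never touched.
            return {"completed": completed, "halted_at": stage_name, "failed": failed}
        completed.append(stage_name)
    return {"completed": completed, "halted_at": None, "failed": []}


# Hypothetical stages: a lab canary, a small pilot, then everything else.
stages = [
    ("canary", ["lab-01"]),
    ("pilot", ["site-02", "site-03"]),
    ("fleet", ["site-04", "site-05", "site-06"]),
]

touched = []


def apply_change(site):
    """Stand-in for the real deployment call; it only records the site."""
    touched.append(site)


def check_health(site):
    """Stand-in health check that pretends site-03 broke after the change."""
    return site != "site-03"


result = staged_rollout(stages, apply_change, check_health)
print(result)
print("sites touched:", touched)
```

In this run the rollout halts at the pilot stage. Three sites saw the change and the fleet never did. That is the entire point of staging: the blast radius is a decision, not an accident.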


The Real Mindset Shift

This is the part most people miss.

Infrastructure as Code is NOT:

  • Terraform

  • YAML

  • Python

  • GitHub

  • Ansible

  • pipelines

  • APIs

Those are just tools.

Infrastructure as Code is really about:

  • discipline

  • repeatability

  • version control

  • operational maturity

  • standardization

  • and removing human randomness from production.

That’s the actual transformation.

The engineers who succeed with IaC are usually not the “best programmers.”

They’re the engineers willing to:

  • standardize

  • document

  • test

  • collaborate

  • and stop treating production like a handwritten art project.


Where Most Teams SHOULD Start

Most teams try to jump straight into:

  • full Terraform deployments

  • self-healing infrastructure

  • dynamic provisioning

  • automated rollback systems

Meanwhile, their switch naming convention still changes from building to building.

Start smaller.

Much smaller.

Start with:

  • firewall rules

  • VLAN definitions

  • templates

  • DNS records

  • object groups

  • ACLs

  • NAT policies

Then:

  • put them in Git

  • review changes in pull requests

  • validate before deployment

  • deploy consistently

That alone puts you ahead of most enterprise environments.

Seriously.
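
Here is roughly what “start small” can look like for VLAN definitions. This is a sketch, not a recommended toolchain: the definitions live as plain data (in practice, a file in Git), and one function renders them the same way every time. The output uses a generic “vlan / name” syntax for illustration only, so adjust it to whatever your platform expects.

```python
# Hypothetical VLAN definitions; in practice these would be loaded from a
# file that lives in Git and changes only through pull requests.
vlans = [
    {"id": 30, "name": "PRINTERS"},
    {"id": 10, "name": "USERS"},
    {"id": 20, "name": "VOICE"},
]


def render_vlan_config(definitions):
    """Render VLAN definitions into config text, always sorted by VLAN id."""
    lines = []
    for vlan in sorted(definitions, key=lambda v: v["id"]):
        lines.append(f"vlan {vlan['id']}")
        lines.append(f" name {vlan['name']}")
    return "\n".join(lines)


config_text = render_vlan_config(vlans)
print(config_text)
```

Because the renderer sorts its input, the same definitions always produce the same text, no matter who edited the file or in what order. That makes diffs in a pull request show real changes instead of noise.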


The Best IaC Advice Nobody Wants to Hear

You do NOT need:

  • Kubernetes

  • a giant DevOps team

  • expensive tooling

  • AI-generated YAML

  • or twelve cloud certifications

You need:

  • consistency

  • process

  • source control

  • documentation

  • and leadership willing to stop bypassing procedures “just this once.”

Because “temporary exceptions” are how configuration drift becomes company culture.


Final Thoughts

Infrastructure as Code is not about becoming a programmer.

It’s about building infrastructure that survives:

  • outages

  • audits

  • growth

  • acquisitions

  • team turnover

  • and your future self at 2:13 AM trying to remember why a production firewall rule says:
    TEMP-DO-NOT-REMOVE-FINAL

The engineers who succeed with IaC are not the ones with the fanciest tools.

They’re the ones willing to change how infrastructure is operated.

That’s the real challenge.

And honestly?

That’s why most projects fail.


Paid Subscriber Section

What Mature IaC Teams Actually Do Differently

Everything above explains why projects fail.

This section is about what successful teams actually do in the real world.
