Security Configuration · Security Champions

Module 7a · Security Configuration

Security Configuration

Defaults are not secure. Drift is a defect. Part 1 of 2 — Configuration & Defaults.

01 / 25 · Cover

Your journey · Program map

Your journey so far

9 modules. One toolkit. You are at Module 7.

Context

Your toolkit · So far

What you already have

Modules 1–6

M1–2

Champion role, sprint workflow, business language

The foundation for all Champion work

M3

Risk vocabulary: OWASP Top 10, five lenses

Pattern recognition across architecture layers

M4

Threat finding: STRIDE → DREAD → tickets

Systematic discovery and prioritization

M5

Supply chain: three surfaces, SBOM, SLSA

Verified code, verified dependencies, verified pipeline

M6

Secret management: lifecycle, rotation, health check

Secrets tracked, rotated, and audited quarterly

Context

Module 7 · Where it all runs

Where it all runs

Supply chain delivers code (M5). Code contains secrets (M6). Now: where does it all run?

Configuration is where the most common security failures happen. Security Misconfiguration rose to #2 in the OWASP Top 10:2025, up from #5 in the 2021 edition.

This module shows you how to lock it down:

1

Secure configuration baseline

A known-good state you can measure drift against

2

Access control: who has access to what

Least privilege, not convenience-driven permissions

3

Key habit: config drift = defect

A bug in Jira, not a note in a wiki

Context

Act 1 · Why it happens

Why does it keep happening?

You know not to hardcode credentials. Your team has done security training. You've seen the post-mortems. And yet — hard-coded API keys still show up in production. Default configs ship unchanged. Ports stay open for months after a "temporary" exception.

This isn't about ignorance. It's about the conditions under which experienced people make predictable mistakes.

The question isn't "do you know better?" — you do. The question is "what makes smart people skip the step they know matters?"

05 / 25

Act 1 · Root causes

The five forces

Five conditions reliably produce misconfigurations. Not one of them involves incompetence.

Pressure. The deployment needs to go live. The client demo is tomorrow. The hotfix must ship now. Security becomes a trade-off: "we'll harden it after this hot period." The hot period never ends. You never come back.

Complexity. KMS started as a simple service for protecting small secrets. Today, one unchecked checkbox can leave your keys unrotated and your system exposed. Complexity doesn't grow year to year; it grows month to month. And no human tracks all the implications of every option.

Manual changes. Production is down. You whitelist an IP directly in the console. Add a security group rule via CLI. Grant explicit permissions to unblock a deploy. It works. Nobody records the change. It stays forever.

Identity debt. Permissions accumulate. Accounts that were needed six months ago still have contributor access. Service accounts multiply; in most organizations, there are more service accounts than people. Nobody audits them because nobody owns them.

No baseline. A standard means something written down. If you have no baseline, you can't tell what changed. If you can't tell what changed, you can't catch misconfigurations. They're invisible until they're exploited.

06 / 25

Act 1 · Failure modes

When it goes wrong

Misconfigurations appear in two distinct moments.

From scratch: setting up a new deployment, a new pipeline, a new environment. The wrong assumption is made at the beginning, and everything built on top of it inherits the flaw.

During operation: changing a running system under pressure. This is the more dangerous mode, because you're changing something that already works. The "just this once" change becomes permanent because nobody tracks it.

Configuration from scratch is a planning failure. Configuration during operation is a process failure. Each requires different controls.

07 / 25

The eternal debate

"It should work."
— "No, it should work securely."
— "No. It should work. Now. We'll fix it later."

Later never comes.

08 / 25

Time to think · Five forces

FIVE FORCES BEHIND MISCONFIGURATIONS

🔥

Pressure

🧩

Complexity

✋

Manual changes

👤

Identity debt

📏

No baseline

They compound

Which of these is strongest in your current project?

Pause and reflect

Act 2 · Baselines

The default trap

Every system ships with defaults optimized for quick setup, not security. Redis: no authentication, no TLS, plain-text protocol. MySQL (older releases): anonymous user enabled, test database accessible. Kubernetes (before v1.20): the API server's insecure port, 8080 by default, bypassed all authentication and authorization.

One of the earliest hardening scripts was mysql_secure_installation, a simple command that removed anonymous users, dropped the test database, and set a root password. It was primitive. But it established a principle: the first thing you do after install is harden.

Today, hardening is more complex. But the principle hasn't changed: defaults are a starting point, not a destination. Every environment needs a specific baseline — and that baseline needs to be written down.

A baseline is like rails — it helps you move faster. But when you drift from the rails, that drift is a defect. Track it. Record it in Jira. Don't accept it as "that's just how it is."

Multiple baseline frameworks exist: CIS Benchmarks, NIST guidelines, STIGs, and dev-sec.io hardening profiles. Start with the CIS Benchmark for your cloud provider. CIS Benchmarks are practical, widely adopted, and map directly to specific configuration checks. AWS, Azure, and GCP each have a dedicated CIS Benchmark with step-by-step hardening guidance.

10 / 25

★ Best practice

Treat every baseline drift as a defect

This is the single most effective habit for managing configuration security. When you have a baseline — any baseline — and something deviates from it, don't file it as "known risk" or "accepted deviation."

File it as a defect. Put it in Jira. Assign it. Track it.

Why this works: it makes drift visible — all deviations are in one place. It creates accountability — someone owns the fix. It creates a timeline — you can see when drift accelerated. And it enables periodic review — "show me all configuration defects from Q3."

Without this habit, drift is invisible. It compounds silently until an auditor or an attacker finds it.

You don't need a perfect baseline. You need a written baseline and the discipline to track every deviation from it.

Champion's takeaway

The fix isn't better documentation — it's automation that enforces the baseline. If a human has to remember to change a default, it won't happen consistently.
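A minimal sketch of the comparison step behind "automation that enforces the baseline", assuming both the written baseline and the live settings can be exported as flat key/value dictionaries. The setting names, the `Drift` record, and `find_drift` are hypothetical and exist only for illustration; each returned item is a candidate defect ticket.

```python
from dataclasses import dataclass
from typing import Any

# Sentinel meaning "this setting does not exist on that side".
MISSING = "<missing>"


@dataclass(frozen=True)
class Drift:
    """One deviation from the written baseline (one candidate defect ticket)."""
    setting: str
    expected: Any
    actual: Any


def find_drift(baseline: dict[str, Any], actual: dict[str, Any]) -> list[Drift]:
    """Return every setting whose live value differs from the baseline.

    Settings present on only one side are reported too, because an
    undeclared setting is also drift.
    """
    drifts = []
    for key in sorted(baseline.keys() | actual.keys()):
        expected = baseline.get(key, MISSING)
        observed = actual.get(key, MISSING)
        if expected != observed:
            drifts.append(Drift(key, expected, observed))
    return drifts


# Hypothetical baseline and live configuration, for illustration only.
BASELINE = {
    "db.encrypted_at_rest": True,
    "ssh.password_auth": False,
    "tls.min_version": "1.2",
}
LIVE = {
    "db.encrypted_at_rest": True,
    "ssh.password_auth": True,   # someone re-enabled password logins
    "tls.min_version": "1.2",
    "debug.enabled": True,       # setting the baseline never declared
}

if __name__ == "__main__":
    for d in find_drift(BASELINE, LIVE):
        print(f"DEFECT: {d.setting}: expected {d.expected!r}, found {d.actual!r}")
```

Run on a schedule or in the pipeline, a check like this removes the need for a human to remember the baseline; the human only triages the defects it reports.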

11 / 25 · Best Practice

Act 2 · Credentials

The obvious problem nobody solves

Hard-coded credentials in source code. In environment templates. In container images. In Slack messages. It's the most obvious security issue in the industry — and it persists in a significant number of production systems.

The reasons are human, not technical. Time pressure — "I need to test this integration right now. I'll move the key to a vault later." Copy-paste errors — "I was sure I pasted into the export command, not the config file." Two-screen mistakes — switching between terminal and editor, losing track of what went where. Proof-of-concept inertia — the project started as a PoC, the key was hardcoded for speed, the PoC became production.

Policies and talks don't prevent hardcoded credentials. In 11+ years of experience, only two things have proven to work: pipeline tooling that blocks the merge, and a culture where "no secrets leave the laptop" is a reflex, not a rule. Tools like Gitleaks, TruffleHog, and Snyk must be in the pipeline before the team starts delivering, not added retroactively when someone finds a key in production.
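To make "pipeline tooling that blocks the merge" concrete, here is a toy scanner. It is not how Gitleaks or TruffleHog work internally, and it is no substitute for them; the three patterns are illustrative and deliberately incomplete, and the function names are hypothetical. The sample key in the usage note is the placeholder access key ID that AWS uses in its documentation.

```python
import re

# Illustrative, deliberately incomplete rules. Real scanners ship far
# larger rule sets plus entropy checks and git-history scanning.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key-header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "hardcoded-assignment": re.compile(
        r"\b(?:password|secret|api_key|token)\s*=\s*[\"'][^\"']{8,}[\"']",
        re.IGNORECASE,
    ),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every line that matches a rule."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings


def should_block_merge(text: str) -> bool:
    """A pipeline gate would fail the build when any finding exists."""
    return bool(scan_text(text))
```

For example, `scan_text('aws_key = "AKIAIOSFODNN7EXAMPLE"')` reports a finding on line 1 under the `aws-access-key-id` rule, and `should_block_merge` would return `True` for that change.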

12 / 25

★ Best practice

Install secret scanning before the first line of code

The investment problem is real: new projects, especially proof-of-concepts, feel like they can't justify pipeline tooling. "We'll add security later, when there's budget."

This is where bad habits form. A developer hardcodes a key in week one. By month three, there are seventeen hardcoded secrets across four services. Retrofitting is exponentially harder than preventing.

The fix is cultural and technical. Start every project with .env files that are excluded from version control; never put secrets in code, not even for testing. Give developers no production access, and separate environments from day one. Install pipeline scanning (Gitleaks, TruffleHog) as part of project bootstrap, not as a security add-on. Make it a reflex: "no secrets leave the laptop."
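On the application side, "never put secrets in code" can be as small as the sketch below. It assumes the secret reaches the process through an environment variable (for example, loaded from a local .env file that is kept out of version control). `require_secret`, `MissingSecretError`, and the variable name in the usage note are hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def require_secret(name: str) -> str:
    """Read a secret from the process environment, failing loudly if unset.

    There is intentionally no default value: a hardcoded fallback would
    put a secret (or a fake one that reaches production) back into code.
    """
    value = os.environ.get(name, "")
    if not value:
        raise MissingSecretError(
            f"{name} is not set; refusing to fall back to a default"
        )
    return value
```

A caller would write something like `api_key = require_secret("PAYMENTS_API_KEY")`. Failing at startup is a design choice: a missing secret becomes an immediate, visible error instead of a silent fallback.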

People, process, technology — all three. But technology must be in place before delivery starts.

13 / 25 · Best Practice

Knowledge check · midnight hotfix

Knowledge check

Your team finishes a hotfix at 2am. A DevOps engineer whitelists a partner's IP address directly in the AWS console — not via Terraform — to get the integration working. The fix is successful. What happens next?

D. Manual changes create state drift AND are forgotten. The Terraform state doesn't know about the console change, so the next apply may silently remove it — or worse, the change persists undocumented for months. This is why manual changes must be tracked as defects and reconciled with IaC.

14 / 25 · Quiz

Time to think · Drift

CONFIGURATION DRIFT

Drift compounds. Each "just this once" adds up until nobody knows the real configuration.

Drift visualization

Act 3 · Privileges & encryption

The wildcard problem

In audits, over-privileged IAM roles show up more often in AWS than in Azure. The reason isn't that AWS engineers are less careful; it's that the platforms have different design philosophies.

Azure invested heavily in a group-based role model with predefined, well-scoped built-in roles. The UI makes it intuitive to assign a named role to a group. Creating a custom role with specific permissions takes time, but the path is clear.

AWS IAM is more powerful but more complex. JSON-based policies, wildcards that are easy to add and hard to audit, roles that accumulate permissions over time. The result: during audits, wildcard permissions and unrotated secrets appear frequently.

The same pattern shows up in Kubernetes — service accounts created with broad permissions "to get things working" and never scoped down. Tools that help: Prowler and Scout Suite can scan your cloud accounts and identify over-privileged roles, unrotated secrets, and policy violations. But the tool only finds the problem — fixing it requires process.
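As an illustration of what "wildcards that are hard to audit" means in practice, the sketch below walks an AWS IAM policy document (the standard JSON shape with `Statement`, `Effect`, `Action`, and `Resource`) and lists Allow statements that grant `*` or a service-wide action such as `s3:*`. It is a simplification, not what Prowler or Scout Suite do: it ignores `NotAction`, conditions, and partial wildcards, and some actions legitimately require `Resource: "*"`. The function names are hypothetical.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """IAM accepts a single value or a list in most fields; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_wildcards(policy: dict[str, Any]) -> list[str]:
    """Return human-readable findings for Allow statements that use wildcards."""
    findings = []
    for index, stmt in enumerate(_as_list(policy.get("Statement"))):
        if stmt.get("Effect") != "Allow":
            continue  # Deny statements with wildcards restrict access; skip them.
        label = stmt.get("Sid", f"statement #{index}")
        for action in _as_list(stmt.get("Action")):
            # Flags "*" and service-wide grants such as "s3:*".
            if action == "*" or action.endswith(":*"):
                findings.append(f"{label}: wildcard action {action!r}")
        for resource in _as_list(stmt.get("Resource")):
            if resource == "*":
                findings.append(f"{label}: wildcard resource '*'")
    return findings
```

Each finding still needs a human decision: scope it down, or record why the wildcard is required.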

16 / 25

Act 3 · Encryption at rest

The checkbox nobody checked

A question for you: are all your production databases encrypted at rest right now?

If you're honest, the answer is probably "I'm not sure." And that uncertainty is the problem.

Databases end up unencrypted in production for predictable reasons. Someone forgot to tick the encryption checkbox during launch — or forgot the config attribute in Terraform. The database went to production. Data flowed in. Users connected. Everything worked fine.

Six months later, a compliance audit asks: "Is this database encrypted at rest?" Now you have a problem. Encrypting a running production database requires either stopping production, launching an encrypted replica and migrating, or encrypting record by record. All options are expensive. The checkbox at launch time cost zero.

If your database is encrypted at rest, your backups must be too — with at least the same level of security. An unencrypted backup of an encrypted database is a hole. Check: does your backup configuration match your database configuration?
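The question "does your backup configuration match your database configuration?" can be turned into a repeatable check. This is a sketch under the assumption that you can export an inventory of databases with two boolean facts each; the `Database` record and its field names are hypothetical, not any provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Database:
    """Hypothetical inventory record; field names are illustrative."""
    name: str
    encrypted_at_rest: bool
    backups_encrypted: bool


def encryption_gaps(inventory: list[Database]) -> list[str]:
    """List every database or backup that is stored unencrypted."""
    findings = []
    for db in inventory:
        if not db.encrypted_at_rest:
            findings.append(f"{db.name}: database is not encrypted at rest")
        if not db.backups_encrypted:
            findings.append(f"{db.name}: backups are not encrypted")
    return findings
```

An empty result replaces "I'm not sure" with an answer you can show an auditor; a non-empty one is a list of defects to file.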

17 / 25

★ Best practice

The cost multiplier — catch misconfigurations early

The cost of fixing a misconfiguration increases exponentially the later you catch it.

Pipeline check — Checkov/Terrascan flags missing encryption in your Terraform file. Cost: seconds. Fix: one attribute in code.

Staging catch — QA or security review notices the database isn't encrypted. Cost: hours. Fix: recreate the database, migrate data, update connections.

Production retrofit — compliance audit discovers the gap six months in. Cost: days to weeks. Fix: maintenance windows, migration planning, resource allocation, risk documentation.

The ratio is roughly 1× : 10× : 100×. Pipeline checks aren't overhead. They're the cheapest security investment you'll ever make.

18 / 25 · Best Practice

Knowledge check · encryption

Knowledge check

A production database was launched 6 months ago without encryption at rest. A compliance review now requires it. What's the realistic path forward?

B. Retroactive encryption of a running production database is possible but requires significant effort: a parallel encrypted database, data migration, connection updates, and usually some form of maintenance window. Option A is theoretically possible for some managed services, but rarely "zero downtime" in practice. Option C doesn't satisfy "encryption at rest" requirements. This is why pipeline-stage checks matter — the fix at launch time costs zero.

19 / 25 · Quiz

Time to think · Cost multiplier

THE COST MULTIPLIER

1×

Pipeline

Seconds to fix. One config attribute.

10×

Staging

Hours. Recreate + migrate.

100×

Production

Days to weeks. Maintenance windows. Migration planning.

Every stage you wait multiplies the cost. Catch it in the pipeline.

Cost visualization

Act 3 · Ports & exceptions

The exception that became permanent

Security groups and firewall rules accumulate exceptions the same way IAM roles accumulate permissions. A vendor needs access — you add a rule. An integration requires a port — you open it. A partner needs a temporary IP whitelist — you add it.

Each exception is justified at the time. Few are removed when the reason expires.
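One way to keep exceptions from outliving their reason is to record an expiry date with each one and review the list automatically. The sketch below assumes such a register exists; the `RuleException` record, its fields, and the sample entries are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RuleException:
    """Hypothetical register entry for one firewall or security group exception."""
    rule: str
    reason: str
    owner: str
    expires_on: date


def expired_exceptions(
    exceptions: list[RuleException], today: date
) -> list[RuleException]:
    """Return exceptions whose expiry date has passed and that should be removed."""
    return [e for e in exceptions if e.expires_on < today]


# Illustrative data only (the IP is from a documentation-reserved range).
REGISTER = [
    RuleException("allow 203.0.113.10/32 -> tcp/443", "partner pilot", "team-a", date(2025, 1, 31)),
    RuleException("allow vendor VPN -> tcp/8443", "vendor support", "team-b", date(2025, 6, 30)),
]
```

Calling `expired_exceptions(REGISTER, date.today())` in a scheduled job gives every overdue exception an owner and a date, which is what turns "temporary" into something enforceable.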

The challenge is compounded by infrastructure-as-code. Even when you manage security groups via Terraform, there's always the option to change them via CLI or the cloud console. And when someone does, the IaC state and the actual cloud state diverge silently.

When auditing security groups or IAM rules, check both sources at the same time: the Terraform/Ansible files AND the current cloud state. If they don't match, someone made a manual change. That change is either a tracked defect (good) or an invisible drift (dangerous).
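The "check both sources" audit reduces to a set comparison once both sides are in the same shape. The sketch below assumes you have already extracted ingress rules from the IaC files and from the cloud API into `(protocol, port, cidr)` tuples; that extraction step, the `Rule` type, and `compare_rules` are hypothetical and not part of Terraform or any cloud SDK.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """One ingress rule, normalized so both sources can be compared."""
    protocol: str
    port: int
    cidr: str


def compare_rules(declared: set[Rule], actual: set[Rule]) -> dict[str, list[Rule]]:
    """Split the difference between IaC and the live cloud state.

    "unmanaged": live in the cloud but absent from code (a manual change).
    "missing":   declared in code but not live (removed or never applied).
    """
    return {
        "unmanaged": sorted(actual - declared),
        "missing": sorted(declared - actual),
    }
```

Every entry under `unmanaged` is a manual change: either it already has a defect ticket, or it is the invisible drift described above.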

21 / 25

Act 3 · Transit encryption

The other encryption

Encryption at rest gets most of the attention. But encryption in transit — TLS between services, between databases and applications, between APIs — introduces its own complexity: certificate management.

Certificates expire. They get revoked. They need to be renewed before expiration, which requires automation or very reliable manual processes. A certificate renewal failure at 3am can take down production just as effectively as a security breach.

The answer is automation — certificate management tools, auto-renewal, monitoring for upcoming expirations. But the first step is knowing what certificates you have and when they expire. Many teams discover they don't have that inventory until something breaks.
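A sketch of the inventory-and-expiry step, assuming you already hold each certificate's expiry timestamp in the `notAfter` text format that Python's `ssl.SSLSocket.getpeercert()` returns. Collecting those timestamps from live endpoints is left out, and the function names and the 30-day default are illustrative choices.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days left before a certificate expires (negative if already expired).

    `not_after` uses the certificate time format, e.g. "Jun 15 12:00:00 2030 GMT".
    `now` must be timezone-aware.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def expiring_soon(
    inventory: dict[str, str], now: datetime, warn_days: int = 30
) -> list[tuple[str, int]]:
    """Return (name, days_left) for certificates inside the warning window, soonest first."""
    report = [
        (name, days_until_expiry(not_after, now))
        for name, not_after in inventory.items()
    ]
    return sorted(
        (item for item in report if item[1] <= warn_days),
        key=lambda item: item[1],
    )
```

Fed from a certificate inventory and run daily with `datetime.now(timezone.utc)`, a check like this moves the 3am renewal failure to a ticket raised weeks earlier.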

22 / 25

Part 1 · Summary

What you covered

01

Five forces produce misconfigurations — pressure, complexity, manual changes, identity debt, and missing baselines. None of them involve incompetence.

02

Defaults are starting points, not destinations. Every environment needs a specific, written baseline.

03

Drift from baseline is a defect. Track it in Jira. Don't accept it as "known risk."

04

Hard-coded credentials persist because of conditions, not ignorance. Pipeline scanning must be in place before the first delivery.

05

The cost of fixing a misconfiguration multiplies roughly 10× with each stage: 1× in the pipeline, 10× in staging, 100× in production. Catch it in the pipeline.

23 / 25 · Summary

Next · Part 2

You now understand why configurations go wrong and how to establish baselines. In Part 2: how to control who has access to what — and how to enforce it automatically.

25 / 25 · Bridge