The 2 AM Wake-Up Call
This kind of incident is more common in the DevOps community than anyone admits publicly. Tight deadlines, complex projects, one unguarded file — and suddenly you're in damage control. Here's mine.
I was refactoring an Apache Airflow project — around 20 DAGs and a Scrapy backend running 300+ spiders. Not a toy project. Real production infrastructure, tight schedule, pressure to move fast.
A test .py file slipped through and got pushed to the remote. Almost immediately, emails started flooding in — info alerts, warnings, key revocation notices. Since that key was shared across other services, those went down too. One unguarded file. A cascade failure.
Then came the part that stung most: walking into my development head's office to explain why I needed a new AWS key created. That conversation was more painful than any billing alert.
In hindsight, some blame goes to the tight schedule, but just as much to my own carelessness. A pre-commit hook or basic secret scanning would have stopped it entirely. Bots scan GitHub continuously, and they found the key within minutes of the push. That day I became obsessed with safeguards. This site is the reference I wish had existed then.
Environment variables are strings. They solve one problem: keeping sensitive config out of your code. The .env file is just a local convenience for loading them during development. Here's the full picture.
The File Hierarchy
| File | Commit? | Purpose |
|---|---|---|
| .env | ❌ Never | Local real values — your actual secrets |
| .env.example | ✅ Always | Template with placeholder values — the contract for your team |
| .env.local | ❌ Never | Machine-specific overrides (Next.js / Vite convention) |
| .env.development | ✅ OK* | Non-secret dev defaults shared by the team |
| .env.production | ❌ Never | Managed by infra/CI — never lives in the repo |
| .env.test | ✅ OK* | CI test values (mock keys only) |
* Only commit these if they contain zero real credentials — test DB, mock API keys, etc.
    # Env files: never commit
    .env
    .env.local
    .env.production
    .env.staging
    .env.*.local
    # DO commit this:
    # .env.example  ← don't add to .gitignore
Already committed secrets? Simply deleting the file is not enough — Git history still contains them. Use git filter-repo or BFG Repo Cleaner to purge history AND immediately rotate every exposed key. Assume they're compromised the moment they touched a remote.
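The pre-commit safeguard that would have caught the leaked key can be sketched in a few lines of Python. This is a minimal illustration, not a substitute for a dedicated scanner: it only looks for strings shaped like AWS access key IDs (`AKIA` followed by 16 uppercase letters or digits), and the names `find_aws_key_ids` and `scan_files` are hypothetical helpers invented for this example.

```python
import re
from pathlib import Path

# Shape of a long-term AWS access key ID: "AKIA" + 16 uppercase letters/digits.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def find_aws_key_ids(text: str) -> list:
    """Return every substring of `text` shaped like an AWS access key ID."""
    return AWS_KEY_ID.findall(text)


def scan_files(paths: list) -> dict:
    """Map each readable file path to the key-like strings found in it."""
    hits = {}
    for p in paths:
        try:
            text = Path(p).read_text(errors="ignore")
        except OSError:
            continue  # missing or unreadable path: nothing to scan
        found = find_aws_key_ids(text)
        if found:
            hits[p] = found
    return hits


# Hypothetical wiring: a pre-commit hook would call scan_files() with the
# staged file paths and abort the commit when the result is non-empty.
```

Note that a pattern check like this also flags documentation placeholders such as AKIAIOSFODNN7EXAMPLE, so a real hook usually needs an allowlist.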
Python, Django & FastAPI
Bare Python Script — python-dotenv
pip install python-dotenv
    import os
    from pathlib import Path

    from dotenv import load_dotenv

    # Always resolve relative to this file, not the CWD
    load_dotenv(dotenv_path=Path(__file__).resolve().parent / ".env")

    DATABASE_URL = os.getenv("DATABASE_URL")
    SECRET_KEY = os.environ["SECRET_KEY"]  # raises KeyError if missing
    DEBUG = os.getenv("DEBUG", "False").lower() in ("true", "1")

    # Fail fast at startup, not silently at runtime
    def validate_env():
        required = ["SECRET_KEY", "DATABASE_URL"]
        missing = [k for k in required if not os.getenv(k)]
        if missing:
            raise RuntimeError(f"Missing required env vars: {missing}")

    validate_env()
Django — django-environ
    import environ
    from pathlib import Path

    BASE_DIR = Path(__file__).resolve().parent.parent

    env = environ.Env(DEBUG=(bool, False), ALLOWED_HOSTS=(list, ["localhost"]))
    environ.Env.read_env(BASE_DIR / ".env")

    SECRET_KEY = env("SECRET_KEY")
    DEBUG = env("DEBUG")
    ALLOWED_HOSTS = env("ALLOWED_HOSTS")

    DATABASES = {"default": env.db("DATABASE_URL")}
    CACHES = {"default": env.cache("REDIS_URL")}

    AWS_ACCESS_KEY_ID = env("AWS_ACCESS_KEY_ID", default=None)
    AWS_SECRET_ACCESS_KEY = env("AWS_SECRET_ACCESS_KEY", default=None)
    STRIPE_SECRET_KEY = env("STRIPE_SECRET_KEY")
    OPENAI_API_KEY = env("OPENAI_API_KEY", default=None)
FastAPI — pydantic-settings (Type-Safe)
    from functools import lru_cache

    from pydantic import AnyUrl, SecretStr
    from pydantic_settings import BaseSettings, SettingsConfigDict

    class Settings(BaseSettings):
        model_config = SettingsConfigDict(env_file=".env", case_sensitive=False)

        app_name: str = "My API"
        debug: bool = False
        database_url: AnyUrl
        secret_key: SecretStr  # masked in logs: **********
        openai_key: SecretStr
        redis_url: str | None = None

    @lru_cache
    def get_settings() -> Settings:
        return Settings()

    # Usage anywhere:
    # from core.config import get_settings
    # settings = get_settings()
SecretStr in Pydantic displays as ********** in logs and tracebacks. Always use it for API keys, passwords, and tokens to prevent accidental exposure.
    # App
    SECRET_KEY=django-insecure-dev-key-changeme
    DEBUG=True
    ALLOWED_HOSTS=localhost,127.0.0.1

    # Database
    DATABASE_URL=postgresql://user:pass@localhost:5432/mydb
    REDIS_URL=redis://localhost:6379/0

    # Third-party
    STRIPE_SECRET_KEY=sk_test_xxxxxxxxxxxxxxxx
    OPENAI_API_KEY=sk-proj-xxxxxxxxxxxxxxxx
    AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
    AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
React, Next.js & Vite
Frontend vars are public. Anything bundled into your JS is visible in DevTools. Only put public-facing keys (Stripe publishable, Sentry DSN, analytics) in frontend env vars. Secret keys belong on the server only.
Next.js — The Prefix Rule
Next.js uses a deliberate two-tier system. Variables prefixed with NEXT_PUBLIC_ are inlined into the browser bundle at build time. All others are server-only.
    # Server-only: NEVER reaches the browser
    DATABASE_URL=postgresql://user:pass@localhost/mydb
    STRIPE_SECRET_KEY=sk_test_xxxxxxx
    OPENAI_API_KEY=sk-proj-xxxxxxx

    # NEXT_PUBLIC_ → inlined into browser bundle at build time
    NEXT_PUBLIC_STRIPE_PK=pk_test_xxxxxxx
    NEXT_PUBLIC_API_URL=http://localhost:8000
    NEXT_PUBLIC_SENTRY_DSN=https://[email protected]/123

    "use server"
    import Stripe from "stripe"

    // Runs on the server; process.env.STRIPE_SECRET_KEY never reaches the browser
    const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!)

    export async function createCheckout(priceId: string) {
      const session = await stripe.checkout.sessions.create({
        line_items: [{ price: priceId, quantity: 1 }],
        mode: "payment",
        success_url: `${process.env.NEXT_PUBLIC_API_URL}/success`,
      })
      return session.url
    }
Next.js — Load Priority Order
Next.js looks up each variable in the following places and stops at the first match: process.env, then .env.$(NODE_ENV).local, then .env.local (skipped when NODE_ENV is test), then .env.$(NODE_ENV), and finally .env.
Vite — VITE_ Prefix
    # VITE_ prefix = exposed to browser
    VITE_API_URL=http://localhost:8000
    VITE_STRIPE_PK=pk_test_xxxxxxx

    # No prefix = build-scripts only (not bundled)
    BUILD_TIMESTAMP=2026-01-01
    // src/vite-env.d.ts
    interface ImportMetaEnv {
      readonly VITE_API_URL: string
      readonly VITE_STRIPE_PK: string
    }

    interface ImportMeta {
      readonly env: ImportMetaEnv
    }

    // Usage: import.meta.env.VITE_API_URL
Prefix Quick Reference
| Framework | Browser Prefix | Access |
|---|---|---|
| Next.js | NEXT_PUBLIC_ | process.env.NEXT_PUBLIC_X |
| Vite | VITE_ | import.meta.env.VITE_X |
| Create React App | REACT_APP_ | process.env.REACT_APP_X |
| Gatsby | GATSBY_ | process.env.GATSBY_X |
| Astro | PUBLIC_ | import.meta.env.PUBLIC_X |
| SvelteKit | PUBLIC_ | $env/static/public |
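Because every prefixed variable in the table ends up in the browser, one cheap guard is to lint env files for public-prefixed names that look like secrets. The sketch below is an assumption-laden illustration: the `SUSPICIOUS` word list and the function name `risky_public_vars` are invented here, and a name-based check will miss secrets stored under innocent-looking names.

```python
import re

# Prefixes that expose a variable to the browser bundle (from the table above).
PUBLIC_PREFIXES = ("NEXT_PUBLIC_", "VITE_", "REACT_APP_", "GATSBY_", "PUBLIC_")

# Assumption: names containing these words usually hold server-side secrets.
SUSPICIOUS = re.compile(r"SECRET|PASSWORD|PRIVATE|TOKEN")


def risky_public_vars(env_text: str) -> list:
    """Return browser-exposed variable names that look like secrets."""
    risky = []
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines that are not KEY=VALUE pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name = line.split("=", 1)[0].strip()
        if name.startswith(PUBLIC_PREFIXES) and SUSPICIOUS.search(name):
            risky.append(name)
    return risky
```

Run against a file containing NEXT_PUBLIC_STRIPE_PK and NEXT_PUBLIC_STRIPE_SECRET_KEY, only the second name is reported.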
Rust — dotenvy & config
    [dependencies]
    dotenvy = "0.15"
    config = "0.14"
    serde = { version = "1", features = ["derive"] }
    secrecy = { version = "0.8", features = ["serde"] }
    anyhow = "1"
    tokio = { version = "1", features = ["full"] }
Simple — dotenvy
    fn main() -> anyhow::Result<()> {
        dotenvy::dotenv().ok(); // no-op if .env doesn't exist

        let db_url: String = std::env::var("DATABASE_URL")
            .expect("DATABASE_URL must be set");

        let port: u16 = std::env::var("PORT")
            .unwrap_or_else(|_| "8000".to_string())
            .trim()
            .parse()
            .expect("PORT must be a valid u16");

        Ok(())
    }
Production — config + secrecy
    use config::{Config, Environment};
    use secrecy::SecretString;
    use serde::Deserialize;

    #[derive(Debug, Deserialize, Clone)]
    pub struct DatabaseConfig {
        pub url: SecretString, // zeroized on drop, hidden in Debug
        pub max_connections: u32,
    }

    #[derive(Debug, Deserialize, Clone)]
    pub struct AppConfig {
        pub environment: String,
        pub port: u16,
        pub database: DatabaseConfig,
        pub secret_key: SecretString,
    }

    pub fn load_config() -> anyhow::Result<AppConfig> {
        dotenvy::dotenv().ok();
        let cfg = Config::builder()
            .add_source(Environment::default().separator("__"))
            .build()?;
        // DATABASE__URL=postgres://... → config.database.url
        // DATABASE__MAX_CONNECTIONS=10
        Ok(cfg.try_deserialize()?)
    }
All Alternatives at a Glance
.env files are fine for local dev. For production, use a proper secrets manager. Here's the full landscape.
| Tool | Best For | Cost | Auto-Rotation |
|---|---|---|---|
| .env files | Local development | Free | ❌ |
| AWS Secrets Manager | AWS workloads, production | ~$0.40/secret/mo | ✅ |
| AWS SSM Parameter Store | AWS, budget-conscious | Free / $0.05 advanced | ❌ |
| HashiCorp Vault | Multi-cloud, enterprise | Free (self-hosted) | ✅ |
| Doppler | Multi-env teams, DX-first | Free tier / $6+/mo | ✅ |
| Infisical | Open-source Doppler alternative | Free (self-hosted) | ✅ |
| GCP Secret Manager | GCP workloads | ~$0.06/secret version/mo + $0.03/10k accesses | ✅ |
| Azure Key Vault | Azure workloads | $0.03/10k ops | ✅ |
| Vercel / Railway / Render env UI | PaaS deployments | Included | ❌ |
    # Install Doppler CLI
    brew install dopplerhq/cli/doppler

    # Authenticate and link project
    doppler login && doppler setup

    # Doppler injects secrets, so no .env file is needed
    doppler run -- python manage.py runserver
    doppler run -- npm run dev
    doppler run -- cargo run
GitHub Actions & Docker
In CI/CD pipelines, there are no .env files. Variables are injected via encrypted secrets from your platform. Here's the right way.
GitHub Actions
    name: Deploy
    on:
      push:
        branches: [main]
    jobs:
      test:
        runs-on: ubuntu-latest
        env:
          # Injected from Settings → Secrets and variables → Actions
          DATABASE_URL: ${{ secrets.TEST_DATABASE_URL }}
          SECRET_KEY: ${{ secrets.DJANGO_SECRET_KEY }}
          STRIPE_SECRET_KEY: ${{ secrets.STRIPE_SECRET_KEY }}
        steps:
          - uses: actions/checkout@v4
          - name: Run tests
            run: python manage.py test
      deploy-prod:
        runs-on: ubuntu-latest  # every job needs its own runner
        environment: production  # manual approval if the environment has required reviewers
        needs: test
        steps:
          - uses: actions/checkout@v4
          - name: Deploy
            env:
              API_KEY: ${{ secrets.API_KEY }}  # production value
            run: |
              docker compose up -d --build
Docker Compose
    services:
      api:
        build: ./backend
        env_file:
          - ./backend/.env  # path relative to compose file
        depends_on:
          db:
            condition: service_healthy
      web:
        build: ./frontend
        env_file:
          - ./frontend/.env.local
        ports: ["3000:3000"]
      db:
        image: postgres:16
        environment:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: localpass
          POSTGRES_DB: mydb
        healthcheck:
          test: ["CMD-SHELL", "pg_isready -U postgres"]
    FROM python:3.12-slim
    WORKDIR /app

    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt
    COPY . .

    # ✅ Non-secret config as ENV is fine
    ENV PYTHONUNBUFFERED=1
    ENV PORT=8000

    # 🚨 Never bake secrets into image layers
    # ENV SECRET_KEY=mykey   ← visible in docker inspect
    # ENV DATABASE_URL=...   ← visible in every derived image

    EXPOSE $PORT
    CMD ["gunicorn", "myapp.wsgi:application"]
Build-time secret in Next.js? NEXT_PUBLIC_ variables must exist at next build time — not just at runtime. Pass them to your CI build step explicitly, otherwise they'll be undefined in the bundle even if set on the server.
10 Bugs You'll Actually Hit
1. Variables are undefined / None everywhere (Common)
Cause 1: load_dotenv() was called after importing a module that already read the variable. Move it to the very top of your entry point, before all other imports.
Cause 2: Wrong path. If your working directory differs from the file location:
load_dotenv(dotenv_path=Path(__file__).resolve().parent / ".env")
Cause 3: A real OS env var with the same name exists and override=False (the default) blocks the .env value. Add override=True.
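Cause 3 is easier to see with a toy loader. The function below is a simplified stand-in written for this illustration, not python-dotenv's implementation (it ignores quoting, escapes, and variable expansion), and `apply_env_file` and `DEMO_API_URL` are hypothetical names. It only mirrors the rule that an existing OS variable wins unless override is requested.

```python
import os


def apply_env_file(text: str, override: bool = False) -> None:
    """Apply KEY=VALUE lines to os.environ, mimicking the override flag."""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key, value = key.strip(), value.strip()
        # Without override, a variable that already exists is left untouched.
        if override or key not in os.environ:
            os.environ[key] = value


# An OS-level variable already exists before the file is read.
os.environ["DEMO_API_URL"] = "https://from-the-shell.example"

apply_env_file("DEMO_API_URL=http://localhost:8000")
kept = os.environ["DEMO_API_URL"]       # the pre-existing value wins

apply_env_file("DEMO_API_URL=http://localhost:8000", override=True)
replaced = os.environ["DEMO_API_URL"]   # the file value wins
```

If a stale export in your shell profile is the culprit, removing it is usually cleaner than forcing override everywhere.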
2. NEXT_PUBLIC_ variable is undefined in production (CI/CD)
These are inlined at build time. If they're not set when you run next build, they'll be undefined even if the server has them. Pass them to your build step:
    - name: Build Next.js
      env:
        NEXT_PUBLIC_API_URL: ${{ secrets.PRODUCTION_API_URL }}
      run: npm run build
3. Docker Compose "env_file not found" error (Docker)
Docker Compose resolves env_file paths relative to the Compose file location, not where you ran the command. Also ensure the file exists first — create an empty one if needed: touch backend/.env.
4. Boolean env vars behave unexpectedly ("False" is truthy) (Bug)
Every env var is a string. os.getenv("DEBUG") returns the string "False" — which is truthy in Python. Always parse booleans explicitly:
    # ❌ Wrong: the string "False" is truthy
    if os.getenv("DEBUG"):
        ...

    # ✅ Correct
    DEBUG = os.getenv("DEBUG", "False").lower() in ("true", "1", "yes")
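The same explicit-parsing idea extends to other types. Here is a small sketch of typed helpers using only the standard library; the names `env_bool` and `env_int` and the accepted spellings are choices made for this example, not part of any library mentioned above. Unlike the one-liner, these reject unrecognized values instead of silently treating them as false.

```python
import os

TRUE_VALUES = {"1", "true", "yes", "on"}
FALSE_VALUES = {"0", "false", "no", "off", ""}


def env_bool(name: str, default: bool = False) -> bool:
    """Read a boolean env var, rejecting values that are neither true nor false."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"{name} must be a boolean-like value, got {raw!r}")


def env_int(name: str, default: int) -> int:
    """Read an integer env var, tolerating surrounding whitespace."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    try:
        return int(raw.strip())
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None
```

Usage would look like `DEBUG = env_bool("DEBUG")` and `PORT = env_int("PORT", 8000)`.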
5. Secrets appear in Sentry / error logs (Security)
Uncaught exceptions can capture local variable frames, including your settings dict. Wrap secrets in SecretStr (Pydantic) or SecretString (Rust secrecy) so they render masked, and keep Sentry from attaching personal data by leaving send_default_pii off:
sentry_sdk.init(dsn=..., send_default_pii=False)
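For finer control, sentry_sdk.init also accepts a before_send callback that receives each event (and a hint) before it is sent. The sketch below shows one way such a hook could mask values by key name. The `SENSITIVE_MARKERS` list and the `scrub` helper are assumptions made for this example, and the event is treated as a plain nested dict.

```python
SENSITIVE_MARKERS = ("secret", "password", "token", "api_key", "authorization")


def scrub(value, key=""):
    """Recursively replace values whose key name looks sensitive."""
    if any(marker in key.lower() for marker in SENSITIVE_MARKERS):
        return "[Filtered]"
    if isinstance(value, dict):
        return {k: scrub(v, str(k)) for k, v in value.items()}
    if isinstance(value, list):
        return [scrub(item, key) for item in value]
    return value


def before_send(event, hint):
    """Hook with the (event, hint) signature; returns a scrubbed copy of the event."""
    return scrub(event)


# Wiring (not executed here):
# sentry_sdk.init(dsn=..., send_default_pii=False, before_send=before_send)
```

Key-based scrubbing cannot see a secret that was interpolated into a message string, which is one more reason to keep secrets wrapped in masking types.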
6. Vite: import.meta.env.VITE_X is undefined (Common)
Two causes: (1) variable name doesn't start with VITE_ — Vite strips all others. (2) You modified .env while the dev server was running — Vite requires a full server restart (not HMR) to pick up new env vars.
7. GitHub Actions: secret shows as *** but app still fails (CI/CD)
*** just means GitHub masked it in logs — the value could still be wrong. Check: (1) Secret name mismatch — secrets.MY_SECRET vs actual GitHub secret name. (2) Secret stored under a specific Environment but the job doesn't set environment:. (3) Trailing whitespace in the stored value. Debug with:
    - run: echo "Length: ${#MY_SECRET}"
      env:
        MY_SECRET: ${{ secrets.MY_SECRET }}
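Length alone does not reveal stray whitespace. A slightly richer check, sketched here in Python with a hypothetical helper named `describe_secret`, reports the shape of the value without ever printing its content.

```python
def describe_secret(value: str) -> dict:
    """Summarize a secret's shape without revealing its content."""
    return {
        "length": len(value),
        "leading_whitespace": value != value.lstrip(),
        "trailing_whitespace": value != value.rstrip(),
        "has_newline": "\n" in value or "\r" in value,
    }
```

In a workflow step this could be called on `os.environ["MY_SECRET"]`; a pasted value with a trailing newline shows up as trailing_whitespace and has_newline both being True.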
8. Works locally, fails in production (ALLOWED_HOSTS / DB host) (Django)
Two classic Django mismatches: (1) ALLOWED_HOSTS doesn't include your production domain. (2) DATABASE_URL uses localhost as host — in Docker, use the service name (db) instead:
    # Local
    DATABASE_URL=postgresql://user:pass@localhost:5432/mydb
    # Docker
    DATABASE_URL=postgresql://user:pass@db:5432/mydb
    # Prod
    DATABASE_URL=postgresql://user:[email protected]:5432/mydb
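A startup check can catch the localhost mismatch before the first query fails. This is a minimal sketch using the standard library's urlsplit; the helper names `db_host` and `points_at_localhost` are invented for the example, and deciding when to call them (for instance only when running inside a container) is left to the application.

```python
from typing import Optional
from urllib.parse import urlsplit

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def db_host(database_url: str) -> Optional[str]:
    """Extract the hostname from a database URL."""
    return urlsplit(database_url).hostname


def points_at_localhost(database_url: str) -> bool:
    """True when the URL targets the local machine, which breaks inside a container."""
    return db_host(database_url) in LOCAL_HOSTS
```

Inside the Compose setup shown earlier, `points_at_localhost(os.environ["DATABASE_URL"])` returning True would be a strong hint that the service name db should be used instead.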
9. Rust: parse panic on env var with whitespace or wrong type (Rust)
A value like PORT=8000 (trailing space) causes .parse::<u16>() to fail. Always trim and handle the error:
    let port: u16 = std::env::var("PORT")
        .unwrap_or_else(|_| "8000".to_string())
        .trim()
        .parse()
        .expect("PORT must be a valid u16");
10. Teammate added a new variable — app breaks silently for others (Team)
The new var is in .env.example but not in teammates' local .env. Add a validation script and run it in CI and as a pre-commit hook:
    import sys

    def keys(path):
        # Collect variable names, skipping blank lines and comments
        with open(path) as f:
            lines = [line.strip() for line in f]
        return {l.split("=", 1)[0].strip() for l in lines
                if l and not l.startswith("#") and "=" in l}

    missing = keys(".env.example") - keys(".env")
    if missing:
        print(f"❌ Missing in .env: {sorted(missing)}")
        sys.exit(1)
    print("✅ .env is complete")
The Complete Checklist
Run through this before every project and every deploy. It's the distilled lesson from a lot of painful incidents.
🔒 Security
- .env is in .gitignore
- .env.example is committed and current
- No secrets in Dockerfile ENV
- No secret keys in frontend bundle
- SecretStr / SecretString in use
- GitHub Push Protection enabled
🚀 Production
- All secrets injected via CI/CD secrets
- Different values for dev / staging / prod
- Startup validation — fail fast on missing vars
- Rotation plan documented for all keys
- Sentry/logging scrubs sensitive fields
- Docker secrets for Swarm containers
👥 Team
- Setup script copies .env.example → .env
- CI compares .env.example vs .env
- README says where to get real values
- Shared secrets in a secrets manager
- Pre-commit hook checks for leaked keys
🚨 Emergency
- Know how to revoke each key immediately
- git filter-repo ready for history purge
- AWS GuardDuty / CloudTrail active
- Canary tokens set as honeypots
- Incident response runbook exists
Enable GitHub Push Protection right now. Settings → Code security → Push protection. It blocks pushes containing AWS keys, Stripe keys, GitHub tokens, and 100+ other secret patterns automatically. Free, 30 seconds to enable. Do it now before you forget.
The bots are scanning GitHub right now. Continuously. They find exposed keys within minutes of a push. The prevention is simple — the recovery is not.