Build an AI Environment Variable Manager with Python

Every production outage has that one common root cause nobody wants to admit: a missing environment variable. DATABASE_URL was set in staging but not prod. STRIPE_SECRET_KEY had a trailing newline. REDIS_HOST pointed to localhost. You know the drill. In this tutorial, we'll build a Python tool that uses Claude AI to scan your codebase, extract every environment variable reference, generate validated .env templates, and flag configuration drift between environments — before it takes down your Friday deploy.

Why This Beats grep

Sure, you could grep -r "os.environ" . and call it a day. But grep doesn't understand context. It won't tell you that API_TIMEOUT expects an integer, that LOG_LEVEL should be one of debug|info|warn|error, or that your CORS_ORIGINS needs to be a comma-separated URL list in production. Claude reads your code the way a senior engineer does — it infers types, default values, required vs. optional status, and even catches vars that are referenced but never documented.

Project Setup

You need Python 3.10+ and the Anthropic SDK. We'll route everything through EzAI API so you get access to Claude at reduced rates:

bash

pip install anthropic python-dotenv pathspec

Create a file called envmanager.py. This is the full scanner:

python

import os, json, re, sys
from pathlib import Path
import anthropic

client = anthropic.Anthropic(
    api_key=os.environ["ANTHROPIC_API_KEY"],
    base_url="https://ezaiapi.com",
)

# Patterns that reference env vars across languages
PATTERNS = [
    re.compile(r'os\.environ\.get\(["\'](\w+)["\']'),
    re.compile(r'os\.environ\[["\'](\w+)["\']'),
    re.compile(r'os\.getenv\(["\'](\w+)["\']'),
    re.compile(r'process\.env\.(\w+)'),
    re.compile(r'env\(["\'](\w+)["\']'),
    re.compile(r'\$\{(\w+)\}'),
]
SKIP = {".git", "node_modules", "__pycache__", ".venv", "venv"}
EXTS = {".py", ".js", ".ts", ".jsx", ".tsx", ".go", ".rs", ".yml", ".yaml", ".toml"}

def scan_codebase(root: str) -> dict:
    found = {}
    for path in Path(root).rglob("*"):
        if any(p in path.parts for p in SKIP):
            continue
        if path.suffix not in EXTS or not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except:
            continue
        for pattern in PATTERNS:
            for match in pattern.finditer(text):
                var = match.group(1)
                if var not in found:
                    found[var] = {"files": [], "snippets": []}
                rel = str(path.relative_to(root))
                if rel not in found[var]["files"]:
                    found[var]["files"].append(rel)
                start = max(0, match.start() - 80)
                end = min(len(text), match.end() + 80)
                found[var]["snippets"].append(
                    f"{rel}: ...{text[start:end].strip()}..."
                )
    return found

The scanner walks your project tree, matches env var patterns across Python, JavaScript, TypeScript, Go, Rust, and YAML files, then collects surrounding code context for each variable. This context is what makes the AI analysis actually useful — Claude can see how each variable is used, not just that it exists.

AI-Powered Analysis

Now the interesting part. We feed the scan results to Claude and ask it to infer types, defaults, and validation rules:

Environment variable manager workflow: Scan → Analyze → Generate → Validate

Scan → AI Analysis → Template Generation → Cross-Environment Validation

python

def analyze_vars(scan_results: dict) -> list:
    # Truncate snippets to fit context window
    compact = {}
    for var, info in scan_results.items():
        compact[var] = {
            "files": info["files"][:5],
            "snippets": info["snippets"][:3],
        }

    msg = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=4096,
        messages=[{
            "role": "user",
            "content": f"""Analyze these environment variables from a codebase scan.
For each variable, return a JSON array of objects with:
- name: variable name
- required: boolean
- type: string|integer|boolean|url|email|json|csv
- default: suggested default value or null
- description: one-line description
- validation: regex pattern or enum values if applicable
- category: group name (database, auth, cache, api, logging, etc.)

Variables and their usage context:
{json.dumps(compact, indent=2)}

Return ONLY valid JSON, no markdown fences."""
        }],
    )
    return json.loads(msg.content[0].text)

Claude reads the code snippets and infers that DATABASE_URL is a required URL, DEBUG is an optional boolean defaulting to false, and REDIS_MAX_CONNECTIONS is an integer that should probably be between 5 and 100. This is context-aware analysis that no regex tool can replicate.

Generate .env Templates

With the AI analysis in hand, generating a documented .env.example is straightforward:

python

def generate_env_template(analysis: list, env: str = "development") -> str:
    # Group by category
    groups = {}
    for var in analysis:
        cat = var.get("category", "general")
        groups.setdefault(cat, []).append(var)

    lines = [
        f"# ==========================================",
        f"# Environment: {env.upper()}",
        f"# Generated by AI Env Manager (EzAI + Claude)",
        f"# ==========================================",
        "",
    ]

    for category, vars_list in sorted(groups.items()):
        lines.append(f"# --- {category.upper()} ---")
        for v in sorted(vars_list, key=lambda x: x["name"]):
            req = "REQUIRED" if v["required"] else "optional"
            lines.append(f"# {v['description']} [{v['type']}] ({req})")
            default = v.get("default") or ""
            lines.append(f"{v['name']}={default}")
            lines.append("")
        lines.append("")

    return "\n".join(lines)

The output looks like a proper .env.example that your team can actually use — grouped by category, annotated with types and requirements, and pre-filled with sensible defaults.

Cross-Environment Validation

The real power move is validating existing .env files against the AI-inferred schema. This catches the drift that grep misses:

python

from dotenv import dotenv_values

def validate_env(env_path: str, analysis: list) -> list:
    current = dotenv_values(env_path)
    issues = []
    schema = {v["name"]: v for v in analysis}

    # Check for missing required vars
    for name, spec in schema.items():
        if spec["required"] and name not in current:
            issues.append({
                "level": "error",
                "var": name,
                "msg": f"Missing required variable (type: {spec['type']})",
            })
        elif name in current and spec.get("validation"):
            val = current[name]
            if not re.match(spec["validation"], val):
                issues.append({
                    "level": "warning",
                    "var": name,
                    "msg": f"Value '{val}' doesn't match pattern: {spec['validation']}",
                })

    # Check for undocumented vars (in .env but not in code)
    for name in current:
        if name not in schema:
            issues.append({
                "level": "info",
                "var": name,
                "msg": "Variable in .env but not referenced in code (stale?)",
            })

    return issues

Run this across your .env.development, .env.staging, and .env.production files to get a clear picture of configuration drift. The tool catches three classes of problems: missing required vars (errors), invalid values (warnings), and stale vars that exist in your env but aren't referenced anywhere in code (info).

Putting It Together

Here's the CLI entry point that ties everything together:

python

if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    print(f"🔍 Scanning {root}...")

    scan = scan_codebase(root)
    print(f"📋 Found {len(scan)} unique env vars")

    print("🤖 Analyzing with Claude...")
    analysis = analyze_vars(scan)

    # Generate template
    template = generate_env_template(analysis)
    Path(".env.example").write_text(template)
    print("✅ Generated .env.example")

    # Validate existing .env if present
    if Path(".env").exists():
        issues = validate_env(".env", analysis)
        if issues:
            print(f"\n⚠️  Found {len(issues)} issue(s):")
            for i in issues:
                icon = {"error": "❌", "warning": "⚠️", "info": "ℹ️"}
                print(f"  {icon[i['level']]} {i['var']}: {i['msg']}")
        else:
            print("✅ All env vars valid!")
    else:
        print("💡 No .env found — use .env.example as your starting point")

Run it against any project:

bash

export ANTHROPIC_API_KEY="sk-your-ezai-key"
python envmanager.py /path/to/your/project

Add It to CI

Drop this into your GitHub Actions workflow to catch missing env vars before they hit production. The script exits with code 1 when it finds errors, so your pipeline will fail loud and early:

yaml

# .github/workflows/env-check.yml
name: Env Var Check
on: [push, pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install anthropic python-dotenv pathspec
      - run: python envmanager.py .
        env:
          ANTHROPIC_API_KEY: ${{ secrets.EZAI_API_KEY }}

Every PR now gets automatically checked for environment variable consistency. New code that references an undocumented env var will surface immediately instead of causing a 2 AM pager alert three weeks later.

Cost and Performance

A typical scan of a mid-size project (50-100 files, 30-40 env vars) costs roughly $0.02-0.05 per run with Claude Sonnet through EzAI's pricing. That's about $1.50/month if you run it on every push. Compare that to one engineer spending 30 minutes debugging a missing env var in production ($50+ in salary alone), and the ROI is obvious.

For larger monorepos, batch the scan by directory and send multiple smaller requests instead of one massive payload. Claude Sonnet handles 200K tokens of context, but smaller focused requests produce better analysis and cost less.

What's Next

This foundation opens up several extensions worth building:

Secret rotation detection — flag env vars that haven't been rotated in N days by comparing git blame timestamps
Multi-repo drift — scan multiple services and find shared vars with inconsistent values across your microservices
Terraform/Kubernetes sync — generate ConfigMap manifests or tfvars files from the same AI analysis
Slack alerts — post a summary to your team channel when the schema changes between deploys

The core pattern here — scan code context, feed it to Claude, get structured output — works for any kind of codebase analysis. Environment variables are just one practical starting point. For more AI-powered dev tools, check out our guides on building an AI secrets scanner and building an AI code linter.

Build an AI Environment Variable Manager with Python

Why This Beats grep

Project Setup

AI-Powered Analysis

Generate .env Templates

Cross-Environment Validation

Putting It Together

Add It to CI

Cost and Performance

What's Next

Related Posts

Build an AI Secrets Scanner for Your Codebase

How to Secure Your AI API Keys