Build an AI Database Schema Designer with Python

Every application starts with a database schema, and getting it right early saves weeks of refactoring later. The typical workflow — open pgAdmin, stare at an ER diagram, argue about naming conventions for 45 minutes — doesn't scale. What if you could describe your app in plain English and get back production-ready DDL with proper indexes, constraints, and foreign keys?

That's exactly what we're building: a Python tool that takes natural language requirements, sends them to Claude via EzAI API, and outputs complete database schemas ready for psql or your ORM of choice.

Why AI for Schema Design?

Manual schema design has three recurring pain points. First, naming inconsistencies — one developer uses user_id, another uses userId, and someone inevitably creates a column called data. Second, missing indexes on columns you'll obviously query by. Third, relationship gaps where you forget a junction table or set the wrong cascade behavior.

Claude excels at this because it's seen millions of database schemas across every domain. It knows that an e-commerce app needs a separate order_items table (not a JSON column on orders), that email columns should be UNIQUE, and that deleted_at is better than a hard delete for most SaaS apps.

Project Setup

You need Python 3.9+ and the Anthropic SDK. Install the dependencies:

bash

pip install anthropic pydantic

Set your EzAI API key as an environment variable:

bash

export ANTHROPIC_API_KEY="sk-your-ezai-key"

The Schema Designer Class

The core idea is straightforward: define a system prompt that enforces PostgreSQL best practices, send the user's requirements, and parse the structured output. We use structured JSON output to get clean, parseable results every time.

python

import json
import anthropic
from pydantic import BaseModel

class SchemaDesigner:
    def __init__(self):
        self.client = anthropic.Anthropic(
            base_url="https://ezaiapi.com"
        )
        self.system_prompt = """You are a senior database architect.
Given app requirements in plain English, output a JSON object with:
- "tables": array of table definitions
- "indexes": array of CREATE INDEX statements
- "notes": array of design decision explanations

Each table has: "name", "columns" (array of {name, type, constraints}),
"primary_key", "foreign_keys" (array of {column, references_table,
references_column, on_delete}).

Rules:
- Use snake_case for all names
- Always include id (SERIAL/BIGSERIAL), created_at, updated_at
- Add soft delete (deleted_at) for user-facing entities
- Use TIMESTAMPTZ, never TIMESTAMP
- Add indexes on foreign keys and commonly queried columns
- Use DECIMAL for money, never FLOAT
- Add CHECK constraints where appropriate
- Include junction tables for many-to-many relationships
Output ONLY valid JSON. No markdown, no explanation outside the JSON."""

    def design(self, requirements: str, dialect: str = "postgresql") -> dict:
        msg = self.client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=4096,
            system=self.system_prompt,
            messages=[{
                "role": "user",
                "content": f"Design a {dialect} schema for: {requirements}"
            }]
        )
        return json.loads(msg.content[0].text)

AI Schema Designer pipeline: natural language input to validated SQL output

The four-stage pipeline: input → AI analysis → schema generation → validation

Converting JSON to DDL

The raw JSON from Claude is structured but not executable. We need a converter that turns it into actual SQL statements:

python

def to_ddl(schema: dict) -> str:
    """Convert the AI-generated schema JSON to executable DDL."""
    statements = []

    for table in schema["tables"]:
        cols = []
        for col in table["columns"]:
            line = f"    {col['name']} {col['type']}"
            if col.get("constraints"):
                line += f" {col['constraints']}"
            cols.append(line)

        # Add foreign key constraints
        for fk in table.get("foreign_keys", []):
            ref = f"    FOREIGN KEY ({fk['column']}) "
            ref += f"REFERENCES {fk['references_table']}({fk['references_column']})"
            if fk.get("on_delete"):
                ref += f" ON DELETE {fk['on_delete']}"
            cols.append(ref)

        sql = f"CREATE TABLE {table['name']} (\n"
        sql += ",\n".join(cols)
        sql += "\n);"
        statements.append(sql)

    # Add indexes
    for idx in schema.get("indexes", []):
        statements.append(idx)

    return "\n\n".join(statements)

Running It End-to-End

Here's the full workflow — describe your app, get back SQL you can paste directly into psql:

python

designer = SchemaDesigner()

schema = designer.design("""
A project management app where:
- Users belong to organizations
- Organizations have multiple projects
- Projects contain tasks with assignees and due dates
- Tasks can have comments and file attachments
- Users can watch tasks for notifications
""")

ddl = to_ddl(schema)
print(ddl)

# Save to file
with open("schema.sql", "w") as f:
    f.write(ddl)

# Print design notes
for note in schema.get("notes", []):
    print(f"💡 {note}")

Claude will generate a complete schema with tables like organizations, users, org_members (junction table), projects, tasks, task_comments, task_attachments, and task_watchers. Each table gets proper timestamps, soft deletes, and foreign keys with appropriate cascade rules.

Adding Schema Validation

AI output isn't always perfect. Add a validation layer that catches common issues before you run the DDL against your database:

python

def validate_schema(schema: dict) -> list[str]:
    """Return a list of warnings for the generated schema."""
    warnings = []
    table_names = {t["name"] for t in schema["tables"]}

    for table in schema["tables"]:
        col_names = [c["name"] for c in table["columns"]]

        # Check for timestamps
        if "created_at" not in col_names:
            warnings.append(
                f"⚠️  {table['name']}: missing created_at column"
            )

        # Check foreign key references exist
        for fk in table.get("foreign_keys", []):
            if fk["references_table"] not in table_names:
                warnings.append(
                    f"❌ {table['name']}.{fk['column']} references "
                    f"non-existent table {fk['references_table']}"
                )

        # Check for FLOAT on money columns
        for col in table["columns"]:
            if "FLOAT" in col["type"].upper() and \
               any(w in col["name"] for w in ["price", "cost", "amount", "total"]):
                warnings.append(
                    f"⚠️  {table['name']}.{col['name']}: use DECIMAL for money, not FLOAT"
                )

    return warnings

Generating SQLAlchemy Models

DDL is great for direct database work, but most Python apps use an ORM. Here's how to generate SQLAlchemy models from the same schema:

python

def to_sqlalchemy(schema: dict) -> str:
    """Generate SQLAlchemy 2.0 model code from schema JSON."""
    designer = SchemaDesigner()

    msg = designer.client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=4096,
        messages=[{
            "role": "user",
            "content": f"""Convert this database schema JSON to SQLAlchemy 2.0 models
using mapped_column and DeclarativeBase. Include relationships,
type annotations, and __repr__ methods.

Schema: {json.dumps(schema, indent=2)}

Output ONLY the Python code, no markdown."""
        }]
    )
    return msg.content[0].text

# Usage
models_code = to_sqlalchemy(schema)
with open("models.py", "w") as f:
    f.write(models_code)

This gives you a complete models.py with typed columns, relationship definitions, and proper imports — ready to drop into a FastAPI project.

Iterating on the Schema

Real schema design is iterative. You start with a rough idea, see the output, then refine. Add a method that takes the existing schema and a modification request:

python

def refine(self, current_schema: dict, changes: str) -> dict:
    """Modify an existing schema based on new requirements."""
    msg = self.client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=4096,
        system=self.system_prompt,
        messages=[
            {"role": "user", "content": "Here is my current schema:"},
            {"role": "assistant", "content": json.dumps(current_schema)},
            {"role": "user", "content": f"Modify the schema: {changes}. Return the complete updated schema JSON."}
        ]
    )
    return json.loads(msg.content[0].text)

# Example: add multi-tenancy
updated = designer.refine(schema, """
Add row-level multi-tenancy:
- Every table gets a tenant_id column
- Add a tenants table with name, slug, plan
- All queries should scope by tenant_id
""")

This conversational approach mirrors how you'd work with a senior DBA — present the current state, describe what you need, get back the full updated schema.

Cost and Performance

A typical schema design request uses about 500 input tokens and 1,500-3,000 output tokens. With Claude Sonnet 4.5 through EzAI, that's roughly $0.01-0.03 per design iteration — significantly cheaper than the hour you'd spend doing it manually.

For larger schemas (20+ tables), consider splitting the requirements into domain groups (auth, billing, content) and designing them separately. This keeps each request focused and reduces the chance of hallucinated relationships between unrelated entities.

What's Next?

You now have a tool that turns plain English into production database schemas. From here, you could:

Add migration file generation with Alembic or Flyway
Build a CLI wrapper for quick schema iteration from the terminal
Integrate with your CI pipeline to validate schema changes against existing data
Add support for multiple dialects (MySQL, SQLite) with a single flag

The full source code fits in a single file under 200 lines. The AI handles the hard part — knowing which column types to use, when to add indexes, and how to structure relationships — while your code handles the boring parts: parsing, formatting, and file output.

Build an AI Database Schema Designer with Python

Why AI for Schema Design?

Project Setup

The Schema Designer Class

Converting JSON to DDL

Running It End-to-End

Adding Schema Validation

Generating SQLAlchemy Models

Iterating on the Schema

Cost and Performance

What's Next?

Related Posts

Build an AI SQL Query Generator with Python

How to Get Structured JSON Output from AI APIs