Build an AI-Powered GraphQL API with Python

GraphQL solves the over-fetching and under-fetching problems that plague REST APIs. Pair it with an AI backend, and you get something genuinely powerful: an API that resolves complex queries by reasoning about your data, not just looking up database rows. This tutorial walks through building a production-ready AI-powered GraphQL API using Python, Strawberry GraphQL, and Claude via the EzAI API.

By the end, you'll have a working GraphQL server that accepts typed queries, routes them to Claude for intelligent resolution, and returns structured responses — all with proper error handling and caching.

Why GraphQL + AI?

Traditional GraphQL resolvers map fields to database columns. That works until your data needs interpretation — summarizing documents, answering questions about relationships between entities, or generating content on the fly. An AI resolver can handle all of that through a single schema.

Consider a product catalog API. A REST endpoint returns raw product data. A GraphQL API lets the client pick which fields to fetch. An AI-powered GraphQL API can do both and answer queries like "find products similar to X but under $50" or "summarize the reviews for this item" — all through the same schema.

Figure: request flow from the client through the AI router and schema resolver to the response. Natural language or typed GraphQL queries are routed through AI resolvers.

Project Setup

You'll need Python 3.11+, an EzAI API key, and three packages:

bash

pip install "strawberry-graphql[fastapi]" anthropic uvicorn

Strawberry is a code-first GraphQL library — you define your schema with Python dataclasses instead of SDL strings. It integrates cleanly with FastAPI and async resolvers, which is exactly what we need for AI calls.

Define the Schema

Start with types that represent your domain. We'll build a knowledge base API where users can query documents, ask questions, and get AI-generated summaries:

python

import strawberry
from typing import Optional

@strawberry.type
class Document:
    id: str
    title: str
    content: str
    summary: Optional[str] = None
    tags: list[str] = strawberry.field(default_factory=list)

@strawberry.type
class AIAnswer:
    question: str
    answer: str
    confidence: float
    sources: list[str]

@strawberry.type
class SearchResult:
    documents: list[Document]
    ai_summary: str
    total_count: int

These types define the contract. Clients know exactly what they're getting back — typed fields, not mystery JSON blobs. The AIAnswer type includes a confidence score so callers can decide whether to trust the response or fall back to manual lookup.
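As a rough illustration of how a caller might act on the confidence score, here is a minimal client-side sketch. The `AnswerPayload` dataclass, the `should_trust` and `render` helpers, and the 0.6 threshold are all hypothetical names and values chosen for this example. The score is whatever the model reports, so it is best treated as a hint rather than a calibrated probability.

```python
from dataclasses import dataclass, field


@dataclass
class AnswerPayload:
    """Plain-Python mirror of the AIAnswer GraphQL type (hypothetical helper)."""
    question: str
    answer: str
    confidence: float
    sources: list[str] = field(default_factory=list)


def should_trust(payload: AnswerPayload, threshold: float = 0.6) -> bool:
    """Return True when the confidence clears the caller-chosen threshold.

    The 0.6 default is an arbitrary illustration, not a recommended value.
    """
    return payload.confidence >= threshold


def render(payload: AnswerPayload, threshold: float = 0.6) -> str:
    """Show the AI answer if trusted, otherwise point the user to manual lookup."""
    if should_trust(payload, threshold):
        return payload.answer
    return f"Low confidence ({payload.confidence:.2f}); please verify manually."
```

A fallback answer with a confidence of 0.0 always lands in the manual-lookup branch, which is the behavior the error handling later in this tutorial relies on.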

Build the AI Resolver

The core of the system is a resolver that sends queries to Claude through EzAI and parses the response into your GraphQL types:

python

import anthropic
import json

client = anthropic.AsyncAnthropic(
    api_key="sk-your-ezai-key",
    base_url="https://ezaiapi.com"
)

async def ai_resolve(query: str, context: str = "") -> dict:
    """Send a structured query to Claude and parse JSON response."""
    prompt = f"""You are a knowledge base API. Answer the query using
the provided context. Return valid JSON matching this schema:
{{"answer": str, "confidence": float 0-1, "sources": [str]}}

Context: {context}
Query: {query}"""

    response = await client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=2048,
        messages=[{"role": "user", "content": prompt}]
    )

    text = response.content[0].text
    # Extract JSON from response (handles ```json and bare ``` code fences)
    if "```" in text:
        text = text.split("```")[1].removeprefix("json")
    return json.loads(text.strip())

Two things to note here. First, the base_url points to EzAI, which routes the request to Claude at lower cost. Second, we parse JSON from the response defensively — Claude sometimes wraps output in markdown code fences, and the splitter handles that.
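If the model adds prose around the JSON or returns values outside the expected ranges, a slightly more defensive parser can help. The sketch below uses only the standard library. `extract_json` and `normalize_answer` are hypothetical helper names, not part of the Anthropic SDK or Strawberry, and the fallback order is one reasonable choice among several.

```python
import json
import re

# Matches a fenced block with or without a "json" language tag.
_FENCE_RE = re.compile(r"```(?:json)?\s*(.*?)```", re.DOTALL | re.IGNORECASE)


def extract_json(text: str) -> dict:
    """Parse the first JSON object found in a model reply.

    Tries, in order: a fenced block, the raw text, then the span from the
    first "{" to the last "}". Raises json.JSONDecodeError if all fail.
    """
    candidates = []
    match = _FENCE_RE.search(text)
    if match:
        candidates.append(match.group(1))
    candidates.append(text)
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidates.append(text[start:end + 1])

    last_error = None
    for candidate in candidates:
        try:
            parsed = json.loads(candidate.strip())
        except json.JSONDecodeError as exc:
            last_error = exc
            continue
        if isinstance(parsed, dict):
            return parsed
    if last_error is not None:
        raise last_error
    raise json.JSONDecodeError("no JSON object found", text, 0)


def normalize_answer(data: dict) -> dict:
    """Coerce a parsed reply into the answer/confidence/sources shape."""
    try:
        confidence = float(data.get("confidence", 0.0))
    except (TypeError, ValueError):
        confidence = 0.0
    sources = data.get("sources") or []
    if not isinstance(sources, list):
        sources = [sources]
    return {
        "answer": str(data.get("answer", "")),
        "confidence": min(1.0, max(0.0, confidence)),  # clamp to [0, 1]
        "sources": [str(s) for s in sources],
    }
```

Inside ai_resolve, the final lines could then become `return normalize_answer(extract_json(text))`, so resolvers never see a missing key or an out-of-range confidence.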

Wire Up the Query Root

Connect your resolvers to the GraphQL schema. Each resolver is an async function that calls Claude through the ai_resolve helper:

python

from hashlib import sha256

# Simple in-memory cache for repeated queries
_cache: dict[str, dict] = {}

def cache_key(query: str) -> str:
    return sha256(query.encode()).hexdigest()[:16]

@strawberry.type
class Query:
    @strawberry.field
    async def ask(self, question: str) -> AIAnswer:
        key = cache_key(question)
        if key in _cache:
            data = _cache[key]
        else:
            data = await ai_resolve(question)
            _cache[key] = data

        return AIAnswer(
            question=question,
            answer=data["answer"],
            confidence=data["confidence"],
            sources=data.get("sources", [])
        )

    @strawberry.field
    async def summarize(self, text: str, max_words: int = 100) -> str:
        response = await client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=512,
            messages=[{
                "role": "user",
                "content": f"Summarize in {max_words} words:\n\n{text}"
            }]
        )
        return response.content[0].text

schema = strawberry.Schema(query=Query)

The ask resolver includes a hash-based cache. Identical questions return cached results instead of burning tokens on repeated Claude calls. In production, swap the dict for Redis with a TTL — the caching guide covers this in depth.
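Before reaching for Redis, the expiry behavior can be prototyped in memory. The following is a minimal sketch, assuming a single process and no concurrency concerns. `TTLCache`, its `max_entries` bound, and the whitespace and case normalization in `key_for` are illustrative choices, not something the tutorial's code requires.

```python
import time
from hashlib import sha256
from typing import Any, Callable, Optional


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, max_entries: int = 1024,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.max_entries = max_entries
        self.clock = clock  # injectable so tests can control time
        self._store: dict[str, tuple[float, Any]] = {}

    @staticmethod
    def key_for(question: str) -> str:
        # Normalize case and whitespace so trivially different phrasings
        # of the same question share one entry.
        normalized = " ".join(question.lower().split())
        return sha256(normalized.encode()).hexdigest()

    def get(self, question: str) -> Optional[Any]:
        key = self.key_for(question)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: drop and report a miss
            return None
        return value

    def set(self, question: str, value: Any) -> None:
        key = self.key_for(question)
        if key not in self._store and len(self._store) >= self.max_entries:
            # Evict the oldest inserted entry (dicts keep insertion order).
            self._store.pop(next(iter(self._store)))
        self._store[key] = (self.clock() + self.ttl, value)

    def __len__(self) -> int:
        return len(self._store)
```

In the ask resolver, `cache.get(question)` and `cache.set(question, data)` would replace the dict lookups. Bounding the entry count keeps memory use predictable when clients send many distinct questions.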

Add FastAPI and Launch

Strawberry plugs directly into FastAPI with one import. Add a health check endpoint and the server is ready to run:

python

from fastapi import FastAPI
from strawberry.fastapi import GraphQLRouter

app = FastAPI(title="AI GraphQL API")
graphql_app = GraphQLRouter(schema)

app.include_router(graphql_app, prefix="/graphql")

@app.get("/health")
async def health():
    return {"status": "ok", "cache_size": len(_cache)}

# Run (assumes this file is saved as app.py): uvicorn app:app --host 0.0.0.0 --port 8000

Hit http://localhost:8000/graphql and you'll get Strawberry's built-in GraphiQL playground. Try a query:

graphql

{
  ask(question: "What are the best practices for API rate limiting?") {
    answer
    confidence
    sources
  }
}
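Outside the playground, the same query can be sent as a plain HTTP POST with a JSON body. The sketch below uses only the standard library. `build_graphql_request`, `run_query`, and `ASK_QUERY` are hypothetical names for this example, and the URL assumes the server is running locally on port 8000 as shown above.

```python
import json
import urllib.request
from typing import Any, Optional

# Parameterized version of the playground query, using a GraphQL variable.
ASK_QUERY = """
query Ask($question: String!) {
  ask(question: $question) { answer confidence sources }
}
"""


def build_graphql_request(url: str, query: str,
                          variables: Optional[dict[str, Any]] = None
                          ) -> urllib.request.Request:
    """Build a POST request carrying a GraphQL query as a JSON body."""
    body = json.dumps({"query": query, "variables": variables or {}}).encode()
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(url: str, query: str,
              variables: Optional[dict[str, Any]] = None,
              timeout: float = 60.0) -> dict[str, Any]:
    """Send the query and return the "data" part of the JSON response."""
    request = build_graphql_request(url, query, variables)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.loads(response.read().decode())
    if payload.get("errors"):
        # GraphQL reports resolver failures in an "errors" list, often
        # alongside an HTTP 200 status, so check it explicitly.
        raise RuntimeError(payload["errors"])
    return payload["data"]
```

With the server running, `run_query("http://localhost:8000/graphql", ASK_QUERY, {"question": "..."})` should return a dict with an `ask` key. Passing the question as a variable avoids escaping quotes inside the query string.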

Error Handling That Actually Works

AI calls fail. Networks time out. Models hallucinate invalid JSON. Your API needs to handle all of that without crashing:

python

import asyncio
import json

from anthropic import APITimeoutError, RateLimitError

def fallback(message: str) -> dict:
    """Build a zero-confidence answer so clients still get a valid shape."""
    return {"answer": message, "confidence": 0.0, "sources": []}

async def ai_resolve_safe(query: str, retries: int = 2) -> dict:
    for attempt in range(retries + 1):
        last_attempt = attempt == retries
        try:
            return await ai_resolve(query)
        except RateLimitError:
            # Back off exponentially (1s, 2s, ...) before the next attempt
            if not last_attempt:
                await asyncio.sleep(2 ** attempt)
        except APITimeoutError:
            if last_attempt:
                return fallback("Request timed out. Try a simpler query.")
        except json.JSONDecodeError:
            # Model returned non-JSON; retry with the same prompt
            if last_attempt:
                return fallback("Could not parse AI response.")
    return fallback("Service unavailable.")

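The retry logic uses exponential backoff for rate limits and graceful degradation for timeouts. Clients always get a valid AIAnswer shape back and can check the confidence field to tell a real response from a fallback. To put it to use, call ai_resolve_safe instead of ai_resolve in the ask resolver, and avoid caching zero-confidence fallbacks so a transient failure is not served repeatedly. For deeper patterns on retry strategies, see the retry strategies guide.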
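When many clients hit a rate limit at once, fixed exponential delays can make them all retry at the same moment. A common mitigation is to randomize the delay. The sketch below shows a generic async retry helper with "full jitter" backoff. `backoff_delay` and `retry_async` are hypothetical names, and the base and cap values are illustrative rather than tuned.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Full-jitter delay: a random value in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


async def retry_async(
    call: Callable[[], Awaitable[T]],
    fallback: T,
    retryable: tuple[type[BaseException], ...],
    retries: int = 2,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Run call() up to retries + 1 times, then return fallback.

    Sleeps between attempts only, never after the final one.
    """
    for attempt in range(retries + 1):
        try:
            return await call()
        except retryable:
            if attempt < retries:
                await sleep(backoff_delay(attempt))
    return fallback
```

Applied to this tutorial, the call might look like `await retry_async(lambda: ai_resolve(question), fallback_answer, (RateLimitError, APITimeoutError, json.JSONDecodeError))`, where `fallback_answer` is a zero-confidence dict of your choosing. The injectable `sleep` parameter exists so tests can run without real waiting.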

Going to Production

Before you ship this, add these four things:

Authentication middleware: Strawberry supports permission classes that check API keys or JWTs before resolvers execute
Query depth limiting: Prevent deeply nested queries from fanning out into many expensive Claude calls. Strawberry's QueryDepthLimiter extension rejects queries nested deeper than a configured max_depth (the separate MaxTokensLimiter extension caps the size of the GraphQL document itself)
Cost tracking: Log response.usage.input_tokens and response.usage.output_tokens per request so you can monitor your AI spending
Swap the cache: Replace the in-memory dict with Redis. Set a TTL of 1-24 hours depending on how fresh your data needs to be
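For the cost tracking item, a small accumulator is often enough to start with. This is a minimal sketch: `UsageTracker` is a hypothetical helper, and the per-million-token prices are placeholders to replace with your provider's actual rates.

```python
from dataclasses import dataclass


@dataclass
class UsageTracker:
    """Accumulates token counts across requests.

    Prices are placeholders per million tokens; substitute your provider's
    actual rates.
    """
    input_price_per_mtok: float
    output_price_per_mtok: float
    input_tokens: int = 0
    output_tokens: int = 0
    requests: int = 0

    def record(self, input_tokens: int, output_tokens: int) -> None:
        """Add one request's token usage to the running totals."""
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens
        self.requests += 1

    @property
    def estimated_cost(self) -> float:
        """Estimated spend in the same currency unit as the prices."""
        return (self.input_tokens * self.input_price_per_mtok
                + self.output_tokens * self.output_price_per_mtok) / 1_000_000
```

After each `client.messages.create` call, `tracker.record(response.usage.input_tokens, response.usage.output_tokens)` would update the totals, and the /health endpoint could expose `tracker.estimated_cost` alongside the cache size.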

The full pattern (GraphQL schema, AI resolvers, caching, error handling) extends naturally to larger deployments. The same stack suits internal knowledge bases, customer-facing search APIs, and document processing pipelines.

What's Next

You now have a working AI-powered GraphQL API. From here, consider adding streaming responses for long-form answers, or wiring in tool use so Claude can call your database resolvers directly. The GraphQL type system keeps everything predictable even as the AI backend grows more capable.

Get your API key from the EzAI dashboard and start building.
