The Frameworks

A field guide to the tools you'll actually use to build with AI. From low-level model libraries to high-level agent orchestrators — what each one is good at and when to skip it.


Two Layers

There are roughly two layers of AI tooling. Pick the one that matches your problem.

Model layer

Train, fine-tune, and run models. PyTorch, TensorFlow, Hugging Face Transformers. You're here if you're a researcher or you need to ship something custom.

App layer

Build apps on top of someone else's model (usually via API). LangChain, LlamaIndex, Claude Agent SDK, CrewAI. You're here if you're a developer building an AI feature.

App-Layer Frameworks

These are the libraries you'll reach for when building agents and LLM-powered apps.

🦜

LangChain

What: The original LLM app framework. Chains, agents, memory, tool integrations — all the building blocks.

Best for: Prototyping fast, gluing together many services, RAG pipelines.

Trade-off: Big surface area, lots of abstractions. Easy to start, sometimes hard to debug.
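The core "chain" idea is just function composition: prompt in, model out, parse, repeat. Here's a self-contained toy sketch of that pattern in plain Python with a stubbed model call. This is not LangChain's actual API (its expression language composes runnables with `|` in a similar spirit), just an illustration of what the abstraction buys you.

```python
# A toy illustration of the "chain" idea: small steps composed with |.
# Self-contained sketch, not LangChain's real API. The "model" is a stub.

class Step:
    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # Compose: run this step, feed its result into the next one.
        return Step(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

# Stub "model": in a real chain this would be an LLM API call.
prompt = Step(lambda topic: f"Write one sentence about {topic}.")
model = Step(lambda p: f"[model answer to: {p}]")
parse = Step(lambda text: text.strip())

chain = prompt | model | parse
print(chain.invoke("vector databases"))
```

Swapping the stub for a real API call (and the parse step for real output handling) is essentially what the framework packages up, along with retries, streaming, and integrations.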

🦙

LlamaIndex

What: A data framework for LLM apps. Specializes in connecting your private data to a model — ingestion, indexing, retrieval.

Best for: RAG over documents, knowledge bases, structured data Q&A.

Trade-off: Less of a general agent framework — leans heavily on the retrieval angle.
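The ingestion-indexing-retrieval pipeline that LlamaIndex automates is worth seeing in miniature. This sketch uses a toy bag-of-words "embedding" and cosine similarity so it stays self-contained; a real pipeline would use a proper embedding model and vector store.

```python
# A minimal sketch of what a RAG framework automates: index documents,
# retrieve the closest one to a query, stuff it into a prompt.
# The "embedding" here is a toy bag-of-words vector, for illustration only.
from collections import Counter
import math

docs = [
    "The refund policy allows returns within 30 days.",
    "Shipping to Japan takes 5 to 7 business days.",
    "Support is available by email around the clock.",
]

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

index = [(doc, embed(doc)) for doc in docs]          # "indexing"

def retrieve(query: str, k: int = 1) -> list:        # "retrieval"
    qv = embed(query)
    ranked = sorted(index, key=lambda d: cosine(qv, d[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

context = retrieve("How long does shipping take?")[0]
prompt = f"Answer using this context:\n{context}\n\nQ: How long does shipping take?"
print(context)
```

The final `prompt` is what actually gets sent to the model. Everything the framework adds (chunking strategies, hybrid search, re-ranking, metadata filters) is refinement on this loop.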

Claude Agent SDK

What: Anthropic's official SDK for building agents on Claude. Tool use, memory, sub-agents, MCP — first-class.

Best for: Production-grade agents that need careful reasoning and long context.

Trade-off: Tied to Claude. (If that's a feature, great.)

OpenAI Assistants API

What: OpenAI's hosted way to build assistants with tools, file search, and code interpreter built in.

Best for: Quick-to-ship assistants without standing up your own infra.

Trade-off: Locked to OpenAI. Less control over the loop than an SDK approach.

CrewAI

What: A framework for building teams of agents that collaborate on a goal — each with a role, tools, and personality.

Best for: Complex tasks that decompose naturally into specialist roles.

Trade-off: Multi-agent overhead. Often a single well-prompted agent is enough.
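The role-based pattern itself is simple: each "agent" is a role-specific prompt wrapped around the same model, and the crew passes each output to the next role. Here's a self-contained toy sketch with a stubbed model (not CrewAI's actual API) that shows the shape of it.

```python
# A toy sketch of the role-based multi-agent pattern, with a stubbed
# model call. Not CrewAI's real API; just the underlying idea.

def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call; echoes the start of its prompt.
    return f"<output for: {prompt[:40]}>"

class Agent:
    def __init__(self, role: str):
        self.role = role

    def work(self, task: str) -> str:
        return fake_llm(f"You are the {self.role}. {task}")

def run_crew(agents, task):
    # Sequential hand-off: each agent works on the previous output.
    result = task
    for agent in agents:
        result = agent.work(result)
    return result

crew = [Agent("researcher"), Agent("writer"), Agent("editor")]
print(run_crew(crew, "Produce a short report on AI frameworks."))
```

Seen this way, the trade-off is clear: every hand-off is another model call, another place to lose context, another thing to debug. That's why a single well-prompted agent is often the better start.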

AutoGen

What: Microsoft's multi-agent framework. Agents talk to each other in a conversation to solve problems.

Best for: Research-y multi-agent setups, code-generation pipelines, and human-in-the-loop flows.

Trade-off: Same as CrewAI — complexity has a cost.

Haystack

What: A modular framework for search-heavy LLM apps. RAG, document QA, semantic search.

Best for: Production search and retrieval pipelines you want full control over.

LangGraph

What: LangChain's graph-based way to build agent workflows. Nodes and edges instead of opaque chains.

Best for: Agents with branching logic, retries, and human checkpoints.
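The nodes-and-edges model is easy to picture as plain Python: each node transforms a shared state and names the next node, and a conditional edge is just a branch on that state. This is a self-contained sketch of the idea, not LangGraph's actual API (which has typed state, checkpointing, and more).

```python
# A toy sketch of a graph-based agent workflow: nodes mutate shared
# state and return the name of the next node. Not LangGraph's real API.

def draft(state):
    state["text"] = f"draft of {state['task']}"
    state["revs"] = 0
    return "review"

def review(state):
    # Conditional edge: branch on the current state.
    return "done" if state["revs"] >= 1 else "revise"

def revise(state):
    state["text"] += " (revised)"
    state["revs"] += 1
    return "review"  # loop back: retries are just edges

nodes = {"draft": draft, "review": review, "revise": revise}

def run_graph(state, start="draft"):
    node = start
    while node != "done":
        node = nodes[node](state)
    return state

print(run_graph({"task": "weekly report"}))
```

The win over an opaque chain: you can see the loop, cap the retries, and insert a human-approval node anywhere an edge is.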

Model-Layer Tools

🔥

PyTorch

The deep learning framework that won. Almost all modern AI research happens here.

🤗

Hugging Face

The GitHub of models. transformers, datasets, diffusers, and 500k+ open models.

vLLM / Ollama

Run open models locally or on your own infra. vLLM is for production throughput; Ollama is for laptops.

Quick Comparison

Framework           Best At                              Language     Layer
LangChain           General LLM apps, RAG, prototyping   Python, JS   App
LlamaIndex          Retrieval / RAG over private data    Python, JS   App
Claude Agent SDK    Production agents on Claude          Python, TS   App
OpenAI Assistants   Hosted assistants on OpenAI          Any (HTTP)   App
CrewAI              Multi-agent role-based teams         Python       App
AutoGen             Multi-agent conversations            Python       App
LangGraph           Stateful agent workflows             Python, JS   App
Haystack            Search-heavy pipelines               Python       App
PyTorch             Training & research                  Python       Model
Hugging Face        Open models & datasets               Python       Model

A Tiny Agent in Code

Here's roughly what building an agent looks like with the Claude Agent SDK. (Pseudocode — but very close.)

from claude_agent_sdk import Agent, tool

@tool
def get_weather(city: str) -> dict:
    """Get the current weather for a city."""
    return weather_api.fetch(city)

@tool
def send_email(to: str, subject: str, body: str):
    """Send an email."""
    mailer.send(to, subject, body)

agent = Agent(
    model="claude-opus-4-6",
    tools=[get_weather, send_email],
    system="You are a helpful travel assistant."
)

agent.run("Check the weather in Tokyo and email me a packing list.")

That's it. The agent figures out the order of operations on its own.
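What "figures it out on its own" means under the hood is a loop: the model either requests a tool call or gives a final answer; the runtime executes the tool and feeds the result back. Here's that loop as self-contained Python, with a scripted stand-in for the model so it runs without an API key. Every agent SDK is, at its core, a hardened version of this.

```python
# The agent loop an SDK runs for you, roughly. The "model" is scripted
# here so the sketch is self-contained; a real loop sends the growing
# transcript to an LLM at each turn.

TOOLS = {
    "get_weather": lambda city: {"city": city, "forecast": "rainy, 14C"},
    "send_email": lambda to, subject, body: f"sent '{subject}' to {to}",
}

def scripted_model(transcript):
    # Stand-in for the LLM's decision at each turn: pick a tool, or finish.
    if not any(name == "get_weather" for name, _ in transcript):
        return ("tool", "get_weather", {"city": "Tokyo"})
    if not any(name == "send_email" for name, _ in transcript):
        return ("tool", "send_email",
                {"to": "me@example.com", "subject": "Packing list",
                 "body": "Bring an umbrella."})
    return ("final", "Done: checked the weather and emailed a packing list.")

def agent_loop(model):
    transcript = []
    while True:
        decision = model(transcript)
        if decision[0] == "final":
            return decision[1]
        _, name, args = decision
        transcript.append((name, TOOLS[name](**args)))  # execute the tool

print(agent_loop(scripted_model))
```

The SDK's value is everything around this loop: streaming, parallel tool calls, context management, error handling, and permissioning.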

Which Should I Use?

Honest opinion

  • Building a small AI feature? Skip frameworks. Just call the LLM API directly. Frameworks shine when you have at least a few moving parts.
  • RAG over documents? Start with LlamaIndex.
  • General LLM app with lots of integrations? LangChain / LangGraph.
  • A production agent? Claude Agent SDK or write a tight loop yourself.
  • Multi-agent system? Try CrewAI, but ask first whether one agent could do it.
  • Self-hosting open models? Hugging Face + vLLM (server) or Ollama (laptop).
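For the "skip frameworks" option, calling the API directly really is just one HTTPS request. Here's a stdlib-only sketch targeting Anthropic's public Messages API endpoint and headers; the model name is a placeholder, and the actual send is commented out because it needs a real `ANTHROPIC_API_KEY`.

```python
# Calling a model API directly, no framework: one HTTPS request.
# Endpoint and headers follow Anthropic's public Messages API; swap in
# your provider of choice. The model name is a placeholder.
import json
import os
import urllib.request

def build_request(prompt: str, model: str = "claude-sonnet-4-5"):
    payload = {
        "model": model,
        "max_tokens": 512,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://api.anthropic.com/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "x-api-key": os.environ.get("ANTHROPIC_API_KEY", ""),
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
    )

req = build_request("Write a packing list for rainy Tokyo.")
# body = urllib.request.urlopen(req).read()  # uncomment with a real key
print(req.full_url)
```

In practice you'd use the provider's official client library instead of raw `urllib`, but the point stands: for one feature and one model, this is the whole integration.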

Up Next

The Glossary

Every weird acronym defined in plain English. Searchable, no scrolling.