A field guide to the tools you'll actually use to build with AI. From low-level model libraries to high-level agent orchestrators — what each one is good at and when to skip it.
There are roughly two layers of AI tooling. Pick the one that matches your problem.
**The model layer.** Train, fine-tune, and run models: PyTorch, TensorFlow, Hugging Face Transformers. You're here if you're a researcher or you need to ship something custom.
**The app layer.** Build apps on top of someone else's model (usually via an API): LangChain, LlamaIndex, Claude Agent SDK, CrewAI. You're here if you're a developer building an AI feature.
These are the libraries you'll reach for when building agents and LLM-powered apps.
### LangChain

What: The original LLM app framework. Chains, agents, memory, tool integrations — all the building blocks.
Best for: Prototyping fast, gluing together many services, RAG pipelines.
Trade-off: Big surface area, lots of abstractions. Easy to start, sometimes hard to debug.
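The core idea LangChain is built around can be sketched in plain Python. This is not LangChain's actual API (its real composition operator is `prompt | model | parser` over Runnable objects, backed by a live chat model); it's a stub-model sketch of the chain pattern itself.

```python
# A plain-Python sketch of the "chain" pattern: prompt template -> model
# -> output parser, composed into a single callable. The model is a stub
# returning a canned completion; a real chain swaps in an LLM call.

def prompt(inputs: dict) -> str:
    return f"Summarize in one word: {inputs['text']}"

def fake_model(prompt_text: str) -> str:
    # Stand-in for the LLM call.
    return "  Brevity.  "

def parser(completion: str) -> str:
    return completion.strip()

def chain(*steps):
    """Compose steps left-to-right, like LangChain's prompt | model | parser."""
    def run(x):
        for step in steps:
            x = step(x)
        return x
    return run

summarize = chain(prompt, fake_model, parser)
print(summarize({"text": "A very long paragraph..."}))  # Brevity.
```

Every piece (prompt, model, parser, retriever) shares one calling convention, which is what makes the gluing-things-together story work — and also where the debugging pain comes from when a chain gets deep.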
### LlamaIndex

What: A data framework for LLM apps. Specializes in connecting your private data to a model — ingestion, indexing, retrieval.
Best for: RAG over documents, knowledge bases, structured data Q&A.
Trade-off: Less of a general agent framework — leans heavily on the retrieval angle.
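The ingest → index → retrieve loop that LlamaIndex automates can be shown with a toy stand-in. This is not LlamaIndex's API (which uses embeddings and classes like `VectorStoreIndex`); the sketch scores documents by naive keyword overlap just to make the shape concrete.

```python
# Toy ingest -> index -> retrieve loop. Real RAG ranks by embedding
# similarity; this stand-in ranks by query-term overlap.

docs = {
    "refunds.md": "Refunds are issued within 14 days of purchase.",
    "shipping.md": "Standard shipping takes 3-5 business days.",
}

# "Indexing": tokenize each document once, up front.
index = {name: set(text.lower().split()) for name, text in docs.items()}

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents with the most query-term overlap."""
    terms = set(query.lower().split())
    ranked = sorted(index, key=lambda name: len(index[name] & terms), reverse=True)
    return ranked[:k]

# The retrieved text then gets stuffed into the model's prompt as context.
print(retrieve("how long do refunds take"))  # ['refunds.md']
```

Swap the keyword overlap for an embedding model and the dict for a vector store and you have the standard RAG pipeline this framework packages up.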
### Claude Agent SDK

What: Anthropic's official SDK for building agents on Claude. Tool use, memory, sub-agents, MCP — first-class.
Best for: Production-grade agents that need careful reasoning and long context.
Trade-off: Tied to Claude. (If that's a feature, great.)
### OpenAI Assistants

What: OpenAI's hosted way to build assistants with tools, file search, and code interpreter built in.
Best for: Quick-to-ship assistants without standing up your own infra.
Trade-off: Locked to OpenAI. Less control over the loop than an SDK approach.
### CrewAI

What: A framework for building teams of agents that collaborate on a goal — each with a role, tools, and personality.
Best for: Complex tasks that decompose naturally into specialist roles.
Trade-off: Multi-agent overhead. Often a single well-prompted agent is enough.
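The role-based decomposition CrewAI formalizes is, at its core, just this: each agent is a role plus a capability, and the crew runs them in sequence, passing context forward. In the sketch below the "agents" are stub functions (real CrewAI agents wrap LLM calls with role prompts and tools):

```python
# A minimal crew: each agent takes the task and the running context,
# and appends its contribution. Real agents would call a model here.

def researcher(task: str, context: str) -> str:
    return context + f"[notes on: {task}]"

def writer(task: str, context: str) -> str:
    return context + "[draft written from notes]"

def run_crew(task: str, agents: list) -> str:
    context = ""
    for agent in agents:
        context = agent(task, context)  # each role builds on the last
    return context

result = run_crew("electric cars", [researcher, writer])
print(result)  # [notes on: electric cars][draft written from notes]
```

The trade-off above is visible even here: every hand-off is a place where context can degrade, which is why one well-prompted agent often beats a crew.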
### AutoGen

What: Microsoft's multi-agent framework. Agents talk to each other in a conversation to solve problems.
Best for: Research-y multi-agent setups, code-generation pipelines, and human-in-the-loop flows.
Trade-off: Same as CrewAI — complexity has a cost.
### Haystack

What: A modular framework for search-heavy LLM apps. RAG, document QA, semantic search.
Best for: Production search and retrieval pipelines you want full control over.
### LangGraph

What: LangChain's graph-based way to build agent workflows. Nodes and edges instead of opaque chains.
Best for: Agents with branching logic, retries, and human checkpoints.
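The node-and-edge model can be sketched in plain Python (this mirrors the idea, not LangGraph's actual `StateGraph` API): each node reads and updates a shared state, and a routing function picks the next node, which is what makes branching and retry loops explicit rather than buried in a chain.

```python
# Sketch of a graph workflow: nodes update shared state, a router picks
# the next edge. The review node rejects the first draft, forcing one
# retry loop back through the draft node.

def draft(state):
    state["text"] = f"draft v{state['attempts']}"
    state["attempts"] += 1
    return state

def review(state):
    state["approved"] = state["attempts"] >= 2  # approve on the 2nd try
    return state

nodes = {"draft": draft, "review": review}

def route(name, state):
    if name == "draft":
        return "review"
    if name == "review":
        return None if state["approved"] else "draft"  # retry edge

def run_graph(start, state):
    name = start
    while name is not None:
        state = nodes[name](state)
        name = route(name, state)
    return state

final = run_graph("draft", {"attempts": 0})
print(final["text"])  # draft v1
```

A human checkpoint is just another node; the router pauses the loop there instead of advancing automatically.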
### PyTorch

The deep learning framework that won. Almost all modern AI research happens here.
### Hugging Face

The GitHub of models. `transformers`, `datasets`, `diffusers`, and 500k+ open models.
### vLLM / Ollama

Run open models locally or on your own infra. vLLM is for production throughput; Ollama is for laptops.
| Framework | Best At | Language | Layer |
|---|---|---|---|
| LangChain | General LLM apps, RAG, prototyping | Python, JS | App |
| LlamaIndex | Retrieval / RAG over private data | Python, JS | App |
| Claude Agent SDK | Production agents on Claude | Python, TS | App |
| OpenAI Assistants | Hosted assistants on OpenAI | Any (HTTP) | App |
| CrewAI | Multi-agent role-based teams | Python | App |
| AutoGen | Multi-agent conversations | Python | App |
| LangGraph | Stateful agent workflows | Python, JS | App |
| Haystack | Search-heavy pipelines | Python | App |
| PyTorch | Training & research | Python | Model |
| Hugging Face | Open models & datasets | Python | Model |
| vLLM / Ollama | Self-hosted inference | Python, CLI | Model |
Here's roughly what building an agent looks like with the Claude Agent SDK. (Pseudocode — but very close.)
```python
from claude_agent_sdk import Agent, tool

@tool
def get_weather(city: str) -> dict:
    """Get the current weather for a city."""
    return weather_api.fetch(city)

@tool
def send_email(to: str, subject: str, body: str):
    """Send an email."""
    mailer.send(to, subject, body)

agent = Agent(
    model="claude-opus-4-6",
    tools=[get_weather, send_email],
    system="You are a helpful travel assistant.",
)

agent.run("Check the weather in Tokyo and email me a packing list.")
```
That's it. The agent figures out the order of operations on its own.
Every weird acronym defined in plain English. Searchable, no scrolling.