An LLM is a brain in a jar. An agent is that brain with hands, eyes, memory, and a to-do list. This is where AI stops being a chatbot and starts being a coworker.
An AI agent is a system where an LLM uses tools in a loop to accomplish a goal — making its own decisions about what to do next.
The LLM. The reasoning engine. It decides what to do, when to stop, and what the answer is.
The tools. Functions the agent can call: search the web, run code, query a database, send an email, read a file.
The loop. The agent keeps thinking and acting until it decides it's done. It doesn't have to know the steps in advance.
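That loop fits in a few lines of Python. Everything below is a sketch: `call_llm` stands in for any chat-completion API that can return either a tool call or a final answer, and `fake_llm` is a scripted stand-in so the loop is runnable as-is.

```python
# Minimal agent loop: call the model, execute any tool it requests,
# feed the result back, and repeat until it answers (or hits a step cap).
def run_agent(call_llm, tools, goal, max_steps=10):
    history = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        reply = call_llm(history)              # the model decides the next move
        if reply.get("tool"):                  # it wants to act
            result = tools[reply["tool"]](**reply["args"])
            history.append({"role": "tool", "content": str(result)})
        else:                                  # it decided it's done
            return reply["answer"]
    return "Gave up: step limit reached."

# Scripted stand-in for a real LLM: first asks for weather, then answers.
def fake_llm(history):
    if not any(m["role"] == "tool" for m in history):
        return {"tool": "get_weather", "args": {"city": "Tokyo"}}
    return {"answer": "14°C with 80% rain. Pack an umbrella."}

tools = {"get_weather": lambda city: {"temp": 14, "rain": 0.8}}
print(run_agent(fake_llm, tools, "Weather in Tokyo?"))
```

Note where control lives: the model never executes anything itself. It only emits a request; your code runs the tool and reports back.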
ReAct (Reason + Act) is the most influential agent pattern. The model alternates between reasoning (think out loud) and acting (call a tool, see the result). Watch it cycle:
Goal: "What's the weather in Tokyo and should I pack an umbrella?"
Thought: I need the current weather in Tokyo.
Action: get_weather("Tokyo")
Observation: { temp: 14°C, rain: 80% }
Thought: 80% chance of rain, so yes: pack an umbrella.

Without tools, an LLM can only generate text. With tools, it can do almost anything. Modern LLMs are trained to output structured tool calls.
```
# A tool definition the model can see
{
  "name": "search_web",
  "description": "Search the web and return top results",
  "parameters": {
    "query": { "type": "string" }
  }
}

# What the model outputs when it decides to use it
{
  "tool": "search_web",
  "args": { "query": "latest mars rover findings 2026" }
}
```
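Once the model emits structured output like that, the runtime's job is just to look the tool up and call it. A minimal dispatch sketch; the registry and the `search_web` stub are invented for illustration, not any vendor's API:

```python
import json

# Hypothetical tool registry: name -> callable.
TOOLS = {
    "search_web": lambda query: [f"result for: {query}"],  # stub implementation
}

def dispatch(model_output: str):
    """Parse the model's tool-call JSON and run the matching function."""
    call = json.loads(model_output)
    fn = TOOLS[call["tool"]]
    return fn(**call["args"])

print(dispatch('{"tool": "search_web", "args": {"query": "latest mars rover findings 2026"}}'))
```

One reason tool calls are JSON objects: `**call["args"]` maps them straight onto keyword arguments.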
Retrieval: web search, vector DB lookups, knowledge base queries.
Code execution: run Python in a sandbox. Calculate, plot, parse files.
Communication: send email, post to Slack, file a ticket, message a teammate.
File operations: read, write, edit. The bread and butter of coding agents.
APIs: call any third-party service (GitHub, Stripe, Linear, Notion).
You can wrap any function as a tool. The agent figures out when to use it.
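One way to "wrap any function": derive the definition from the function's own signature, using only the standard library. The schema shape below mirrors the earlier example but isn't tied to any particular vendor's format.

```python
import inspect

def make_tool(fn):
    """Build a tool definition the model can see from a plain Python function."""
    sig = inspect.signature(fn)
    json_types = {int: "integer", float: "number", str: "string", bool: "boolean"}
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {
            name: {"type": json_types.get(p.annotation, "string")}
            for name, p in sig.parameters.items()
        },
    }

def get_weather(city: str):
    """Return current weather for a city."""
    ...

print(make_tool(get_weather))
# {'name': 'get_weather', 'description': 'Return current weather for a city.',
#  'parameters': {'city': {'type': 'string'}}}
```

The docstring becomes the description the model reads, which is a good reason to write docstrings for the model, not just for humans.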
An LLM by itself has no memory between calls — every conversation starts fresh. Agents fix this with explicit memory systems.
Short-term: the current conversation, kept in the context window. Lost when the conversation ends.
Long-term: facts written to a file or database. The agent can read them back next time. "User prefers TypeScript over Python."
Semantic: a vector database of past notes. The agent searches by meaning, not keywords. Powers RAG and recall.
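The "facts written to a file" tier can be genuinely this simple. A sketch, with a made-up file name and fact format:

```python
import json, os

MEMORY_FILE = "agent_memory.json"  # hypothetical location

def remember(fact: str):
    """Append a fact so a future session can read it back."""
    facts = recall()
    facts.append(fact)
    with open(MEMORY_FILE, "w") as f:
        json.dump(facts, f)

def recall() -> list[str]:
    """Load everything past sessions wrote down."""
    if not os.path.exists(MEMORY_FILE):
        return []
    with open(MEMORY_FILE) as f:
        return json.load(f)

remember("User prefers TypeScript over Python.")
print(recall())  # survives across runs, unlike the context window
```

Give the agent `remember` and `recall` as tools and it decides for itself what's worth writing down.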
Why have one agent when you can have a team? Multiple specialized agents that talk to each other to solve bigger problems.
Reach for multiple agents when a task is too big or too varied for one agent's context window, or when different parts of the task need very different prompts, tools, or models.
Frameworks like CrewAI and AutoGen exist mostly to make this easier.
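Strip away the frameworks and the core move is routing: an orchestrator picks which specialist handles each subtask. A toy sketch, with plain functions standing in for real agents (in practice each would have its own prompt, tools, and model, and the routing itself would be an LLM call rather than a keyword check):

```python
# Each "agent" here is just a function; real ones would wrap an LLM.
def research_agent(task):
    return f"[sources found for: {task}]"

def writer_agent(task):
    return f"[draft written for: {task}]"

SPECIALISTS = {"research": research_agent, "write": writer_agent}

def orchestrator(task):
    """Toy router: keys on a keyword. A real one would ask an LLM to route."""
    kind = "research" if "find" in task.lower() else "write"
    return SPECIALISTS[kind](task)

print(orchestrator("Find recent papers on agent memory"))
print(orchestrator("Write a summary of the results"))
```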
Coding agents: Claude Code, Cursor, Devin, Aider. They read your repo, write features, run tests, fix the failures, open a PR.
Research agents: take a question, browse dozens of sources, synthesize an answer with citations. Perplexity, Claude's research mode, OpenAI Deep Research.
Computer-use agents: agents that actually click, type, and scroll on a real screen. Anthropic's Computer Use, OpenAI's Operator. Spookier and more general.
Customer support agents: read the ticket, search the knowledge base, look up the customer's order, and either answer or escalate.
Runaway loops: without limits, agents can spin on a task they can't solve. Always set max iterations.
Tool overload: too many tools and the model picks the wrong one. Curate ruthlessly.
Cost: every step is a new LLM call. A complex agent task can rack up serious bills.
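The runaway-loop and cost problems yield to the same guard: count steps and estimated spend, and bail when either budget is gone. A sketch with made-up limits and a made-up per-call price:

```python
class BudgetExceeded(Exception):
    pass

class Budget:
    """Hard caps on iterations and estimated spend for one agent run."""
    def __init__(self, max_steps=15, max_cost_usd=1.00):
        self.max_steps, self.max_cost = max_steps, max_cost_usd
        self.steps, self.cost = 0, 0.0

    def charge(self, est_cost_usd: float):
        """Record one LLM call; raise if either cap is blown."""
        self.steps += 1
        self.cost += est_cost_usd
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit ({self.max_steps}) hit")
        if self.cost > self.max_cost:
            raise BudgetExceeded(f"cost limit (${self.max_cost:.2f}) hit")

budget = Budget(max_steps=3, max_cost_usd=1.00)
try:
    while True:                 # a stuck agent that never decides it's done
        budget.charge(0.02)     # pretend each LLM call costs ~2 cents
except BudgetExceeded as e:
    print("Stopped:", e)        # Stopped: step limit (3) hit
```

Raising an exception, rather than returning a flag, means the cap holds no matter how tangled the loop gets.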
Now you know what an agent is. Time to look at the libraries that help you build them — and the trade-offs between them.