AI Agents

An LLM is a brain in a jar. An agent is that brain with hands, eyes, memory, and a to-do list. This is where AI stops being a chatbot and starts being a coworker.

What is an Agent?

The 1-line definition

An AI agent is a system where an LLM uses tools in a loop to accomplish a goal — making its own decisions about what to do next.

Brain

An LLM. The reasoning engine. Decides what to do, when to stop, what the answer is.

Tools

Functions the agent can call: search the web, run code, query a database, send an email, read a file.

Loop

The agent keeps thinking and acting until it decides it's done. It doesn't have to know the steps in advance.

The ReAct Loop

One of the most influential agent patterns. ReAct is short for "Reasoning + Acting": the model alternates between reasoning (thinking out loud) and acting (calling a tool and reading the result). The cycle repeats until the goal is met:

Think → Act → Observe

A walk-through

Goal: "What's the weather in Tokyo and should I pack an umbrella?"

  • Think: "I need current weather. I'll use the weather tool."
  • Act: get_weather("Tokyo")
  • Observe: { temp: 14°C, rain: 80% }
  • Think: "Heavy rain expected. I should recommend the umbrella."
  • Act: Reply to user — "Yes, 80% chance of rain. Bring the umbrella."
  • Stop.
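
The walk-through above can be sketched in a few lines of Python. This is a minimal illustration, not a real agent: `scripted_model` is a hypothetical stand-in for the LLM call, `get_weather` returns canned data, and the dictionary keys (`thought`, `action`, `input`) are assumptions made for this example.

```python
from typing import Callable

def get_weather(city: str) -> dict:
    """Hypothetical weather tool; returns canned data instead of calling an API."""
    return {"temp_c": 14, "rain_pct": 80}

TOOLS: dict[str, Callable[[str], dict]] = {"get_weather": get_weather}

def scripted_model(history: list[dict]) -> dict:
    """Stand-in for an LLM call: picks the next step from what has been observed."""
    observations = [h for h in history if h["role"] == "observation"]
    if not observations:
        return {"thought": "I need current weather. I'll use the weather tool.",
                "action": "get_weather", "input": "Tokyo"}
    rain = observations[-1]["content"]["rain_pct"]
    verdict = "Yes" if rain >= 50 else "No"
    advice = "Bring the umbrella." if rain >= 50 else "No umbrella needed."
    return {"thought": "I have the forecast, so I can answer.",
            "action": "reply", "input": f"{verdict}, {rain}% chance of rain. {advice}"}

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = [{"role": "goal", "content": goal}]
    for _ in range(max_steps):
        step = scripted_model(history)                               # Think
        history.append({"role": "thought", "content": step["thought"]})
        if step["action"] == "reply":                                # model decided it is done
            return step["input"]
        result = TOOLS[step["action"]](step["input"])                # Act
        history.append({"role": "observation", "content": result})   # Observe
    raise RuntimeError("step budget exhausted")

answer = run_agent("What's the weather in Tokyo and should I pack an umbrella?")
print(answer)
```

Note that the loop itself knows nothing about weather. Swap in a real model call and different tools and the same skeleton handles a different goal.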

Tool Use — The Superpower

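Without tools, an LLM can only generate text. With tools, it can act on the world outside the chat: look things up, run code, change files, send messages. Modern LLMs are trained to output structured tool calls, which the surrounding program parses and executes before handing the result back to the model.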

# A tool definition the model can see
{
  "name": "search_web",
  "description": "Search the web and return top results",
  "parameters": {
    "query": { "type": "string" }
  }
}

# What the model outputs when it decides to use it
{
  "tool": "search_web",
  "args": { "query": "latest mars rover findings 2026" }
}
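
The model only emits the call; your code has to run it. Here is a rough sketch of that dispatch step, assuming the simplified `tool`/`args` shape shown above (real provider APIs use their own field names). The `search_web` body and the `REGISTRY` name are placeholders invented for this example.

```python
import json

def search_web(query: str) -> list[str]:
    """Hypothetical stand-in: a real tool would call a search API here."""
    return [f"result for: {query}"]

# Registry mapping tool names (as the model sees them) to Python callables.
REGISTRY = {"search_web": search_web}

def dispatch(raw_call: str) -> str:
    """Parse a model-emitted tool call, run it, and return JSON to feed back to the model."""
    try:
        call = json.loads(raw_call)
        func = REGISTRY[call["tool"]]
        result = func(**call["args"])
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        # Report the failure to the model instead of crashing the loop.
        return json.dumps({"error": f"{type(exc).__name__}: {exc}"})
    return json.dumps({"result": result})

model_output = '{"tool": "search_web", "args": {"query": "latest mars rover findings 2026"}}'
print(dispatch(model_output))
```

Returning errors as data is a deliberate choice here: the model sees what went wrong on its next turn and can retry with a corrected call.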

Search

Web search, vector DB lookups, knowledge base queries.

Code

Run Python in a sandbox. Calculate, plot, parse files.

Communicate

Send email, post to Slack, file a ticket, message a teammate.

Files

Read, write, edit. The bread and butter of coding agents.

APIs

Call any third-party service: GitHub, Stripe, Linear, Notion.

Custom

You can wrap any function as a tool. The agent figures out when to use it.
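
One way to wrap a function as a tool is a decorator that builds the definition from the signature and docstring. This is a sketch under assumptions: the `tool` decorator, the `TOOLS` registry, and `convert_currency` with its made-up rates are all hypothetical, and the output follows the simplified definition format used earlier on this page rather than any specific provider's schema.

```python
import inspect
from typing import Callable

# Map Python annotations to the JSON-style type names used in the tool definition.
TYPE_NAMES = {str: "string", int: "integer", float: "number", bool: "boolean"}

TOOLS: dict[str, dict] = {}

def tool(func: Callable) -> Callable:
    """Register func as a tool, deriving its definition from signature and docstring."""
    params = {
        name: {"type": TYPE_NAMES.get(p.annotation, "string")}
        for name, p in inspect.signature(func).parameters.items()
    }
    TOOLS[func.__name__] = {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": params,
        "func": func,
    }
    return func

@tool
def convert_currency(amount: float, to: str) -> float:
    """Convert a USD amount to another currency using a fixed, made-up rate."""
    rates = {"EUR": 0.9, "JPY": 150.0}  # illustrative values only
    return amount * rates[to]

print(TOOLS["convert_currency"]["parameters"])
```

The docstring doubles as the description the model reads, so it is worth writing it for that audience: say what the tool does and when to reach for it.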

Memory

An LLM by itself has no memory between calls — every conversation starts fresh. Agents fix this with explicit memory systems.

Short-term

The current conversation, kept in the context window. Lost when the conversation ends.

Long-term

Facts written to a file or database. The agent can read them back next time. "User prefers TypeScript over Python."
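
A bare-bones version of long-term memory might look like this. The `FileMemory` class, the file name, and the prompt wording are assumptions for illustration; a production agent would more likely use a database and decide more carefully what is worth remembering.

```python
import json
import tempfile
from pathlib import Path

class FileMemory:
    """Long-term memory as a JSON file of facts that survives across conversations."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def recall(self) -> list[str]:
        if not self.path.exists():
            return []
        return json.loads(self.path.read_text(encoding="utf-8"))

    def remember(self, fact: str) -> None:
        facts = self.recall()
        if fact not in facts:  # avoid storing duplicates
            facts.append(fact)
            self.path.write_text(json.dumps(facts, indent=2), encoding="utf-8")

# Session 1: the agent learns something and writes it down.
memory_path = Path(tempfile.mkdtemp()) / "agent_memory.json"
FileMemory(memory_path).remember("User prefers TypeScript over Python.")

# Session 2: a fresh object (standing in for a new conversation) reads it back.
facts = FileMemory(memory_path).recall()
system_prompt = "Known facts about the user:\n" + "\n".join(f"- {f}" for f in facts)
print(system_prompt)
```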

Semantic

A vector database of past notes. The agent searches by meaning, not keywords. This powers retrieval-augmented generation (RAG) and recall of past interactions.
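
To make "search by meaning" concrete, here is a toy sketch. The `embed` function is a hand-written stand-in for a learned embedding model (it maps a few words onto three concept axes), and the list `index` stands in for a vector database; both are assumptions for illustration only. The ranking step, cosine similarity between the query vector and each stored vector, is the part that carries over to real systems.

```python
import math

# Toy stand-in for a learned embedding model: each known word maps to a concept axis.
# A real system would call an embedding model and store vectors in a vector database.
CONCEPTS = {
    "car": 0, "automobile": 0, "drive": 0, "vehicle": 0,
    "food": 1, "pizza": 1, "eat": 1, "dinner": 1,
    "code": 2, "typescript": 2, "python": 2, "programming": 2,
}

def embed(text: str) -> list[float]:
    vec = [0.0, 0.0, 0.0]
    for word in text.lower().split():
        axis = CONCEPTS.get(word.strip(".,?!"))
        if axis is not None:
            vec[axis] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

notes = [
    "User owns a red automobile.",
    "User likes pizza for dinner.",
    "User prefers TypeScript over Python.",
]
index = [(note, embed(note)) for note in notes]  # the "vector database"

def search(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [note for note, _ in ranked[:k]]

print(search("Which car does the user drive?"))
```

The query says "car" and the stored note says "automobile", yet the note still ranks first because both words land on the same axis. A keyword match would have missed it.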

Multi-Agent Systems

Why have one agent when you can have a team? A multi-agent system is a set of specialized agents that talk to each other to solve bigger problems.

The pattern

  • A planner agent breaks the goal into subtasks.
  • Specialist agents (researcher, coder, writer, reviewer) each handle one subtask.
  • An orchestrator routes work and merges results.
  • A critic checks the output before it ships.
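
The four roles above can be sketched with plain functions. Everything here is a simplification made for this example: each "agent" would normally wrap its own LLM call, the planner's output is hard-coded, and the critic's check is deliberately trivial.

```python
from typing import Callable

# Each "agent" is a plain function here; in a real system each would wrap its own
# LLM call with its own prompt, tools, and possibly its own model.
def planner(goal: str) -> list[tuple[str, str]]:
    """Break the goal into (specialist, subtask) pairs. Hard-coded for illustration."""
    return [("researcher", f"collect facts about {goal}"),
            ("writer", f"draft a summary of {goal}")]

def researcher(task: str) -> str:
    return f"[notes] {task}"

def writer(task: str) -> str:
    return f"[draft] {task}"

SPECIALISTS: dict[str, Callable[[str], str]] = {"researcher": researcher, "writer": writer}

def critic(output: str) -> bool:
    """Approve only if every specialist contributed something."""
    return "[notes]" in output and "[draft]" in output

def orchestrator(goal: str) -> str:
    results = [SPECIALISTS[name](task) for name, task in planner(goal)]  # route work
    merged = "\n".join(results)                                          # merge results
    if not critic(merged):                                               # check before shipping
        raise ValueError("critic rejected the output")
    return merged

report = orchestrator("agent frameworks")
print(report)
```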

When it helps

When a task is too big or too varied for one agent's context window, or when different parts of the task need very different prompts, tools, or models.

Frameworks like CrewAI and AutoGen exist mostly to make this easier.

Real-World Agent Examples

Coding Agents

Claude Code, Cursor, Devin, Aider. Read your repo, write features, run tests, fix the failures, open a PR.

Research Agents

Take a question, browse dozens of sources, synthesize an answer with citations. Perplexity, Claude's research mode, OpenAI Deep Research.

Computer-Use Agents

Agents that actually click, type, and scroll on a real screen. Anthropic's Computer Use, OpenAI's Operator. Spookier than agents that only call APIs, and more general: anything a person can do through a GUI is in reach.

Customer Support

Agents that read the ticket, search the knowledge base, look up the customer's order, and either answer or escalate.

Agent Pitfalls

Loops Forever

Without limits, agents can spin on a task they can't solve. Always set a maximum number of iterations.

Tool Confusion

Too many tools and the model picks the wrong one. Curate ruthlessly.

Cost Explosion

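Every step is a new LLM call, and each call typically resends the conversation so far, so the bill grows faster than the step count. A complex agent task can rack up serious costs.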
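
A quick back-of-the-envelope calculation shows why. The sketch below assumes each call resends the full history with no prompt caching, and the token figures (a 2,000-token starting prompt growing by 1,500 tokens per step) are invented for illustration, not measurements.

```python
def total_input_tokens(steps: int, base_tokens: int, tokens_per_step: int) -> int:
    """Sum the prompt size of every call when each call resends the full history."""
    return sum(base_tokens + i * tokens_per_step for i in range(steps))

# Assumed numbers: a 2,000-token prompt that grows by 1,500 tokens per step.
for steps in (5, 10, 20, 40):
    print(steps, total_input_tokens(steps, 2_000, 1_500))
# 5 25000
# 10 87500
# 20 325000
# 40 1250000
```

Under these assumptions, going from 5 steps to 40 is 8 times the steps but 50 times the input tokens, which is one more reason to cap iterations.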

Up Next

Frameworks

Now you know what an agent is. Time to look at the libraries that help you build them — and the trade-offs between them.