The Glossary

Every weird acronym, plainly defined. Type to filter. No PhD required.

Agent

An LLM hooked up to tools and a loop, capable of taking multi-step actions to achieve a goal.

AGI

Artificial General Intelligence — a hypothetical AI that matches or exceeds human capability across all cognitive tasks. Doesn't exist yet (depending on who you ask).

Attention

The mechanism by which a transformer decides which other tokens to focus on when processing each token. The "Attention Is All You Need" paper kicked off the modern LLM era.
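
A toy version of scaled dot-product attention for a single query, in plain Python (real models do this with batched matrix math on GPUs, across many heads at once):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """One query attends over all positions; output is a weighted mix of values."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)  # how much to focus on each position
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]

# the query matches the first key, so the output leans toward the first value
out = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[10.0, 0.0], [0.0, 10.0]])
```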

Backpropagation

The algorithm used to train neural networks by computing how much each weight contributed to the error and nudging it the right direction.
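
For a single weight, the whole idea fits in a few lines. This toy example fits y = w·x by gradient descent; real backprop applies the same chain rule through millions of weights at once:

```python
x, y = 2.0, 10.0   # one training example
w = 1.0            # initial weight
lr = 0.1           # learning rate (a hyperparameter)

for _ in range(50):
    pred = w * x
    grad = 2 * (pred - y) * x   # chain rule: d(loss)/dw for loss = (pred - y)^2
    w -= lr * grad              # nudge the weight downhill

# w has converged toward y / x = 5.0
```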

Benchmark

A standardized test for AI models. MMLU, HumanEval, GSM8K, GPQA — these are how researchers brag.

Context Window

The max number of tokens an LLM can consider at once — your prompt + history + response. Bigger = the model can read more.

CoT (Chain of Thought)

A prompting technique where you ask the model to reason step by step before answering. Often dramatically improves accuracy.

Diffusion Model

The architecture behind most modern image generators (Stable Diffusion, DALL-E, Midjourney). Learns by gradually denoising random noise into images.

Distillation

Training a small model to imitate a big one. The small model gets most of the capability for a fraction of the cost.

Embedding

A vector (list of numbers) that represents the meaning of a piece of text or data. Similar things have similar embeddings.
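
"Similar" is usually measured with cosine similarity. A sketch with made-up 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# toy embeddings, invented for illustration
cat    = [0.9, 0.1, 0.3]
kitten = [0.8, 0.2, 0.4]
car    = [0.1, 0.9, 0.2]

# "cat" scores closer to "kitten" than to "car"
```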

Eval

A test you run on a model or agent to measure its quality. The best AI teams obsess over evals.

Few-shot

Showing the model a handful of examples in the prompt before asking it to do the task. Often beats zero-shot dramatically.
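
A few-shot prompt is just examples pasted above the real question. A sketch — the task and labels here are invented:

```python
def few_shot_prompt(instruction, examples, query):
    shots = "\n\n".join(f"Input: {x}\nLabel: {y}" for x, y in examples)
    return f"{instruction}\n\n{shots}\n\nInput: {query}\nLabel:"

prompt = few_shot_prompt(
    "Classify the sentiment of each review.",
    [("Loved every minute.", "positive"),
     ("Total waste of time.", "negative")],
    "Surprisingly good.",
)
# the model completes the final "Label:" line, following the pattern
```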

Fine-tuning

Continuing to train a pretrained model on a smaller, more specific dataset to specialize it.

GPU

Graphics Processing Unit. Parallel-math monsters originally for video games, now the engine of all modern AI training.

Grounding

Connecting an LLM's output to an authoritative source (a document, a database) to keep it from hallucinating.

Hallucination

When an LLM generates something confidently wrong — a fake citation, a non-existent function, a made-up fact.

Hyperparameter

A setting you choose before training (learning rate, batch size, model size) — not learned by the model itself.

Inference

The act of using a trained model to make a prediction. The cheap, fast part of an AI model's life.

LLM

Large Language Model — a neural network trained on huge amounts of text to predict the next token.

LoRA

Low-Rank Adaptation — a cheap way to fine-tune big models by only training a small set of extra weights.

MCP

Model Context Protocol — an open standard from Anthropic for connecting AI agents to tools and data sources, the same way USB connects devices.

MoE

Mixture of Experts — an architecture where only a subset of the model's weights activates for each token, making giant models efficient. Mixtral and DeepSeek's models use it openly; GPT-4 is rumored to.

Multimodal

A model that handles more than one kind of input — text + images + audio + video.

Neural Network

A stack of simple math units ("neurons") connected by weighted edges. The basic building block of all modern AI.
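
One of those units is just a weighted sum pushed through a nonlinearity. A minimal sketch of a single sigmoid neuron:

```python
import math

def neuron(inputs, weights, bias):
    z = sum(i * w for i, w in zip(inputs, weights)) + bias  # weighted sum
    return 1 / (1 + math.exp(-z))                           # sigmoid squashes to (0, 1)

# two inputs, hand-picked weights
out = neuron([1.0, 0.0], [2.0, -2.0], bias=0.0)
```

Stack thousands of these in layers and let training pick the weights — that's the whole trick.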

Overfitting

When a model memorizes its training data instead of learning generalizable patterns. The classic ML villain.

Parameters

The weights inside a neural network. GPT-4 is rumored to have over a trillion of them.

Prompt Engineering

The dark art of writing prompts that get LLMs to do what you actually want.

Prompt Injection

An attack where someone slips instructions into the data an agent reads, hijacking its behavior.

Quantization

Compressing a model by storing its weights with fewer bits. Makes models smaller and faster with a small accuracy cost.
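
A sketch of the simplest scheme — symmetric int8 quantization: map each weight to a signed byte plus one shared scale factor:

```python
def quantize(weights, bits=8):
    qmax = 2 ** (bits - 1) - 1                    # 127 for int8
    scale = max(abs(w) for w in weights) / qmax   # one scale for the whole list
    return [round(w / scale) for w in weights], scale

def dequantize(quants, scale):
    return [q * scale for q in quants]

w = [0.12, -0.5, 0.33, 0.07]
q, scale = quantize(w)
restored = dequantize(q, scale)
# each value now fits in one byte; `restored` is close to `w`, but not exact —
# that small rounding error is the accuracy cost
```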

RAG

Retrieval-Augmented Generation — fetch relevant documents first, then feed them to the LLM. The standard way to give a model access to fresh or private data.
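
The core loop fits in a few lines. A sketch using toy, hand-made embeddings — a real system would embed the question with a model and search a vector database:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rag_prompt(question, question_vec, corpus, top_k=1):
    """corpus: list of (text, embedding) pairs. Retrieve, then augment the prompt."""
    ranked = sorted(corpus, key=lambda doc: dot(question_vec, doc[1]), reverse=True)
    context = "\n".join(text for text, _ in ranked[:top_k])
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

corpus = [
    ("Refunds are processed within 5 days.", [0.9, 0.1]),
    ("Our office is in Berlin.",             [0.1, 0.9]),
]
prompt = rag_prompt("How long do refunds take?", [0.8, 0.2], corpus)
# only the refund document makes it into the prompt
```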

ReAct

Reasoning + Acting — the agent loop where the model alternates between thinking out loud and using tools.
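
A minimal sketch of that loop. Here `llm` is a stand-in for a real model call, and the plain-text `CALL` / `FINAL:` protocol is invented for illustration:

```python
def react_loop(task, llm, tools, max_steps=5):
    transcript = f"Task: {task}\n"
    for _ in range(max_steps):
        step = llm(transcript)              # model reasons, then picks an action
        transcript += step + "\n"
        if step.startswith("FINAL:"):
            return step.removeprefix("FINAL:").strip()
        if step.startswith("CALL "):
            _, name, arg = step.split(" ", 2)
            transcript += f"OBSERVATION: {tools[name](arg)}\n"  # tool result fed back
    return None
```

Each tool result is appended to the transcript, so the model "sees" the observation before its next step.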

RLHF

Reinforcement Learning from Human Feedback. The technique used to fine-tune raw LLMs into helpful chatbots.

SFT

Supervised Fine-Tuning — fine-tuning on labeled examples of "input → desired output".

System Prompt

The hidden instructions an LLM sees before the user's message. Sets the model's role, rules, and tone.

Temperature

A knob that controls how random the LLM's output is. 0 = deterministic. 1+ = creative chaos.
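
Under the hood, temperature divides the model's raw scores (logits) before they're turned into probabilities. A sketch:

```python
import math

def probs(logits, temperature):
    if temperature == 0:
        out = [0.0] * len(logits)            # greedy: always pick the top score
        out[logits.index(max(logits))] = 1.0
        return out
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
# low temperature sharpens the distribution; high temperature flattens it
```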

Token

A chunk of text the model actually sees. Roughly ¾ of an English word. Pricing on most LLM APIs is per token.
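
A common back-of-the-envelope rule — roughly 4 characters per English token — is handy for cost math. A sketch (the price here is made up; check your provider's):

```python
def estimate_cost(text, price_per_million_tokens):
    tokens = max(1, round(len(text) / 4))  # rough ~4-chars-per-token heuristic
    return tokens, tokens * price_per_million_tokens / 1_000_000

tokens, cost = estimate_cost("hello " * 1000, price_per_million_tokens=3.00)
```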

Tool Use

An LLM's ability to call functions (search, calculator, APIs) instead of just generating text.

Transformer

The neural network architecture invented in 2017 that powers every modern LLM.

Vector Database

A database optimized for storing and searching embeddings by similarity. Pinecone, Weaviate, Chroma, Qdrant.

Weights

The numbers inside a neural network that get adjusted during training. "Open weights" means you can download them.

Zero-shot

Asking the model to do a task with no examples — just an instruction. The opposite of few-shot.