Foundation Models

Foundation models are the pre-trained base on which most modern AI applications are built. Rather than training a new model for each task, organizations start with a foundation model (such as GPT-4, Claude, or Llama 3) and adapt it through prompting, retrieval, or fine-tuning. The term was coined in 2021 by Stanford researchers, who argued that this shift in how AI systems are built was as significant as the shift to deep learning itself.

Category: Models

Reading time: 6 min read

Last updated: Feb 28, 2025

Published: Feb 28, 2025

Definition

Large AI models trained on broad, general datasets at massive scale, designed to be adapted for a wide range of downstream tasks.

What makes a model a "foundation"

Sources: Stanford CRFM, 2021; Anthropic

Foundation models are characterized by their scale (billions of parameters), the breadth of their training data (diverse internet-scale text, code, and increasingly images and audio), and their emergent capabilities: behaviors such as in-context learning and multi-step reasoning that appear at scale without being explicitly trained for.

• Scale: billions of parameters trained on trillions of tokens.
• Generality: capable of many tasks with minimal task-specific data.
• Emergence: capabilities arise at scale that were not present in smaller models.

Adaptation spectrum

Stanford CRFM, 2021

Organizations adapt foundation models along a spectrum of increasing cost and control. At one end, prompt engineering requires no changes to the model. One step further, retrieval-augmented generation (RAG) grounds the model's answers in proprietary data supplied at query time, still without changing the weights. Fine-tuning modifies the model weights to produce specific behaviors. At the other end, pre-training from scratch (rare and very expensive) gives full control.
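To make the RAG step of the spectrum concrete, here is a minimal sketch of how retrieved passages can be placed into a prompt before it is sent to a model. It is an illustration under stated assumptions, not a production design: the word-overlap scoring stands in for the embedding search a real system would typically use, and the function names (`retrieve`, `build_rag_prompt`) and the sample documents are hypothetical. No model is called; the sketch only assembles the prompt text.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query: str, passage: str) -> int:
    """Count the word tokens shared by the query and the passage.

    Assumption: a toy relevance score. Real RAG systems usually rank
    passages by embedding similarity instead.
    """
    shared = Counter(tokenize(query)) & Counter(tokenize(passage))
    return sum(shared.values())


def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages with the highest overlap score."""
    ranked = sorted(passages, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]


def build_rag_prompt(query: str, passages: list[str], k: int = 2) -> str:
    """Assemble a prompt that grounds the model in the retrieved passages."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
        "Answer:"
    )


# Hypothetical proprietary documents, invented for this example.
DOCS = [
    "Refunds are issued within 14 days of a return request.",
    "Our office is closed on public holidays.",
    "Return requests must include the original order number.",
]

QUESTION = "How long do refunds take after a return request?"
prompt = build_rag_prompt(QUESTION, DOCS, k=2)
print(prompt)
```

The model weights are untouched in this approach. Only the input changes, which is why RAG sits between prompt engineering and fine-tuning on the spectrum.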

Open vs. closed models

Sources: Stanford CRFM, 2021; Hugging Face

Closed models (GPT-4, Claude) are available only through a vendor's API or hosted product. Open-weight models (Llama 3, Mistral, Gemma) release their weights publicly, which enables on-premise deployment and deeper customization such as fine-tuning on your own infrastructure. The choice involves trade-offs around capability, cost, privacy, and control.
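One practical input to the on-premise side of that trade-off is how much memory the weights alone occupy. The sketch below is a rough estimate under simple assumptions: it multiplies the parameter count by the bytes used per parameter at a given numeric precision, and it ignores activations, caches, and runtime overhead, so real requirements are higher. The 7-billion-parameter figure is a hypothetical example size, not a statement about any specific model.

```python
# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Estimate the memory for model weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


# Hypothetical 7-billion-parameter open-weight model.
N_PARAMS = 7e9

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gb(N_PARAMS, dtype):.1f} GB")
```

Under these assumptions, the same model needs 28 GB for its weights at 32-bit precision and 14 GB at 16-bit precision, which is why precision is often part of the hardware sizing discussion for open-weight deployments.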

FAQ

Is GPT-4 a foundation model?

Yes. GPT-4 is a foundation model. When you access it via ChatGPT or the OpenAI API, you are using a system built on top of the GPT-4 foundation model, likely with additional fine-tuning and safety layers applied by OpenAI.

Stanford CRFM, 2021

What is the difference between a foundation model and an LLM?

LLMs are foundation models trained primarily on text. Not all foundation models are LLMs: some are trained primarily on images (vision foundation models) or audio. "Foundation model" is the broader category.

Stanford CRFM, 2021
