A prototype that proves the picture is not a prototype. The job of a prototype is to test the system's behavior under real conditions — what it notices, how it responds, when it backs off, where it fails — before visual decisions harden around an untested product assumption. Prototyping behavior is the agency move that converts conviction into evidence.
What a prototype is for now
The prototype used to be a way to align a meeting. A high-fidelity static mock, a clickable Figma flow, a marketing site for a feature that did not exist yet — produced in advance, polished to the point that nobody in the room could disagree, and shipped into a slide deck. Its job was to settle the question of what the product would look like before the team committed to building it.
That kind of prototype made sense because committing a team to the wrong thing was costly. Engineering time, data plumbing, permissions, integrations, reliability work, QA, and rollout risk all compound. A screen prototype gave the team a less costly place to argue about the visible path before those commitments hardened. That was a reasonable trade.
That trade still matters when the open question is visual structure, flow, or comprehension. But it is the wrong instrument when the product risk lives in behavior — what the system actually does with the input the user gives it, what it does silently, what it gets wrong, and what happens when it does. The question is not whether screens matter. The question is whether the prototype can test the part of the product carrying the risk. A static mock cannot fail, cannot hesitate, cannot mis-infer, cannot recover. It can only be approved.
The prototype that earns its place now is the one that runs. It accepts real input. It produces real output. Its output is sometimes wrong, and the team learns something from that wrongness that no static review ever produced. The prototype's job stops being to prove the picture and starts being to test the behavior — including the behaviors the team would rather not think about yet.
The seven things a behavior prototype tests
The useful question is not how polished the prototype looks. It is which behavior the prototype makes answerable.
Assumption. What does the prototype assume about the user, the data, or the world that has not yet been verified? Most prototypes embed a quiet assumption — that the user knows what they want, that the input arrives clean, that the model returns what the demo showed. Naming the assumption out loud is half the prototype's value; testing it is the other half.
Trigger. What sets the behavior in motion? A click is the most boring possible trigger; the more interesting prototypes test triggers the user did not initiate — the system noticing, the schedule firing, the threshold being crossed. A prototype that only runs on click is not testing the kind of behavior modern products actually have.
System response. What does the system do when the trigger fires? Not what the screen shows — what the system does. An action taken on the user's behalf. An inference made. A confirmation deferred. A request held back. The response is the answerable part of the prototype; it is also the part most prototypes leave to imagination.
Visibility. What does the user see while the system is working, and after? The prototype that shows only the happy state is hiding the system from the user the way the static mock did — just with motion. The visibility surface tests how much of the system's reasoning the product is going to expose, and where the seams of that exposure live.
User control. What can the user do at the moment the system is acting? Stop it. Override it. Adjust it. Ask why. The prototype that does not let the user intervene is testing a system the user has to live with, not steer. Most products fail here long before they fail at the model.
Failure case. What does the prototype do when the system gets it wrong? A real prototype has at least one path through it where the inference is bad, the input is malformed, the action collides with state the user did not expect. The team that builds for the happy path is shipping a product that will only meet the happy user.
Recovery path. When the failure happens, how does the user get back? Undo. Restart. A different route. A human in the loop. Recovery is the part of the product the marketing screenshot will never show; it is also the part that decides whether the user trusts the system enough to keep using it.
For a photos app deciding when to surface a memory, the behavior prototype might be small. It runs against a real test library, picks candidate clusters from the previous year, and proposes one a day at a hand-picked moment, but never inside a session that started with a deletion and never on the anniversary of a date the user has flagged as hard. It shows which photos were considered, which were excluded, and why. The user can dismiss a memory, hide an album from the source set, or mark a face as “never include.” The prototype also includes the failure cases: a cluster centered on someone the user has stopped following everywhere else, a sequence whose timestamps are wrong, photos from a weekend the user would rather forget.
That prototype can be ugly. It cannot be imaginary.
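An ugly but real version of the photos rule can be very small. The sketch below is one possible shape, not a reference implementation: the names `Cluster`, `UserControls`, and `propose_memory` are hypothetical, the sample library is invented, and the "hand-picked moment" is reduced to whichever eligible cluster comes first. What it does preserve is the part that matters, which is that every hold-back and exclusion is recorded with its reason.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Cluster:
    """A candidate group of photos. All field names are illustrative."""
    name: str
    taken_on: date
    faces: set[str] = field(default_factory=set)
    album: str = ""


@dataclass
class UserControls:
    """What the user has told the system to leave alone."""
    hard_dates: set[tuple[int, int]] = field(default_factory=set)  # (month, day)
    hidden_albums: set[str] = field(default_factory=set)
    never_include_faces: set[str] = field(default_factory=set)


def propose_memory(clusters, controls, today, session_started_with_deletion):
    """Return (chosen cluster or None, log of every decision with its reason)."""
    log = []
    # Hold back entirely in the two situations the rule forbids.
    if session_started_with_deletion:
        log.append(("*", "held back: session started with a deletion"))
        return None, log
    if (today.month, today.day) in controls.hard_dates:
        log.append(("*", "held back: today is flagged as a hard date"))
        return None, log

    chosen = None
    for c in clusters:
        if c.taken_on.year != today.year - 1:
            log.append((c.name, "excluded: not from the previous year"))
        elif c.album in controls.hidden_albums:
            log.append((c.name, "excluded: album hidden by user"))
        elif c.faces & controls.never_include_faces:
            log.append((c.name, "excluded: contains a never-include face"))
        elif chosen is None:
            chosen = c  # assumption: first eligible cluster stands in for curation
            log.append((c.name, "proposed"))
        else:
            log.append((c.name, "considered: eligible, not chosen today"))
    return chosen, log


# Invented sample data, standing in for a real test library.
controls = UserControls(hidden_albums={"Moving out"}, never_include_faces={"sam"})
clusters = [
    Cluster("Beach trip", date(2023, 7, 4), {"sam", "lee"}, "Summer"),
    Cluster("Boxes", date(2023, 9, 1), set(), "Moving out"),
    Cluster("Hike", date(2023, 10, 12), {"lee"}, "Autumn"),
    Cluster("Old party", date(2019, 1, 1), set(), "Archive"),
]
chosen, log = propose_memory(clusters, controls, date(2024, 10, 12), False)
for name, reason in log:
    print(f"{name}: {reason}")
```

The log is the visibility surface in miniature. The failure cases named above (a wrong timestamp, a person the user has stopped following elsewhere) are deliberately not handled here, and running a real library through the sketch is how the team would find out what they look like.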
A prototype that proves the picture is not a prototype. It is a screenshot in motion.
The form can be modest. A behavior prototype might be a script that runs against 50 real inputs and prints the failures. It might be a thin vertical slice with a rough interface and a real data path. It might be a state machine that exposes every transition before the team designs the screen around it. It might be a Wizard-of-Oz flow where a human performs the system's proposed action while the team studies timing, trust, and correction. It might be an instrumented prompt, a mocked API with real outputs, or a rough UI connected to the smallest working loop.
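The first of those forms, a script that runs against real inputs and prints the failures, might look like the sketch below. Everything specific in it is an assumption made for illustration: `run_against_inputs` is a hypothetical helper, `extract_amount` is a stand-in for whatever behavior the team is actually testing, and the five receipt lines are invented where a real run would use the team's own 50.

```python
def run_against_inputs(behavior, cases):
    """Run `behavior` on every case and collect the ones that fail.

    `cases` is a list of (input, check) pairs, where `check` is a function
    that returns True when the output is acceptable.
    """
    failures = []
    for raw, check in cases:
        try:
            output = behavior(raw)
        except Exception as exc:  # a crash is a finding, not a reason to stop
            failures.append((raw, f"raised {type(exc).__name__}: {exc}"))
            continue
        if not check(output):
            failures.append((raw, f"unacceptable output: {output!r}"))
    return failures


# Hypothetical behavior under test: pull a dollar amount out of a receipt line.
def extract_amount(line):
    return float(line.split("$")[1])


# Invented inputs; a real run would load these from collected data.
cases = [
    ("Coffee $4.50", lambda out: out == 4.50),
    ("Lunch $12", lambda out: out == 12.0),
    ("Refund -$3.00", lambda out: out == -3.00),  # sign sits before the "$"
    ("Total: 9.99", lambda out: out == 9.99),     # no "$" at all
    ("Taxi $18.20 tip incl.", lambda out: out == 18.20),
]

failures = run_against_inputs(extract_amount, cases)
for raw, reason in failures:
    print(f"FAIL {raw!r}: {reason}")
print(f"{len(failures)} of {len(cases)} inputs failed")
```

The harness catches exceptions instead of stopping at the first one because the point is the full list of failures, not a passing run.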
The shared property is not fidelity. The shared property is pressure. The prototype puts the behavior under conditions where it can surprise the team.
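Another of the modest forms named above, a state machine that exposes every transition, can be sketched as a plain lookup table. The states and events here are illustrative assumptions for a system that acts on the user's behalf, not a prescribed model. The useful part is the second function, which lists every state and event pairing nobody has designed an answer for yet.

```python
# Illustrative states and events; a real product would name its own.
TRANSITIONS = {
    ("idle", "trigger_fired"): "proposing",
    ("proposing", "user_confirms"): "acting",
    ("proposing", "user_dismisses"): "idle",
    ("acting", "succeeded"): "done",
    ("acting", "user_stops"): "recovering",
    ("acting", "failed"): "recovering",
    ("recovering", "undo_complete"): "idle",
}


def step(state, event):
    """Return the next state, or raise if the pair was never designed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"no transition designed for {event!r} in state {state!r}"
        ) from None


def undesigned_pairs():
    """List every (state, event) pair the table does not answer."""
    states = {s for s, _ in TRANSITIONS} | set(TRANSITIONS.values())
    events = {e for _, e in TRANSITIONS}
    return sorted(
        (s, e) for s in states for e in events if (s, e) not in TRANSITIONS
    )


# Print the whole table so every transition is visible before any screen exists.
for (state, event), target in TRANSITIONS.items():
    print(f"{state} --{event}--> {target}")
print(f"{len(undesigned_pairs())} state/event pairs have no designed answer")
```

Many of the undesigned pairs will turn out to be impossible by construction. The ones that are possible, such as a trigger firing while the system is still recovering, are candidates for the next round of prototyping.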
Picking what to prototype
Most teams prototype the screen with the highest visual ambition: the hero, the empty state, the marketing splash. Those screens are also the ones with the most opinions attached, the most stakeholders watching, and the most muscle memory behind them.
The right thing to prototype is the moment the team is most uncertain about. Not the loudest moment, not the prettiest moment — the moment nobody can answer for. That might be the silent inference the model makes the first time it sees a user, the overnight action nobody watches, or the recovery path no one on the team has walked end to end.
The signal for picking is the moment the team flinches when asked to demo it. The team that can demo the hero screen without preparation but stalls when asked to demo what happens when the model returns nothing is showing where the prototype is owed. The flinch is the indicator. Prototype the flinch.
There is a second signal: the moment a stakeholder asks "what does it do when..." and the team's answer is shaped like an opinion instead of a behavior. "It probably handles that" and "the engineer will figure it out" are both signals that no prototype has been built to answer the question. The product call that sits on top of "probably" is the call the prototype is supposed to retire.
What makes this hard is that the unprototyped moments are also the unphotographed moments. They are not what the marketing site will show. They will not be in the case study. The team that prototypes the unphotographed moments first is paying the cost of seriousness up front; the team that prototypes the photographed moments first is producing a sales asset and calling it a prototype.
There is a useful distinction here. A demo tries to persuade. An experiment tries to measure. A behavior prototype tries to make a product decision answerable before the system is hardened around the wrong answer.
Sometimes one artifact can do more than one job. But if persuasion is the only job it does, it is not the prototype this chapter is asking for.
Prototyping as a thinking medium, not a sales medium
The prototype that exists to convince a stakeholder is rarely the one that teaches the team. Convincing prototypes are built backwards from the desired conclusion: the team knows what it wants the meeting to decide, and the prototype is shaped to produce that decision. In that case, whatever the prototype tests is something the team already believed. Nothing new arrives.
The prototype that teaches the team is built forward from the team's own uncertainty. It picks a question the team cannot answer with the current artifacts and produces an artifact that answers it — sometimes against the team's prior expectation. The prototype that shows that the inference is worse than the team thought, that the action the system takes on the user's behalf surprises the user the team had in mind, that the recovery path is longer than the slide promised — that prototype changes the product. The convincing prototype confirms the slide.
The discipline is to build the prototype that would change your mind, not the stakeholder's. The team that always builds the convincing prototype loses the muscle to learn from the work; the team that always builds the teaching prototype builds a product that compounds on what it has learned.
There is a small political cost to prototyping forward instead of backward. The teaching prototype sometimes shows the room something the room did not want to know. The team that ships those prototypes earns a reputation for honesty and pays a short-term price for the truths the prototypes reveal. Both sides of that ledger are real. Over a couple of projects, the honesty wins; over a single meeting, the convincing prototype wins. The choice between them is a taste call about which timescale the team is optimizing for.
Behavior Prototype Checklist
What a prototype should test before it is styled: assumption, trigger, system response, visibility, user control, failure case, recovery path.
Assumption. What behavioral belief is the prototype trying to prove or disprove?
Trigger. What starts the behavior, and who or what initiates it?
System response. What does the system do in response, over time or immediately?
Visibility. What does the user need to see about state, reasoning, confidence, or consequence?
User control. Where can the user steer, pause, override, or refuse the behavior?
Failure case. What happens when the system is wrong, late, uncertain, or incomplete?
Recovery path. How does the user repair the situation without losing trust or context?
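The seven questions above can also be kept as a small structured brief, so it is obvious which ones still have no answer. The sketch below is one hypothetical way to do that. The key names, the `open_questions` helper, and the sample brief are all assumptions, and the hedge detection is a deliberately crude string match on the two phrasings this chapter calls out ("probably" and "figure it out").

```python
CHECKLIST = [
    "assumption",
    "trigger",
    "system_response",
    "visibility",
    "user_control",
    "failure_case",
    "recovery_path",
]

# Phrasings that signal an opinion rather than an observed behavior.
HEDGES = ("probably", "figure it out")


def open_questions(brief):
    """Return checklist items whose answer is missing or reads like a guess."""
    gaps = []
    for item in CHECKLIST:
        answer = brief.get(item, "").strip().lower()
        if not answer:
            gaps.append((item, "no answer"))
        elif any(h in answer for h in HEDGES):
            gaps.append((item, "answer is a guess, not a behavior"))
    return gaps


# Invented example brief for a memory-surfacing feature.
brief = {
    "assumption": "Users welcome one resurfaced memory per day.",
    "trigger": "Daily schedule, not a user click.",
    "system_response": "Proposes one cluster from the previous year.",
    "visibility": "Shows which photos were considered and excluded.",
    "user_control": "Dismiss, hide album, mark a face as never include.",
    "failure_case": "It probably handles bad timestamps.",
    # "recovery_path" has not been written yet.
}

for item, problem in open_questions(brief):
    print(f"{item}: {problem}")
```

Whatever the helper reports is a shortlist of what to prototype next, not a verdict on the brief.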
Prototype brief for the flinch moment you named
Commit the artifact this chapter produced. The portfolio strip in Chapter 11 reads back what you have written here.
Build Evidence of a New Identity
Once the prototype starts testing behavior instead of pictures, the work itself begins to demonstrate the larger role you are stepping into — and the move that follows is to make that role legible to others as evidence.