Structured design decision process

What this topic is for

Design problems become tractable when you give them structure: enumerate the decisions you're making, generate a minimal set of good options for each, and evaluate candidate designs against believable scenarios. Without that structure, design work feels like creative guessing: every direction is equally valid, nothing gets ruled out, and convergence depends on someone's taste rather than evidence. This topic defines the vocabulary that makes the process runnable and documents the process itself.

The design space

A design is a collection of choices that attempt to achieve a set of design goals. A complete design is a full set of those choices, with no open questions remaining. Those choices are organized into decisions (the dimensions the design must settle) and options (the candidate answers for each decision). From decisions and options, you can enumerate the design space: the set of all possible designs that could be produced by choosing one option per decision.

No real design process can consider every design in the space. The size of the space is the product of the option counts, so even five decisions with two or three options each already yield dozens of designs. The practical move is to identify what you believe to be the most important decisions and options, and then work through those systematically to determine which combination best achieves the stated design goals. The rest of the process, from constraining the space through assembling candidates to evaluating them against scenarios, serves that narrowing.
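To make the growth concrete, here is a minimal sketch that enumerates a design space as the Cartesian product of option sets. The decisions and options are borrowed from the worked example later in this topic; the dictionary layout and the function name `enumerate_designs` are assumptions made for illustration, not part of any library.

```python
from itertools import product
from math import prod

# Decisions and their options, borrowed from the worked example in this topic.
decisions = {
    "tokenization": ["whitespace", "case-boundary"],
    "multi_term": ["AND", "OR"],
    "match_type": ["exact", "prefix", "fuzzy"],
    "granularity": ["page", "heading"],
    "field_weighting": ["uniform", "title-heavy"],
}


def enumerate_designs(decisions):
    """Yield every complete design: one option chosen per decision."""
    names = list(decisions)
    for combo in product(*(decisions[name] for name in names)):
        yield dict(zip(names, combo))


designs = list(enumerate_designs(decisions))

# The size of the space is the product of the option counts.
space_size = prod(len(options) for options in decisions.values())

print(len(designs), space_size)  # 48 48
```

Five small decisions already produce 48 complete designs, which is why the process narrows the space before evaluating anything.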

The key discipline is keeping the set of decisions minimal and limiting each option set to good choices. Every decision should be load-bearing and every option a genuine candidate. Admitting unnecessary decisions or implausible options makes the space too large to explore meaningfully.

Interdependence and constraints

Some decisions depend on or affect others. Choosing an option in one decision can nullify or constrain another decision, a relationship called a constraint between decisions.

For example: if a search feature's tokenization decision is resolved by choosing whitespace splitting, the stemming decision (apply Porter stemming, apply Snowball stemming, skip stemming) may become moot. Whitespace splitting leaves a camelCase identifier such as parseHttpResponse as one opaque token that a natural-language stemmer cannot usefully reduce, and the relevant vocabulary is technical rather than natural-language prose, so the stemming decision collapses. That's a constraint: one choice eliminated an entire downstream decision.

Tracking constraints as you populate the design space keeps it from growing beyond what can actually be explored. When you notice a coupling, make it explicit: "if we choose X in decision A, decision B disappears," or "options Y and Z in decision B require option W in decision A."
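One way to make such a coupling explicit is to encode it as a function over designs. The sketch below assumes the constraint from the example above (whitespace tokenization nullifies the stemming decision); the representation and the name `apply_constraints` are hypothetical.

```python
from itertools import product

decisions = {
    "tokenization": ["whitespace", "case-boundary"],
    "stemming": ["porter", "snowball", "none"],
}


def apply_constraints(design):
    """Return the design with nullified decisions removed.

    Assumed constraint: choosing whitespace tokenization makes the
    stemming decision moot, so it is dropped from the design.
    """
    if design["tokenization"] == "whitespace":
        return {k: v for k, v in design.items() if k != "stemming"}
    return design


names = list(decisions)
raw = [dict(zip(names, combo)) for combo in product(*decisions.values())]

# Designs that differ only in a nullified decision are the same design,
# so deduplicate after applying the constraint.
constrained = {tuple(sorted(apply_constraints(d).items())) for d in raw}

print(len(raw), len(constrained))  # 6 4
```

The three whitespace variants collapse into one design, so the constraint shrinks this small space from six designs to four.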

Fractal granularity

Any decision can be decomposed into several finer-grained decisions when more resolution is wanted. Granularity is a choice: you stop at a level of detail that's actually useful for the current problem. A single decision labeled "matching strategy" might decompose into "AND vs. OR for multi-term queries," "exact vs. fuzzy vs. prefix matching," and "handling of stop words" when the team needs that resolution. Splitting is appropriate when the component decisions are genuinely independent or when coupling between them is an insight worth capturing. Lumping is appropriate when the boundaries don't matter for the evaluation you're running.
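The cost of extra resolution can be estimated before committing to it. This sketch replaces a lumped "matching strategy" decision with the three sub-decisions named above and compares the size of the resulting spaces. The lumped option names and the stop-word options are assumptions for illustration.

```python
from math import prod


def space_size(decisions):
    """Number of complete designs: the product of the option counts."""
    return prod(len(options) for options in decisions.values())


def split(decisions, name, sub_decisions):
    """Replace one coarse decision with its finer-grained sub-decisions."""
    refined = {k: v for k, v in decisions.items() if k != name}
    refined.update(sub_decisions)
    return refined


coarse = {
    "tokenization": ["whitespace", "case-boundary"],
    "matching_strategy": ["strict", "forgiving"],  # hypothetical lumped options
}

fine = split(coarse, "matching_strategy", {
    "multi_term": ["AND", "OR"],
    "match_type": ["exact", "fuzzy", "prefix"],
    "stop_words": ["keep", "drop"],  # assumed options
})

print(space_size(coarse), space_size(fine))  # 4 24
```

Splitting one decision grows this space sixfold, which is only worth paying when the evaluation can actually tell the finer designs apart.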

The goal is a minimal set of decisions, enough to generate meaningfully different candidate designs, no more.

Goals

"Optimal" is only defined relative to goals. There is no goal-free notion of best design. Before anything else, make the goals explicit: what does the design have to accomplish, and for whom? Goals don't have to be formal; they just have to be stated. "Users with intermittent connectivity should be able to search reliably" is a goal. A vaguer goal like "fast" is still a goal, but it needs enough specificity to guide evaluation: fast for what task, measured how, under what conditions?

Evaluation

With goals established, evaluation is how you find the candidate design that best achieves them.

Scenarios and evaluation criteria

You assess a design by running it through scenarios (believable situational stories that exercise the design in realistic conditions) and judging it against evaluation criteria: the dimensions of quality the goals imply. Criteria can be latency, recall quality, ease of authorship, degradation under edge inputs, or anything else that follows from what the design must accomplish.

Running a design through a scenario and judging it against criteria produces an assessment: a score, a ranking, or a qualitative judgment. Assessments don't have to be quantitative. When a human judges, saying "this result feels right" or "this response feels off," the criteria can stay qualitative. The important thing is that the assessment is made against the goals and criteria, not by fiat.

Use cases, scenarios, and test cases

Use cases, scenarios, and test cases are three ways to describe how a design gets exercised. Each serves a different purpose, and using the wrong term blurs the distinction that matters most.

A use case is an interaction between an actor and the design, aimed at achieving a single goal. It's the smallest unit of intent that produces a meaningful result for the actor. Use cases stay categorical: they name the goal and the steps without pinning down specific details. "Find a documentation page by searching for a code identifier" is a use case.

A scenario is a believable situational story: a specific actor with a goal, motivation, and context, playing out over time. It's broader than a use case and necessarily concrete, since it only works if the details are plausible. One scenario can encompass several use cases as the story unfolds. "A developer driving home half-remembers the name of an API method, dictates a rough query, hears two results read aloud, refines once, and finds the right page" is a scenario. This usage follows the UX definition of scenario, not the UML one.

A test case is a specific input, plus the conditions and steps to run it, paired with an expected result, used to check the design's behavior. You build one from a use case or a scenario by fixing the particulars: under these conditions, given this input, the design must produce this result. Test cases are exact and repeatable.

The load-bearing distinction: a test case asserts a fixed correct answer; a scenario is a situation you judge. When you're judging results qualitatively rather than checking against predetermined outputs, scenario is the honest unit. A test case answers "did this pass?"; a scenario answers "does this feel right?"

A project typically uses all three at different stages: use cases to name the units of intent, scenarios to evaluate candidate designs in believable settings, and test cases to lock in chosen behavior against regression.

<use_case>Find a documentation page by searching for a code identifier.</use_case>

<scenario>
A developer is driving and remembers there's a page about HTTP responses
but not its exact title. Hands-free, they dictate a rough query, skim the
results read aloud, refine once when the first hit is wrong, and open the
right page. Over this one scenario they search, scan results, and refine,
spanning several use cases in a single story.
</scenario>

<test_case>
Given the local search index, query "httpres" returns the HTTPResponse
page within the top 3 results.
</test_case>
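Because a test case is exact and repeatable, it translates directly into an executable assertion. The sketch below uses a toy in-memory title list in place of the real search index; the page titles and the `search` function are hypothetical stand-ins, not the project's actual search API.

```python
# Hypothetical page titles standing in for the local search index.
PAGES = ["HTTPRequest", "HTTPResponse", "Routing", "Response headers"]


def search(query, pages=PAGES):
    """Toy search: case-insensitive prefix match on the title, in index order."""
    q = query.lower()
    return [title for title in pages if title.lower().startswith(q)]


def test_httpres_finds_httpresponse_in_top_3():
    # Fixed input, fixed expected result: the test either passes or fails.
    results = search("httpres")
    assert "HTTPResponse" in results[:3]


test_httpres_finds_httpresponse_in_top_3()
```

The scenario has no equivalent assertion: whether the refined results "feel right" to the driver is a judgment, not a fixed output.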

Running the process

This is the procedure an agent and human actually follow. The shared language established above makes communication between agent and human efficient.

1. Enumerate the design decisions at the right granularity. List the choices the design must make. For each, write down a small set of good options. Aim for a minimal set: every decision should be load-bearing, and every option should be a real candidate. Drop decisions that don't affect the evaluation and drop options that no one would actually pick.

2. Note constraints and couplings between decisions. As you populate the space, mark which options in one decision eliminate or constrain options in another. This trims the actual candidate set and surfaces assumptions worth making explicit.

3. Assemble candidate designs. From the remaining decisions and options, compose a manageable set of candidates. You don't need one for every combination; you need one for each meaningfully different point in the space. Two candidates that differ only in a decision that a constraint has nullified are equivalent and don't need separate rows.

4. Exercise the candidates against scenarios and test cases. Run each candidate through the scenarios that matter for the goals. For each, produce an assessment: a score, a ranking, or a qualitative judgment. Test cases work the same way but check against a fixed expected result rather than a judgment.

5. Judge the results and iterate. A person usually makes the call: "this candidate performs best under the scenarios that matter most." Evaluation can also be automated when the criteria are quantifiable. Either way, the judgment is made against the goals and criteria, not preference alone. If no candidate is clearly better, revise the design space by adding a missing option, splitting a decision that turned out to matter, or relaxing a constraint that was too aggressive, then run again.
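The five steps can be sketched as a single pass of a loop. Everything here is an assumption made for illustration: the function `run_pass`, the hard constraint that rules out OR with fuzzy matching, and the made-up scores in `judge`, which stand in for a human judgment. Because the space is tiny, the sketch keeps every allowed combination as a candidate rather than hand-picking a subset.

```python
from itertools import product


def run_pass(decisions, is_allowed, scenarios, judge):
    """One pass over steps 1-5; returns (ranked, clear_winner)."""
    names = list(decisions)
    # Steps 1-3: enumerate combinations and keep those the constraints allow.
    candidates = []
    for combo in product(*decisions.values()):
        design = dict(zip(names, combo))
        if is_allowed(design):
            candidates.append(design)
    # Step 4: assess every candidate against every scenario.
    ranked = sorted(
        ((sum(judge(d, s) for s in scenarios), d) for d in candidates),
        key=lambda pair: pair[0],
        reverse=True,
    )
    # Step 5: a tie at the top signals that the design space needs revising.
    clear_winner = len(ranked) == 1 or (
        len(ranked) > 1 and ranked[0][0] > ranked[1][0]
    )
    return ranked, clear_winner


decisions = {"multi_term": ["AND", "OR"], "match_type": ["prefix", "fuzzy"]}
scenarios = ["half-remembered identifier", "short query"]


def is_allowed(design):
    # Hypothetical hard constraint: rule out OR combined with fuzzy matching.
    return not (design["multi_term"] == "OR" and design["match_type"] == "fuzzy")


def judge(design, scenario):
    # Made-up scores standing in for a person's assessment.
    score = 0
    if design["match_type"] == "prefix":
        score += 1
    if scenario == "short query" and design["multi_term"] == "AND":
        score += 1
    return score


ranked, clear_winner = run_pass(decisions, is_allowed, scenarios, judge)
best_score, best = ranked[0]
print(best, best_score, clear_winner)
```

When `clear_winner` is false, the next move is the one step 5 describes: revise the decisions, options, or constraints and run the pass again.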

Worked example: search-engine design

A team is choosing how to implement local search in a VitePress site using MiniSearch. There are two goals: users should be able to find pages reliably when they remember part of a concept or identifier, and users on mobile with intermittent connectivity should get good results without network round-trips.

Design decisions and options:

| Decision | Options |
| --- | --- |
| Tokenization | Whitespace splitting · Case-boundary splitting (camelCase → tokens) |
| Multi-term matching | AND (all terms must match) · OR (any term matches) |
| Match type | Exact · Prefix · Fuzzy |
| Result granularity | Page-level · Heading-level |
| Field weighting | Uniform · Title-heavy |

Constraints identified: Case-boundary tokenization makes stemming unnecessary for a technical-vocabulary corpus, so the stemming decision collapses and is left out of the table. AND matching pairs well with prefix matching, while OR matching combined with fuzzy matching tends to produce noisy results for short queries: a soft coupling between those two decisions.

Candidate designs (a subset):

  • A: Whitespace · AND · Prefix · Page-level · Title-heavy
  • B: Case-boundary · OR · Fuzzy · Page-level · Title-heavy
  • C: Case-boundary · AND · Prefix · Heading-level · Title-heavy

Scenarios:

In the car, half-remembered. A developer is driving and remembers there's a page about a camelCase function, something like parseHttpResponse, but can't recall the exact name. They dictate the part they remember: "http response". They hear the top two results read aloud, and the right page is the first result.

Low connectivity, short query. A developer on an intermittent mobile signal types "rout" and expects to find pages about routing. The index is local, so connectivity isn't the constraint; precision is.

Assessment: Candidate A fails the first scenario: whitespace tokenization leaves parseHttpResponse as a single token, and neither "http" nor "response" is a prefix of that token, so the AND + prefix query doesn't match the identifier. Candidate B handles camelCase correctly, but OR + fuzzy matching returns noise for the short "rout" query. Candidate C performs well on both: case-boundary splitting yields the tokens parse, http, and response, so both query terms match; AND + prefix is precise on the short query; and heading-level granularity surfaces the exact section rather than the whole page. The team picks C and locks in a test case:

<test_case>
Given the local search index, query "http response" returns the
parseHttpResponse page within the top 3 results.
</test_case>
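The tokenization difference between candidates A and C can be checked mechanically. This is a library-agnostic sketch, not MiniSearch code: both tokenizers, the camelCase regular expression, and `matches_and_prefix` are assumptions written for illustration. It also assumes the dictated query arrives as separate lowercase words, "http response".

```python
import re


def whitespace_tokens(text):
    """Split on whitespace and lowercase each token."""
    return [t.lower() for t in text.split()]


def case_boundary_tokens(text):
    """Split on whitespace, then at camelCase boundaries; lowercase each part."""
    tokens = []
    for word in text.split():
        # Alternatives: an uppercase run that precedes a capitalized word
        # (HTTPResponse -> HTTP, Response), a capitalized or lowercase word,
        # a remaining uppercase run, or a run of digits.
        parts = re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+", word)
        tokens.extend(p.lower() for p in parts)
    return tokens


def matches_and_prefix(query, index_tokens):
    """True if every query term is a prefix of at least one indexed token."""
    terms = query.lower().split()
    return all(any(tok.startswith(term) for tok in index_tokens) for term in terms)


title = "parseHttpResponse"
query = "http response"

print(whitespace_tokens(title))     # ['parsehttpresponse']
print(case_boundary_tokens(title))  # ['parse', 'http', 'response']
print(matches_and_prefix(query, whitespace_tokens(title)))     # False
print(matches_and_prefix(query, case_boundary_tokens(title)))  # True
```

A real index holds many more tokens per page, so this check isolates only the effect of tokenization on the identifier itself.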