Usability testing is the empirical evaluation of a product by observing real users attempting real tasks with the actual system. It is the definitive test of usability — the method that grounds design decisions in evidence about actual human performance rather than expert judgement alone.
Three main types:
- Formative testing — conducted during design to identify problems and inform iteration. Qualitative, small samples (5–8), think-aloud protocols.
- Summative testing — evaluates finished designs against defined criteria. Quantitative, larger samples (20+), performance metrics.
- Comparative testing — compares two or more designs using the same tasks and metrics.
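As an illustration of the kind of performance metric a summative test might report, the sketch below computes a task completion rate with a confidence interval. The choice of metric, the adjusted Wald (Agresti-Coull) interval, and the function name `completion_rate_interval` are assumptions made for this example, not something the definition above prescribes.

```python
from math import sqrt
from statistics import NormalDist


def completion_rate_interval(successes: int, participants: int,
                             confidence: float = 0.95) -> tuple[float, float]:
    """Adjusted Wald (Agresti-Coull) interval for a task completion rate.

    Hypothetical helper for illustration: `successes` is the number of
    participants who completed the task, `participants` the sample size.
    """
    if participants <= 0 or not 0 <= successes <= participants:
        raise ValueError("need 0 <= successes <= participants and participants > 0")
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Add z^2/2 successes and z^2/2 failures before computing the proportion.
    n_adj = participants + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to the valid range for a proportion.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


if __name__ == "__main__":
    low, high = completion_rate_interval(successes=18, participants=20)
    print(f"Completion rate 18/20, 95% interval: {low:.0%} to {high:.0%}")
```

With 18 of 20 participants succeeding, the interval runs from roughly 69% to 98%. An interval this wide even at 20 participants suggests why summative testing calls for larger samples than formative testing.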
The core procedure: recruit representative users, give them realistic tasks, observe them silently (aside from neutral probing), and record what happens. The think-aloud protocol asks participants to verbalise their thoughts, revealing not just what they do but why.
Facilitator principles:
- Do not help — struggle is the data
- Do not lead — questions should be open, not directive
- Remain neutral — body language communicates evaluative feedback
- Probe, don't direct — "What were you expecting to happen?" rather than "That's wrong"
Nielsen and Landauer (1993) modelled problem discovery and found that, at the average per-participant detection rate in their data (about 31%), 5 participants uncover roughly 85% of the usability problems in a design. This is the basis for the "five users is enough" rule in formative testing. It does not mean 5 users find all problems; it means iterating (test with 5, fix, test again) is more efficient than testing with many at once.
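The arithmetic behind the five-user figure can be sketched with the discovery model usually attributed to Nielsen and Landauer, in which the expected share of problems found by n participants is 1 - (1 - L)^n. The function name `proportion_found` is hypothetical, and the default L = 0.31 is the average detection rate reported for their data; real projects can have rates well above or below it.

```python
def proportion_found(n_users: int, discovery_rate: float = 0.31) -> float:
    """Expected share of problems observed at least once after n_users sessions.

    Assumes every problem is equally likely to be hit by each participant
    and that participants are independent, which is a simplification.
    """
    if n_users < 0:
        raise ValueError("n_users must be non-negative")
    if not 0.0 <= discovery_rate <= 1.0:
        raise ValueError("discovery_rate must be between 0 and 1")
    # Probability that a given problem is missed by all n users is (1 - L)^n.
    return 1.0 - (1.0 - discovery_rate) ** n_users


if __name__ == "__main__":
    for n in (1, 3, 5, 10, 15):
        print(f"{n:2d} users: {proportion_found(n):.0%} of problems expected")
```

Under these assumptions the curve flattens quickly: the first five participants account for about 84% of the problems, while participants six to ten add only about 13 percentage points. That diminishing return is the argument for spending additional participants on a second round after fixes rather than on a larger first round.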
Usability testing reveals what expert review cannot: the actual confusion, workarounds, and mental models of real users.
Related terms: Think-Aloud Protocol, Heuristic Evaluation, System Usability Scale, Five-User Rule
Discussed in:
- Chapter 15: Usability Testing — Remote Usability Testing
Also defined in: Textbook of Usability