Usability Testing | Glossary | Textbook of Usability

Usability testing is the empirical evaluation of a product by observing real users attempting real tasks with the system itself or, during design, a prototype of it. It is the definitive test of usability: the method that grounds design decisions in evidence about actual human performance rather than expert judgement alone.

Three main types:

Formative testing: conducted during design to identify problems and inform iteration. Qualitative, small samples (5–8), think-aloud protocols.
Summative testing: evaluates finished designs against defined criteria. Quantitative, larger samples (20+), performance metrics.
Comparative testing: compares two or more designs using the same tasks and metrics.

The core procedure: recruit representative users, give them realistic tasks, observe them silently (aside from neutral probing), and record what happens. The think-aloud protocol asks participants to verbalise their thoughts, revealing not just what they do but why.

Facilitator principles:

Do not help: struggle is the data
Do not lead: questions should be open, not directive
Remain neutral: body language communicates evaluative feedback
Probe, don't direct: "What were you expecting to happen?" rather than "That's wrong"

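Nielsen and Landauer (1993) modelled problem discovery as a function of the number of participants. With the average per-participant detection rate they reported (about 31%), 5 participants are expected to uncover roughly 85% of the usability problems in a design. This is the basis for the "five users is enough" rule. The rule does not mean 5 users find all problems, and the detection rate varies from project to project; it means iterating (test with 5, fix, test again) is more efficient than testing with many at once.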
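
The arithmetic behind the 85% figure can be sketched in a few lines. The sketch below assumes the Nielsen and Landauer model, in which the expected share of problems found by n participants is 1 - (1 - L)^n, and uses L = 0.31 as the per-participant detection rate. The function name and the default value are illustrative choices, not part of any published tool, and the real rate for a given product may be considerably lower or higher.

```python
def proportion_found(n_users: int, detect_rate: float = 0.31) -> float:
    """Expected share of usability problems found by n_users.

    Assumes the model 1 - (1 - L)^n, where L (detect_rate) is the
    probability that a single participant reveals a given problem.
    The default of 0.31 is an assumed average; measure your own if you can.
    """
    if n_users < 0:
        raise ValueError("n_users must be non-negative")
    if not 0.0 <= detect_rate <= 1.0:
        raise ValueError("detect_rate must be between 0 and 1")
    return 1.0 - (1.0 - detect_rate) ** n_users


if __name__ == "__main__":
    # Show the diminishing returns of adding participants to one round.
    for n in (1, 3, 5, 10, 15):
        print(f"{n:>2} users: {proportion_found(n):.1%} of problems expected")
```

Under these assumptions, five participants give about 0.84, and each additional participant in the same round adds less than the one before. That diminishing return is why several small rounds with fixes in between tend to be a better use of a fixed participant budget than one large round.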

Usability testing reveals what expert review cannot: the actual confusion, workarounds, and mental models of real users.

Discussed in:

Chapter 15: Usability Testing (Remote Usability Testing)

Also defined in: Textbook of Usability