Glossary

Usability Testing

Usability testing is the empirical evaluation of a product by observing real users attempting real tasks with the actual system. It is the definitive test of usability — the method that grounds design decisions in evidence about actual human performance rather than expert judgement alone.

Three main types:

  • Formative testing — conducted during design to identify problems and inform iteration. Qualitative, small samples (5–8), think-aloud protocols.
  • Summative testing — evaluates finished designs against defined criteria. Quantitative, larger samples (20+), performance metrics.
  • Comparative testing — compares two or more designs using the same tasks and metrics.

The core procedure: recruit representative users, give them realistic tasks, observe them silently (aside from neutral probing), and record what happens. The think-aloud protocol asks participants to verbalise their thoughts, revealing not just what they do but why.

Facilitator principles:

  • Do not help — struggle is the data
  • Do not lead — questions should be open, not directive
  • Remain neutral — body language communicates evaluative feedback
  • Probe, don't direct — "What were you expecting to happen?" rather than "That's wrong"

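Nielsen and Landauer (1993) modelled problem discovery as a function of the number of participants and estimated that 5 participants uncover roughly 85% of usability problems in formative testing, which is the basis for the "five users is enough" rule. The figure rests on an average per-participant detection rate of about 31%, so it shifts with the product, the tasks, and the user population. It does not mean 5 users find all problems; it means that iterating (test with 5, fix, test again) is more efficient than testing with many users at once.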
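
The arithmetic behind the five-user figure can be sketched as follows. This is a minimal illustration, assuming the commonly cited form of the Nielsen and Landauer model, in which the expected share of problems found by n participants is 1 - (1 - L)^n and L is the average chance that one participant exposes a given problem. The default L = 0.31 is the average reported for that model and is an assumption for any particular product; the function names are hypothetical.

```python
def proportion_found(n_users: int, detect_rate: float = 0.31) -> float:
    """Expected share of problems seen at least once by n_users participants.

    detect_rate is L, the assumed average probability that a single
    participant exposes any given problem (0.31 is an assumed default).
    """
    if n_users < 0:
        raise ValueError("n_users must be non-negative")
    if not 0.0 <= detect_rate <= 1.0:
        raise ValueError("detect_rate must be between 0 and 1")
    return 1.0 - (1.0 - detect_rate) ** n_users


def marginal_gain(n_users: int, detect_rate: float = 0.31) -> float:
    """Extra share of problems contributed by the n-th participant."""
    if n_users < 1:
        raise ValueError("n_users must be at least 1")
    return (proportion_found(n_users, detect_rate)
            - proportion_found(n_users - 1, detect_rate))


if __name__ == "__main__":
    for n in (1, 3, 5, 10, 15):
        print(f"{n:>2} users: {proportion_found(n):.2f} found, "
              f"last user added {marginal_gain(n):.3f}")
```

Under these assumptions, five participants come out at about 0.84, and the sixth adds less than 0.05. That diminishing return is why the budget for further sessions is usually better spent on a second round after fixes than on a larger single round. A lower detection rate flattens the curve and calls for more participants per round.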

Usability testing reveals what expert review cannot: the actual confusion, workarounds, and mental models of real users.

Related terms: Think-Aloud Protocol, Heuristic Evaluation, System Usability Scale, Five-User Rule

Also defined in: Textbook of Usability