Glossary

Usability Testing

Usability testing is the empirical evaluation of a product by observing real users attempting real tasks with the actual system or a working prototype of it. It is the definitive test of usability: the method that grounds design decisions in evidence about actual human performance rather than expert judgement alone.

Three main types:

  • Formative testing: conducted during design to identify problems and inform iteration. Qualitative, small samples (5–8), think-aloud protocols.
  • Summative testing: evaluates finished designs against defined criteria. Quantitative, larger samples (20+), performance metrics.
  • Comparative testing: compares two or more designs using the same tasks and metrics.

The core procedure: recruit representative users, give them realistic tasks, observe them silently (aside from neutral probing), and record what happens. The think-aloud protocol asks participants to verbalise their thoughts, revealing not just what they do but why.

Facilitator principles:

  • Do not help: struggle is the data
  • Do not lead: questions should be open, not directive
  • Remain neutral: body language communicates evaluative feedback
  • Probe, don't direct: "What were you expecting to happen?" rather than "That's wrong"

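Nielsen and Landauer (1993) modelled the share of usability problems found by n participants as 1 - (1 - L)^n, where L is the proportion of problems a single participant reveals. With the average L of about 31% across the projects they studied, 5 participants uncover roughly 85% of the problems in formative testing, which is the basis for the "five users is enough" rule. This does not mean 5 users find all problems; it means iterating (test with 5, fix, test again) is more efficient than testing with many at once.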
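
The figure of roughly 85% for 5 participants can be reproduced with a short calculation. The sketch below assumes the discovery model associated with Nielsen and Landauer, found(n) = 1 - (1 - L)^n, and a per-participant detection rate L of 0.31, the average usually quoted from their data. The function name and the default value are illustrative choices, and the real detection rate varies by product and task set.

```python
def proportion_found(n_participants: int, detection_rate: float = 0.31) -> float:
    """Expected share of usability problems found by n participants.

    Assumes every participant independently reveals each problem with
    probability `detection_rate` (0.31 is an assumed average, not a constant).
    """
    if n_participants < 0:
        raise ValueError("n_participants must be non-negative")
    if not 0.0 <= detection_rate <= 1.0:
        raise ValueError("detection_rate must be between 0 and 1")
    return 1.0 - (1.0 - detection_rate) ** n_participants


if __name__ == "__main__":
    # Show how quickly the curve flattens as participants are added.
    for n in (1, 3, 5, 8, 15):
        print(f"{n:>2} participants: {proportion_found(n):.0%} of problems found")
```

Under these assumptions the curve flattens quickly after the first few participants, which is why several small rounds with fixes in between tend to be a better use of the same budget than one large round.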

Usability testing reveals what expert review cannot: the actual confusion, workarounds, and mental models of real users.

Related terms: Think-Aloud Protocol, Heuristic Evaluation, System Usability Scale, Five-User Rule

Also defined in: Textbook of Usability

This site is currently in Beta. Contact: Chris Paton
