Chapter Ten

Design Laws from Aviation and Engineering

Learning Objectives
  1. Describe the origins of human factors engineering in aviation
  2. Explain key cockpit design principles and their application to software interfaces
  3. Understand the role of checklists and standardised procedures in error reduction
  4. Apply error tolerance and fail-safe design principles to interactive systems
  5. Connect aviation safety methodologies to healthcare and software design

Introduction

If architecture offers design principles evolved over millennia, aviation offers principles forged in decades — but forged under extreme pressure. When aircraft designs fail, people die. This unforgiving selection pressure produced the discipline of human factors engineering: the systematic application of knowledge about human capabilities and limitations to the design of systems, equipment, and procedures. The principles developed in aviation — error tolerance, standardisation, crew resource management, and systematic failure analysis — have been adopted by healthcare, nuclear power, and other safety-critical industries. They offer powerful lessons for software design, where the consequences of poor usability are increasingly serious.

Origins of Human Factors Engineering

The World War II Crucible

Human factors engineering emerged as a discipline during World War II, when it became clear that many aircraft accidents were caused not by mechanical failure or pilot incompetence but by poor cockpit design (Chapanis, 1999). Identical-looking controls for landing gear and flaps led to gear-up landings. Altimeters with confusing three-pointer displays caused altitude misreadings. Instrument layouts that varied between aircraft types caused experienced pilots to make errors when switching aircraft.

Example

Alphonse Chapanis, one of the founders of human factors engineering, investigated a series of gear-up landings in the US Army Air Forces during WWII (Chapanis, 1999). He found that pilots were retracting the landing gear instead of the flaps because the two controls were identical toggle switches located side by side. His solution — shape-coding the controls (a wheel-shaped knob for landing gear, a flap-shaped knob for flaps) — eliminated the confusion. This was one of the first documented applications of human factors engineering: changing the design to match human capabilities rather than training humans to overcome poor design.

The lesson was profound: well-trained, motivated, experienced professionals make predictable errors when the design of their tools conflicts with human perceptual and cognitive capabilities (Sarter, 1995; Bainbridge, 1983). The solution is not more training but better design.

From Military to Civil Aviation

After the war, human factors principles were systematically integrated into civil aviation regulation. The Federal Aviation Administration (FAA) established human factors requirements for cockpit design. The International Civil Aviation Organization (ICAO) developed human factors training requirements. Accident investigation bodies (the NTSB in the US, the AAIB in the UK) routinely analyse human factors in accident reports.

Cockpit Design Principles

The T-Arrangement

The basic "T" arrangement of flight instruments — airspeed on the left, attitude in the centre, altitude on the right, heading below the attitude indicator — was standardised in the 1950s. This consistent layout means that a pilot trained on one aircraft can read the instruments of any aircraft without relearning the layout.

Design Law

Standardisation of control and display layouts across systems reduces errors and training time. In aviation, the T-arrangement ensures that instrument scan patterns learned on one aircraft transfer to another. In software, consistent placement of navigation, primary actions, and status indicators across applications serves the same function. Users should not have to relearn basic interaction patterns for each new application.

The Dark Cockpit Principle

Modern cockpits use a "dark cockpit" philosophy: when everything is normal, no lights or alerts are visible (Wickens et al., 2015). Alerts appear only when something requires attention. This reduces visual clutter and ensures that any visible alert is genuinely meaningful. The dark cockpit principle is directly applicable to dashboard and monitoring interface design. Systems that display persistent green "OK" indicators for every normal parameter create visual noise that makes it harder to detect the single red indicator that matters. The alternative — showing nothing when normal and alerting only on exceptions — respects human attention limitations.
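
As a sketch, a dark-cockpit status report for a monitoring dashboard surfaces only the parameters that are out of limits; the parameter names and thresholds below are invented for illustration.

```python
def dark_cockpit_report(readings: dict, limits: dict) -> list:
    """Return alert lines only for parameters outside their limits.

    Normal parameters produce no output at all: an empty report means
    everything is fine, so any line that does appear is meaningful.
    """
    return [
        f"ALERT: {name} = {value} (limit {limits[name]})"
        for name, value in readings.items()
        if value > limits[name]
    ]

readings = {"cpu_load": 0.42, "error_rate": 0.09, "queue_depth": 12.0}
limits = {"cpu_load": 0.90, "error_rate": 0.05, "queue_depth": 100.0}

# Only error_rate exceeds its limit, so the report contains one line.
print(dark_cockpit_report(readings, limits))
```

The design choice is the inversion of the usual dashboard: silence is the signal that all is well, and attention is reserved for exceptions.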

Layered Alerting

Aviation alerting systems use a layered approach:

  • Advisory (lowest priority): blue or white, informational
  • Caution (medium priority): amber, requires awareness
  • Warning (highest priority): red, requires immediate action

Each layer has distinct visual (colour, size) and auditory (tone pattern, voice) characteristics (Patterson, 1982). The escalation from advisory to caution to warning follows a consistent pattern that pilots learn once and apply universally.

Key Principle

Effective alerting systems use a small number of clearly differentiated priority levels, each with distinct multimodal cues (visual + auditory). Too many priority levels create confusion about urgency. Too few fail to differentiate between situations that require different responses. Three to four levels — information, caution, warning, and optionally critical — is a well-validated range.
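
A minimal sketch of such a scheme in software might pair each priority level with a distinct visual and auditory cue; the colour and tone names here are illustrative, not drawn from any real avionics standard.

```python
from enum import IntEnum

class Priority(IntEnum):
    ADVISORY = 1   # informational, no action required
    CAUTION = 2    # requires awareness
    WARNING = 3    # requires immediate action

# Each level pairs a distinct visual cue with a distinct auditory cue,
# so the escalation pattern is learned once and applied everywhere.
CUES = {
    Priority.ADVISORY: {"colour": "white", "sound": None},
    Priority.CAUTION:  {"colour": "amber", "sound": "single_chime"},
    Priority.WARNING:  {"colour": "red",   "sound": "repeating_tone"},
}

def present(message: str, priority: Priority) -> dict:
    """Attach the multimodal cues for a priority level to a message."""
    return {"message": message, "priority": priority.name, **CUES[priority]}

print(present("Hydraulic pressure low", Priority.CAUTION))
```

Because the levels are an ordered enum, urgency comparisons (e.g. "suppress anything below CAUTION during high workload") fall out naturally.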

Error Tolerance and Fail-Safe Design

Reason's Swiss Cheese Model

James Reason's model of accident causation (Reason, 1990) describes safety as multiple layers of defence, each with imperfections ("holes"). An accident occurs only when the holes in multiple layers align, allowing a hazard to pass through all defences (Reason, 2000). Each layer — design, procedures, training, monitoring — catches some errors even when other layers fail.

Design Law

Defence in depth: no single design feature should be the sole barrier to a catastrophic error. Multiple independent safeguards — input validation, confirmation dialogs, undo capability, audit logs — each catch errors that the others miss. The goal is not to make each layer perfect but to ensure that the layers are sufficiently independent that their holes are unlikely to align.
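
A sketch of defence in depth around one destructive operation, using a hypothetical in-memory record store: validation, confirmation, soft delete (undo), and an audit log are each independent safeguards.

```python
audit_log = []                          # Layer 4: every deletion leaves a trace
records = {"r1": "invoice", "r2": "draft"}
trash = {}                              # Layer 3: soft-deleted, recoverable

def delete_record(record_id: str, confirmed: bool) -> bool:
    # Layer 1: input validation -- reject ids that do not exist.
    if record_id not in records:
        return False
    # Layer 2: explicit confirmation before a destructive action.
    if not confirmed:
        return False
    # Layer 3: soft delete -- move to trash instead of destroying data.
    trash[record_id] = records.pop(record_id)
    # Layer 4: audit log.
    audit_log.append(f"deleted {record_id}")
    return True

def undo_delete(record_id: str) -> None:
    """Recover a soft-deleted record."""
    if record_id in trash:
        records[record_id] = trash.pop(record_id)

delete_record("r2", confirmed=True)
undo_delete("r2")                       # a slip is caught by the undo layer
print(sorted(records), audit_log)
```

No single layer is perfect: a user can click through the confirmation, but the soft delete still allows recovery, and the audit log still records what happened.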

Error Classification

Reason (1990) classifies human errors into three types.

Slips are errors of execution — the user intended the correct action but performed it incorrectly (Norman, 1981). Pressing the wrong key, clicking the adjacent button, or selecting the wrong item from a menu are slips. Slips are addressed by design: making targets larger, separating destructive actions from routine ones, requiring confirmation for irreversible actions.

Lapses are errors of memory — the user forgot a step, lost track of their position in a procedure, or failed to monitor a value. Lapses are addressed by external memory aids: checklists, progress indicators, and automated reminders.

Mistakes are errors of planning — the user's intention was wrong because they misunderstood the situation, applied the wrong rule, or lacked the knowledge to choose correctly. Mistakes are harder to catch through interface design alone; they require clear system feedback, training, and decision support.

Fail-Safe Design

A fail-safe design ensures that when failure occurs, the system defaults to a safe state. A traffic light that fails goes to flashing red (all-way stop), not to green. A nuclear reactor that loses power shuts down (SCRAM) rather than running out of control. In software, fail-safe principles include:

  • Defaulting to the most restrictive permission level
  • Timing out sessions rather than leaving them indefinitely open
  • Requiring confirmation for destructive actions
  • Saving drafts automatically rather than losing work on crash
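
The first two of these can be sketched as follows; the permission levels, timeout value, and function names are hypothetical, chosen only to show the "fail closed" pattern.

```python
from datetime import datetime, timedelta
from typing import Optional

PERMISSION_LEVELS = ["none", "read", "write", "admin"]

def effective_permission(requested: Optional[str]) -> str:
    """Default to the most restrictive level when the request is absent
    or unrecognised, rather than failing open."""
    return requested if requested in PERMISSION_LEVELS else "none"

def session_active(last_activity: datetime, now: datetime,
                   timeout: timedelta = timedelta(minutes=15)) -> bool:
    """Sessions expire by default; they are never left open indefinitely."""
    return now - last_activity < timeout

print(effective_permission(None))       # missing grant falls back to "none"
print(effective_permission("admin"))    # explicit grant passes through
```

The common thread: the safe state is the default, and reaching any less safe state requires an explicit, valid action.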

Checklists and Standardised Procedures

The Aviation Checklist

The aviation checklist was born in 1935, when the US Army Air Corps' new B-17 bomber crashed on a demonstration flight because the pilot forgot to release the elevator lock — a novel procedure that was not part of his habitual routine (Gawande, 2009). The resulting checklist — a simple list of items to verify before each phase of flight — has been credited with making the B-17 operational and has since become a fundamental safety tool across aviation.

Key Principle

Checklists externalise memory requirements. Instead of requiring the user to remember every step of a complex procedure (which will eventually fail, per the limitations of working memory), the checklist presents each step for verification. The checklist does not replace expertise — it supplements it by catching the lapses that even experts make under time pressure, fatigue, or distraction.

Atul Gawande's *The Checklist Manifesto* (Gawande, 2009) documented the extension of aviation-style checklists to surgery, where the WHO Surgical Safety Checklist reduced complications by 36% and deaths by 47% in a multinational trial. The mechanism is straightforward: the checklist ensures that critical but easily forgotten steps (confirming patient identity, checking for allergies, verifying the surgical site) are performed every time.

Application to Software

Checklists in software take forms such as:

  • Setup wizards that step through configuration items in sequence
  • Pre-submission validation that checks required fields
  • Deployment checklists built into CI/CD pipelines
  • Onboarding flows that ensure new users complete essential setup steps

The design challenge is the same as in aviation: the checklist must be short enough to be usable, comprehensive enough to catch critical items, and structured so that it does not become a mindless ritual.
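
A pre-deployment checklist of this kind can be sketched as a simple runner: every item is verified explicitly, and any failure blocks the phase. The check names here are hypothetical.

```python
def run_checklist(checks: dict) -> tuple:
    """Run each named check and return (all_passed, failed_item_names).

    `checks` maps an item name to a zero-argument callable returning bool.
    Like an aviation checklist, this externalises memory: nothing is
    assumed done until it has been checked.
    """
    failures = [name for name, check in checks.items() if not check()]
    return (not failures, failures)

deploy_checks = {
    "tests pass": lambda: True,
    "migrations applied": lambda: True,
    "feature flag configured": lambda: False,  # the easily forgotten step
}

ok, failures = run_checklist(deploy_checks)
print(ok, failures)
```

Keeping each item a cheap, automated predicate is one way to stop the checklist becoming a ritual: the items are verified, not merely acknowledged.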

Crew Resource Management

From Individual Error to System Failure

Early aviation safety research focused on individual pilot error. Crew Resource Management (CRM), developed in the 1970s and 1980s (Wiener, 1980), recognised that accidents often result from failures in communication, coordination, and decision-making among crew members. CRM training emphasises:

  • Shared mental models: all crew members should have a common understanding of the current situation
  • Assertive communication: junior crew members should speak up when they observe a problem
  • Workload management: tasks should be distributed to avoid overloading any one crew member
  • Decision-making protocols: structured approaches to making decisions under time pressure

Relevance to Software Teams

CRM principles have been adopted in healthcare ("team training" in operating rooms) and in software engineering ("blameless postmortems," "pair programming," "on-call rotation"). The recognition that safety and quality are systemic properties — not just individual competencies — is a direct legacy of aviation human factors research (Dekker, 2017).

Think About It

Aviation accident investigation is conducted with the explicit goal of preventing future accidents, not assigning blame. Investigators have legal protections that allow honest reporting. How does this "just culture" compare to the way software incidents and outages are investigated? Does the blame-oriented approach common in many organisations inhibit the kind of honest analysis that prevents recurrence?

Failure Analysis Methods

Failure Mode and Effects Analysis (FMEA)

FMEA is a systematic method for identifying potential failure modes in a system, assessing their effects, and prioritising corrective actions. For each component or step in a process, the analyst asks: "What could go wrong? What would happen if it did? How likely is it? How detectable is it?" Applied to interface design, FMEA asks: "For each user action, what errors are possible? What are the consequences of each error? How can the interface prevent or mitigate each error?" This systematic approach converts heuristic advice ("prevent errors") into a structured analysis.
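
In the common quantitative form of FMEA, each failure mode is scored (typically 1–10) for severity, occurrence, and detectability, and their product, the Risk Priority Number (RPN), ranks what to fix first. The failure modes and scores below are hypothetical.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number: higher means fix sooner. `detection` is
    scored so that harder-to-detect failures get higher numbers."""
    return severity * occurrence * detection

# (failure mode, severity, occurrence, detection) -- illustrative scores
failure_modes = [
    ("wrong item selected from menu", 6, 5, 4),
    ("decimal point error in dose field", 9, 3, 6),
    ("confirmation dialog dismissed by habit", 7, 7, 7),
]

ranked = sorted(failure_modes, key=lambda m: rpn(*m[1:]), reverse=True)
for name, s, o, d in ranked:
    print(f"RPN {rpn(s, o, d):3d}  {name}")
```

Note how the ranking surfaces a moderate-severity but frequent and hard-to-detect failure (habitual dismissal) above the more dramatic-sounding dose error.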

Fault Tree Analysis

Fault tree analysis works backward from an undesired event (a patient receiving the wrong medication, a system crash, a data breach) to identify all the combinations of individual failures that could cause it. The resulting tree structure reveals which failure paths are most likely and which safeguards are most critical.

Example

A fault tree analysis for a medication error in a CPOE system might identify paths including: wrong drug selected from a drop-down list (similar drug names), wrong dose entered (decimal point error), allergy check bypassed (alert fatigue), and pharmacist verification skipped (system override). Each path suggests a specific design intervention: grouping drugs by class rather than alphabetically, requiring dose confirmation for high-risk drugs, redesigning allergy alerts to reduce fatigue, and limiting system overrides.
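
The quantitative side of a fault tree can be sketched with OR gates (the event occurs if any input occurs) and AND gates (only if all inputs occur), assuming independent basic events. The probabilities below are illustrative, not drawn from real incident data.

```python
def p_or(*probs: float) -> float:
    """P(at least one of several independent events occurs)."""
    none_occur = 1.0
    for p in probs:
        none_occur *= (1.0 - p)
    return 1.0 - none_occur

def p_and(*probs: float) -> float:
    """P(all of several independent events occur)."""
    all_occur = 1.0
    for p in probs:
        all_occur *= p
    return all_occur

# Top event: wrong medication administered. It requires BOTH an entry
# error AND the failure of the downstream checks (an AND gate); the
# entry error can arise from either of two paths (an OR gate).
p_entry_error = p_or(0.01, 0.005)   # wrong drug OR wrong dose
p_check_fails = p_and(0.2, 0.1)     # alert bypassed AND verification skipped
p_top = p_and(p_entry_error, p_check_fails)
print(f"{p_top:.6f}")
```

The structure makes the design argument explicit: because the checks sit behind an AND gate, halving either check's failure rate halves the top-event probability, whereas the OR-gated entry errors add up.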

From Aviation to Healthcare to Software

The trajectory of human factors knowledge follows a consistent pattern: principles discovered (often through tragedy) in aviation are adopted by healthcare (Reason, 2000; Gawande, 2009), then gradually by software engineering. Aviation's contribution to usability is not a set of aesthetic preferences or interaction patterns. It is a philosophy: that human error is predictable and preventable through design, that safety requires multiple independent defences, that standardisation reduces errors, and that system-level analysis is more productive than blaming individuals.

Key Takeaways

  • Human factors engineering originated in WWII aviation, driven by the recognition that cockpit design, not pilot incompetence, caused many accidents.
  • Key aviation design principles include standardised layouts, the dark cockpit philosophy, layered alerting, and fail-safe defaults.
  • Reason's Swiss Cheese Model describes safety as multiple imperfect layers of defence; defence in depth ensures no single failure is catastrophic.
  • Human errors are classified as slips (execution), lapses (memory), and mistakes (planning), each requiring different design mitigations.
  • Checklists externalise memory requirements and catch the lapses that even experts make under pressure.
  • Crew Resource Management recognises that safety is a team and system property, not just an individual competency.
  • Failure analysis methods (FMEA, fault tree analysis) provide systematic approaches to identifying and mitigating design-related risks.

Further Reading

  • Reason, J. (1990). Human Error. Cambridge University Press.
  • Gawande, A. (2009). The Checklist Manifesto: How to Get Things Right. Metropolitan Books.
  • Chapanis, A. (1999). The Chapanis Chronicles: 50 Years of Human Factors Research, Education, and Design. Aegean Publishing.
  • Wickens, C. D., Hollands, J. G., Banbury, S., & Parasuraman, R. (2015). Engineering Psychology and Human Performance (4th ed.). Routledge.
  • Dekker, S. (2014). The Field Guide to Understanding 'Human Error' (3rd ed.). CRC Press.