Manual vs automated accessibility testing

An evidence-based comparison of the two main accessibility testing approaches. It covers what automated tools actually catch, where manual review is essential, the cost model for each, and the hybrid approach used in every serious audit.

Quick answer

Manual or automated testing?

Automated accessibility testing can fully evaluate roughly 30 to 40 percent of WCAG success criteria, mainly programmatic checks such as missing alt text, colour contrast, form labels and heading structure. Manual testing with screen readers, keyboard navigation and human review covers the rest, including every subjective and context-dependent issue, such as meaningful link text, logical reading order and whether content is genuinely understandable. The right answer for most teams is hybrid: automated tools in CI for fast regression checks, plus manual audits at milestones and before launch.

At-a-glance comparison

These numbers are drawn from independent research by Deque, WebAIM and TPGi over more than a decade. Coverage percentages reflect how many WCAG success criteria each approach can fully evaluate, not how many issues each finds in a particular product.

	Automated	Manual	Hybrid (recommended)
WCAG success criteria coverage	~30 to 40%	~95%+	~95%+, faster to reach
Cost per audit	Free to low	High (expert hours)	Moderate (automation covers the cheap wins, expert time focuses on the rest)
Speed	Seconds per page	Hours per page	Automation in CI; manual at milestones
Expertise needed	Low to read results; some skill to interpret	High (WCAG, ARIA, assistive tech)	Mixed: low for CI, high for audit
False positives	Some (mostly resolvable with tuning)	Rare	Manageable
False negatives (issues missed)	Many (everything semantic)	Few	Few
Best tools	axe DevTools, WAVE, Lighthouse, Accessibility Insights	NVDA, VoiceOver, TalkBack, browser DevTools, keyboard-only review	All of the above, plus a documented audit methodology
Best for	CI regression checks, design system QA, fast page-level pre-flight	Pre-launch audits, certification, customer complaints, complex apps	Everyone serious about accessibility, especially government and enterprise

The two approaches in detail

Automated testing

Coverage: ~30 to 40%

Automated tools analyse the HTML, CSS and (sometimes) live DOM of a page and flag any rule violations they can mechanically detect. They are excellent at the programmatic checks: missing alt attributes, insufficient colour contrast, form fields without labels, heading-level skips, ARIA misuse, duplicate IDs and missing language declarations. They scan a page in under a second, scale to thousands of pages, and integrate into CI pipelines so a regression is caught before it merges into main.
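To make "mechanically detect" concrete, here is a minimal sketch of three such checks using only Python's standard library. It is an illustration under stated assumptions, not how axe-core or any real tool is implemented: the class name, the `scan` helper and the rule names (`missing-lang`, `heading-skip`, `missing-alt`) are invented for this example, and real engines cover far more rules against the rendered DOM.

```python
from html.parser import HTMLParser

HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class BasicA11yChecker(HTMLParser):
    """Flags a few violations that need no human judgement to detect."""

    def __init__(self):
        super().__init__()
        self.issues = []       # (rule_name, detail) tuples; rule names are illustrative
        self.last_heading = 0  # level of the most recent heading, 0 if none yet

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html":
            if not attrs.get("lang"):
                self.issues.append(("missing-lang", "<html> has no lang attribute"))
        elif tag == "img":
            # Presence check only: alt="" (decorative) and any other value both pass.
            if "alt" not in attrs:
                self.issues.append(("missing-alt", f"img src={attrs.get('src')!r}"))
        elif tag in HEADINGS:
            level = int(tag[1])
            if self.last_heading and level > self.last_heading + 1:
                self.issues.append(
                    ("heading-skip", f"h{self.last_heading} followed by h{level}")
                )
            self.last_heading = level


def scan(html: str) -> list[tuple[str, str]]:
    """Run the checks over an HTML string and return the issues found."""
    checker = BasicA11yChecker()
    checker.feed(html)
    checker.close()
    return checker.issues


SAMPLE = """<html><body>
<h1>Title</h1>
<h3>Skipped a level</h3>
<img src="chart.png">
<img src="logo.png" alt="Company logo">
</body></html>"""

for rule, detail in scan(SAMPLE):
    print(rule, "-", detail)
```

Note that the second image passes simply because an alt attribute exists. Nothing in the code can tell whether "Company logo" is an accurate description.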

Strengths

Fast: a full-page scan in under a second
Repeatable: same input produces same output, so regressions are obvious
Scalable: enterprise platforms scan thousands of pages overnight
CI-friendly: a failed accessibility check can block a pull request the same way a failed unit test does
Cheap to start: axe DevTools, WAVE and Lighthouse are free

Limitations

Hard 30 to 40 percent coverage ceiling. Cannot evaluate anything that depends on meaning or context.
Cannot tell whether alt text is meaningful, just whether it exists
Cannot tell whether the reading order makes sense to a screen reader user
Cannot tell whether a custom widget actually works with assistive technology
Can produce false positives that need triage from someone who knows WCAG
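The alt text limitation can be sketched in a few lines. The function below is a hypothetical triage helper, not part of any real tool: the placeholder word list and the file-name pattern are assumptions chosen for illustration. The point is the shape of the result. Automation can fail a missing attribute outright and can flag obviously suspicious values, but the best outcome it can give for anything else is "send to a person", never "pass".

```python
import re

# Assumed heuristics: generic placeholder words and file-name-like values.
PLACEHOLDER_WORDS = {"image", "photo", "picture", "graphic", "img", "icon"}
FILENAME_PATTERN = re.compile(r"\.(png|jpe?g|gif|svg|webp)$", re.IGNORECASE)


def alt_text_status(alt):
    """Classify an alt attribute value (None means the attribute is absent).

    "missing"            -> a machine can fail this with certainty
    "suspicious"         -> a machine can flag it, a person confirms
    "needs-human-review" -> only a person can judge it in context
    """
    if alt is None:
        return "missing"
    text = alt.strip()
    if text == "":
        # Empty alt is correct for decorative images and wrong for informative ones.
        return "needs-human-review"
    if text.lower() in PLACEHOLDER_WORDS or FILENAME_PATTERN.search(text):
        return "suspicious"
    return "needs-human-review"


for value in (None, "IMG_2041.jpg", "photo", "", "Bar chart of quarterly revenue"):
    print(repr(value), "->", alt_text_status(value))
```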

Best tools

axe DevTools by Deque - the de facto standard, browser extension and CLI versions, used by most accessibility programs
WAVE by WebAIM - visual overlays make findings easy to communicate to non-developers
Lighthouse in Chrome DevTools - bundled, free, covers a subset of axe rules
Accessibility Insights by Microsoft - free, guided manual review wrapper around axe
Pa11y, Sa11y, IBM Equal Access - CLI and CI options for build pipelines

Manual testing

Coverage: ~95%+

Manual testing is what a trained accessibility specialist does with the actual product: keyboard navigation through every flow, screen reader testing in at least NVDA plus VoiceOver, browser zoom to 200 percent and 400 percent, reflow testing at narrow viewports, focus and tab order review, ARIA tree inspection, and judgement calls on whether content is genuinely understandable. It covers the 60 to 70 percent of WCAG success criteria that automation cannot fully evaluate.
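The zoom levels and the narrow-viewport reflow test are closely related. As a rough sketch, assuming browser zoom divides the available CSS-pixel width by the zoom factor (the function name is made up for this example), a 1280 px wide window at 400 percent leaves the page 320 CSS pixels to work with, which is the width WCAG 1.4.10 Reflow is written around.

```python
def effective_viewport_width(window_width_css_px: float, zoom_percent: float) -> float:
    """Approximate CSS-pixel width left for the page after browser zoom."""
    return window_width_css_px * 100 / zoom_percent


for zoom in (100, 200, 400):
    width = effective_viewport_width(1280, zoom)
    print(f"{zoom}% zoom on a 1280px window -> {width:.0f} CSS px")
```

In practice this means a tester can either zoom a 1280 px window to 400 percent or resize the viewport to 320 px. The two are not always identical in real browsers, so treat the resize as a quick pre-check rather than a replacement for actual zoom.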

Strengths

Catches every category of WCAG issue that automation cannot
Validates real-world usability, not just rule compliance
Produces evidence-rich findings (recordings, screenshots, AT output) that survive procurement review
Surfaces design-level issues that need product or design intervention, not just code fixes

Limitations

Slow: hours per template, not seconds
Expensive: requires WCAG-fluent, AT-fluent specialists
Hard to run on every PR; usually a milestone activity
Depends on the auditor's skill and methodology, so consistency requires a documented process

Best practices

Keyboard-only review first. If you cannot complete every key task without a mouse, you have not started auditing.
Screen reader review in at least NVDA on Windows and VoiceOver on macOS or iOS. See our comparison of the four major screen readers for selection guidance.
Browser zoom and reflow at 200 percent and 400 percent. Check that nothing is lost behind sticky bars or modal frames.
Document everything: WCAG criterion failed, severity, location, evidence, remediation guidance. Vague findings do not get fixed.
User testing with people with disability for any high-stakes flow. Manual expert testing is not a substitute for actual user experience.
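One way to enforce "document everything" is to give findings a fixed shape so an incomplete one cannot be filed. The sketch below is an assumption about how a team might do that, not a standard format: the `Finding` class, the four-level severity scale and the example values are all hypothetical.

```python
from dataclasses import dataclass, asdict

# Assumed severity scale; use whatever your audit methodology defines.
SEVERITIES = ("critical", "serious", "moderate", "minor")


@dataclass
class Finding:
    criterion: str    # WCAG success criterion that failed
    severity: str     # one of SEVERITIES
    location: str     # page, template or component
    evidence: str     # what was observed, with which tool or assistive technology
    remediation: str  # what to change

    def __post_init__(self):
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        # Reject vague findings: every field must carry real content.
        empty = [name for name, value in asdict(self).items() if not value.strip()]
        if empty:
            raise ValueError(f"finding is missing: {', '.join(empty)}")


example = Finding(
    criterion="2.4.4 Link Purpose (In Context)",
    severity="serious",
    location="Order summary page (hypothetical)",
    evidence="Screen reader links list shows four links all named 'click here'",
    remediation="Give each link text that describes its destination",
)
print(asdict(example))
```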

Hybrid: the actual answer

Recommended strategy

For any serious accessibility program, the answer is both, deployed at different cadences. Automation runs continuously in CI to catch regressions cheaply. Manual audits run at milestones (pre-launch, major feature, quarterly governance) to catch everything automation cannot. User testing with people with disability runs occasionally on high-stakes flows. This is the model the Australian Digital Service Standard expects and what every government department procurement panel asks about.

Suggested cadence

Per pull request in CI: axe-core or similar as a build step. Blocks merges that introduce new accessibility regressions.
Per release: spot manual audit of any new or substantially changed flow. Two to four hours of expert time.
Quarterly: full manual audit of top user journeys plus a representative sample of templates. One to two weeks of expert time.
Annually: independent third-party audit for procurement, board reporting and accessibility-statement refresh.
Project milestones: user testing with people with disability for any flow handling authentication, payments, benefits, health information or applications.
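The per pull request step in the cadence above usually works as a baseline comparison: existing violations are tracked as debt, and only new ones block the merge. Here is a minimal sketch of that gate. It assumes the scanner output has already been reduced to dictionaries with `rule` and `target` keys. That shape is a simplification invented for this example, not the actual result format of axe-core or any other tool.

```python
import sys


def new_violations(current, baseline):
    """Return violations in the current scan that are absent from the baseline."""
    known = {(v["rule"], v["target"]) for v in baseline}
    return [v for v in current if (v["rule"], v["target"]) not in known]


def gate(current, baseline) -> int:
    """Return a process exit code: 1 if the change adds violations, else 0."""
    fresh = new_violations(current, baseline)
    for v in fresh:
        print(f"NEW {v['rule']} at {v['target']}", file=sys.stderr)
    return 1 if fresh else 0


# Hypothetical scan results for illustration.
baseline = [{"rule": "color-contrast", "target": "#footer a"}]
current = baseline + [{"rule": "label", "target": "#email"}]

exit_code = gate(current, baseline)
print("exit code:", exit_code)
```

A CI job would pass that exit code to `sys.exit`, which blocks the merge the same way a failed unit test does. Keying on rule plus selector is a design choice with a known weakness: a renamed selector looks like a new violation, so some teams key on rule counts per page instead.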

Which approach is right for you?

The right mix depends on your team size, governance maturity and risk profile.

Solo developer or small team

Automated in CI (axe DevTools or Lighthouse), plus occasional self-driven keyboard and screen reader passes. Outside expert audit before any major launch.

Mid-size product team (10 to 50 engineers)

Automated in CI, plus quarterly manual audit by an internal champion or external consultant. Pre-release sweep by a trained accessibility tester.

Enterprise

Automated in CI at every PR, manual audit every release, annual independent audit, plus a defined accessibility champion in every squad. Consider a paid platform like Siteimprove, Level Access or Deque if you need cross-portfolio dashboards.

Australian government department

All of the enterprise pattern, plus user testing with people with disability for every high-stakes flow per Digital Service Standard requirements. Document everything for DTA reporting and accessibility-statement evidence.

Compliance-heavy sector (banking, health, insurance)

Enterprise pattern plus independent annual audit, defensible evidence trail, and ARIA-rich custom components verified against assistive technology before release.

Procurement / accessibility statement

Independent manual audit by a specialist firm. Self-attestation based on automated tools alone will not survive procurement review.

Common questions

What percentage of WCAG issues do automated tools actually catch?

Independent research from Deque, WebAIM and others consistently puts automated coverage at around 30 to 40 percent of WCAG success criteria. Automated tools are very good at programmatic checks (missing alt attributes, contrast ratios, form labels, heading structure, ARIA validity) but cannot evaluate anything subjective: whether alt text is meaningful, whether reading order makes sense, whether a label genuinely describes what the control does, whether a custom widget behaves correctly with a screen reader. Most accessibility failures live in that 60 to 70 percent that automation cannot see.
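Contrast is a good example of a check that automates cleanly, because WCAG defines it as arithmetic on two colours. The sketch below follows the relative luminance and contrast ratio definitions in WCAG 2.x. The function names are my own, and it handles only six-digit hex colours; real tools also have to resolve transparency, gradients and background images, which is where contrast checks start needing a person.

```python
def _linear(channel_8bit: int) -> float:
    """Convert one 8-bit sRGB channel to its linear value (WCAG 2.x formula)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour: str) -> float:
    """Relative luminance of a colour given as '#rrggbb'."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colours, from 1 (identical) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


AA_NORMAL_TEXT = 4.5  # WCAG 1.4.3 minimum for normal-size text

for fg in ("#000000", "#767676", "#777777"):
    ratio = contrast_ratio(fg, "#ffffff")
    verdict = "pass" if ratio >= AA_NORMAL_TEXT else "fail"
    print(f"{fg} on white: {ratio:.2f}:1 ({verdict})")
```

The two greys differ by a single step per channel and land on opposite sides of the 4.5:1 threshold, which is exactly the kind of difference a machine detects reliably and a human eye does not.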

Can automated tools find everything if they get better?

No. The hard cap on automated coverage is not a technology problem; it is a semantic one. Whether alt text is useful in context, whether a link name is clear, whether content is genuinely understandable, whether a custom interaction works for a switch user: all of these require human judgement. AI-assisted tools (Evinced, Stark, axe AI) extend automation slightly into the semantic layer (suggesting alt text candidates, flagging unclear link names) but they still need human review. The 30 to 40 percent figure has been stable for over a decade and is not expected to move dramatically.

Are paid automated tools much better than free ones?

Marginally. Free tools (axe DevTools, WAVE, Lighthouse, Accessibility Insights) catch roughly the same set of issues as paid platforms (Siteimprove, Level Access, Deque). What you pay for in the paid tier is enterprise infrastructure: scheduled scans across thousands of pages, role-based dashboards, audit-trail reports, ticketing integration, dedicated support. For a single project audit, the free tools are excellent. For an enterprise governance program, the paid platforms earn their cost.

How long does a manual accessibility audit take?

For a typical web product, a manual audit covering 10 to 20 representative templates and key user flows takes one to three weeks of expert time. That includes keyboard testing, screen reader testing (NVDA plus VoiceOver minimum), zoom and reflow testing, ARIA review, and writing up findings with severity, evidence and remediation guidance. Document accessibility audits run two to three days per document on average. ExceedAbility audits combine automated scanning as a pre-pass with a structured manual review, so the manual time is spent on the issues automation cannot see.

Is user testing with people with disability required?

Not by WCAG, no. WCAG conformance can be demonstrated through a combination of automated and manual expert testing. But user testing with people with disability is the only way to validate that the product is actually usable, not just technically conformant. We strongly recommend it for any high-stakes flow (authentication, payment, applications, health information) and for any product whose primary audience includes people with disability. The Australian government Digital Service Standard explicitly requires testing with users including users with disability.

