
How do you bias-test AI hiring, and what is the four-fifths rule?

Michael

Founder, KimonRecruit

Published 18 June 2026

Adverse-impact monitoring for AI in hiring: what the four-fifths rule measures, how it overlaps with the Equality Act 2010, and what to test for in practice.

If an automated tool helps decide who progresses in your hiring, you need a way to tell whether it skews outcomes across groups, and you need it before a candidate or a regulator asks. This article explains adverse-impact monitoring, what the four-fifths rule actually measures, and how it sits alongside UK discrimination law. It is a practical orientation, not legal advice; for decisions about your specific situation, speak to your own advisers.

Why bias-test at all?

No assessment process is free of skew, and an automated one scales whatever skew it has across every candidate it touches. That is precisely why hiring AI is treated as high-risk under the EU AI Act, and why monitoring outcomes is not optional good practice but a load-bearing control.

{/* SOURCE (founder-verified 2026-06-18): recruitment/selection AI is high-risk under EU AI Act Annex III(4); the deployer's monitoring and input-data responsibilities sit in Article 26. Provider-side data-governance and bias-testing duties are Article 10. Confirm article numbers against the consolidated text before publish. Source: artificialintelligenceact.eu. */}

The legal exposure is not only an EU-AI-Act question. In the UK, the Equality Act 2010 prohibits indirect discrimination: a provision, criterion or practice that applies to everyone but puts a protected group at a particular disadvantage, unless it can be objectively justified. A tool that produces worse outcomes for a protected group can ground a claim without any proof of intent. That duty applies today, automated tool or not, and it does not wait for any EU enforcement date.

{/* SOURCE (founder-verified 2026-06-18): indirect discrimination is Equality Act 2010 section 19; the protected characteristics are section 4. Confirm section numbers before publish. Source: legislation.gov.uk, Equality Act 2010. */}

What does the four-fifths rule measure?

The four-fifths rule, sometimes called the 80% rule, is a screening test for adverse impact. You compare the selection rate of each group against the selection rate of the most-selected group. If any group's rate falls below four-fifths, or 80%, of the highest group's rate, that is a signal of potential adverse impact worth investigating.

A worked example makes it concrete. Suppose 60% of one group passes a screening stage and 40% of another passes. The impact ratio is the lower rate divided by the higher rate: 40 divided by 60, which is roughly 0.67, or 67%. That is below 80%, so the four-fifths rule flags it.
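The same arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not a compliance tool: the function names, the group labels and the default threshold argument are assumptions made for the example, and the rates are the ones from the worked example above.

```python
def impact_ratios(selection_rates):
    """Divide each group's selection rate by the highest group's rate."""
    highest = max(selection_rates.values())
    if highest == 0:
        raise ValueError("no group was selected, so ratios are undefined")
    return {group: rate / highest for group, rate in selection_rates.items()}


def four_fifths_flags(selection_rates, threshold=0.8):
    """Return the groups whose impact ratio falls below the threshold."""
    ratios = impact_ratios(selection_rates)
    return [group for group, ratio in ratios.items() if ratio < threshold]


# Hypothetical group labels; rates taken from the worked example.
rates = {"group_a": 0.60, "group_b": 0.40}
ratios = impact_ratios(rates)
flagged = four_fifths_flags(rates)

print({group: round(ratio, 2) for group, ratio in ratios.items()})
print(flagged)  # group_b sits at about 0.67, below 0.8, so it is flagged
```

With more than two groups the comparison works the same way: every group is measured against whichever group has the highest selection rate.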

The rule is a flag, not a verdict. It originates in US enforcement guidance and is widely used as a first-pass screen, but it has well-documented limitations: it is sensitive to small samples, it does not establish statistical significance on its own, and a pass does not prove fairness any more than a fail proves discrimination. Treat it as the smoke alarm. Before drawing conclusions, pair it with a significance test, such as a chi-square test, and with an effect size, such as Cohen's h, which measures how large the gap is rather than whether it could be chance.
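To show why sample size matters, here is a minimal sketch of a chi-square test and Cohen's h using only the Python standard library. The function names and the candidate counts are made up for illustration, and the chi-square here is the plain Pearson version without a continuity correction; for very small samples an exact test is usually the safer choice.

```python
import math


def chi_square_2x2(selected_a, total_a, selected_b, total_b):
    """Pearson chi-square on a 2x2 selected/rejected table (1 degree of freedom)."""
    observed = [
        [selected_a, total_a - selected_a],
        [selected_b, total_b - selected_b],
    ]
    grand_total = total_a + total_b
    row_totals = [total_a, total_b]
    col_totals = [selected_a + selected_b, grand_total - selected_a - selected_b]
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected == 0:
                raise ValueError("an expected count is zero; the test is undefined")
            statistic += (observed[i][j] - expected) ** 2 / expected
    # With one degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


def cohens_h(rate_a, rate_b):
    """Effect size for the difference between two proportions."""
    return 2 * math.asin(math.sqrt(rate_a)) - 2 * math.asin(math.sqrt(rate_b))


# Same 60% vs 40% selection rates, two hypothetical sample sizes.
stat_large, p_large = chi_square_2x2(60, 100, 40, 100)
stat_small, p_small = chi_square_2x2(6, 10, 4, 10)
effect = cohens_h(0.60, 0.40)

print(f"large sample: chi2={stat_large:.2f}, p={p_large:.4f}")
print(f"small sample: chi2={stat_small:.2f}, p={p_small:.4f}")
print(f"Cohen's h: {effect:.2f}")
```

Both samples give the same impact ratio and the same effect size of about 0.4, but only the larger one produces a small p-value (roughly 0.005, against roughly 0.37 for ten candidates per group). That is the sense in which the four-fifths screen alone cannot separate noise from signal.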

{/* SOURCE (founder-verified 2026-06-18): the four-fifths rule derives from the US Uniform Guidelines on Employee Selection Procedures (29 CFR 1607.4(D)). It is a screening heuristic, NOT a UK legal standard; UK indirect-discrimination analysis under the Equality Act 2010 does not adopt the 80% threshold as a test. Verify this framing carefully before publish so we do not overstate the rule's legal status in the UK. Sources: 29 CFR 1607.4, FAccT 2024 "four-fifths is not disparate impact". */}

What should you actually test?

Monitoring is a programme, not a single run. A reasonable shape for an SME:

Measure outcomes by stage, not just at the end. Skew can enter at screening, at assessment scoring, or at interview shortlisting. Aggregate-only monitoring hides where the problem is.
Use the demographic data you are allowed to use, held separately. Diversity monitoring data should be captured separately from candidate identity, anonymised, and used only in aggregate. You cannot monitor what you never collected, but you also must not let monitoring data leak into a hiring decision.
Run the four-fifths screen, then test significance. The screen flags; the significance test tells you whether the flag is noise or signal.
Keep the records. Note what you measured and when, what you found, and what you changed. A contemporaneous monitoring record is the evidence you will rely on if an outcome is ever challenged at tribunal.
Close the loop. A flag with no remediation pathway is worse than no monitoring, because it documents that you saw the problem and did nothing.
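The steps above can be sketched as a small per-stage routine that takes aggregate counts only and returns a timestamped record. Everything here is hypothetical: the stage names, group labels, counts and record fields are assumptions chosen for illustration, not a prescribed schema.

```python
from datetime import datetime, timezone


def screen_stage(stage, counts, threshold=0.8):
    """Run the four-fifths screen on one stage.

    counts maps a group label to (selected, total), using aggregate
    figures only, never individual candidate records.
    """
    rates = {g: sel / tot for g, (sel, tot) in counts.items() if tot > 0}
    if not rates or max(rates.values()) == 0:
        raise ValueError(f"no selections recorded for stage {stage!r}")
    highest = max(rates.values())
    ratios = {g: r / highest for g, r in rates.items()}
    flagged = sorted(g for g, ratio in ratios.items() if ratio < threshold)
    return {
        "stage": stage,
        "measured_at": datetime.now(timezone.utc).isoformat(),
        "selection_rates": rates,
        "impact_ratios": ratios,
        "flagged_groups": flagged,
        "action_taken": None,  # to be filled in once a flag is investigated
    }


# Hypothetical aggregate counts per stage: group -> (selected, total).
pipeline = {
    "screening": {"group_a": (120, 200), "group_b": (90, 200)},
    "assessment": {"group_a": (60, 120), "group_b": (43, 90)},
    "interview_shortlist": {"group_a": (20, 60), "group_b": (14, 43)},
}

records = [screen_stage(stage, counts) for stage, counts in pipeline.items()]
for record in records:
    print(record["stage"], record["flagged_groups"])
```

In this made-up data only the screening stage is flagged. An end-to-end comparison (20 of 200 against 14 of 200) would also fall below 0.8, but it would not show that the gap opens at screening. The empty `action_taken` field is there as a prompt: a flagged record should not stay unanswered.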

Does the EU AI Act timeline change this?

The dates are in flux, but the monitoring duty is the most date-resilient part of the picture. The EU AI Act's high-risk obligations covering hiring were set to apply from 2 August 2026. The provisional Digital Omnibus agreement of 7 May 2026 would defer standalone Annex III obligations to 2 December 2027, but as of writing it has not been formally adopted, so 2 August 2026 still stands.

{/* SOURCE (founder-verified 2026-06-18): both dates are time-sensitive. The Digital Omnibus (7 May 2026) and the proposed 2 December 2027 deferral were NOT YET ADOPTED as of 2026-06-18; re-check before publish. Sources: Gibson Dunn, Travers Smith, EU AI Act Service Desk. */}

Whichever date wins, the Equality Act 2010 already requires you not to operate a process that disadvantages a protected group without justification. Adverse-impact monitoring is how you find out whether you are, and it is the control you want running long before any EU enforcement question arises.

How KimonRecruit approaches this

We built KimonRecruit so adverse-impact monitoring runs continuously rather than as a pre-tribunal scramble. An outcome dashboard monitors selection across Equality Act 2010 characteristics, with demographic data captured separately from candidate identity and used only in aggregate. Every assessment score is replayable from the prompt, model and version that produced it, so a flagged outcome can be investigated against the actual evidence rather than a black box. You can read more about how we structure this in our approach to bias audit.

None of that makes a process immune to skew, and we do not claim it does; monitoring exists precisely because skew is always possible. It does mean the four-fifths screen and the significance tests have clean, separated data to run against, generated as you hire. For the wider compliance picture, see our pillar guide to the EU AI Act and recruitment.

Bias testing is not a one-off certificate you earn and file. It is a continuous control, and the employers who treat it that way are the ones who can answer the hard question, with evidence, on the day it is asked.
