AI bias
Systematic errors in AI outputs that disadvantage specific groups, perspectives, or topics — caused by biased training data, biased reward signals, or biased evaluation criteria.
AI bias is real and measurable. Models trained on internet text inherit biases present in that text — gender, race, age, geography, political viewpoint, language. Without explicit mitigation, these biases show up in hiring AI, medical AI, financial AI, and any other application where the population is diverse but the training data is not.
Mitigation in 2026: explicit bias evaluation in pre-launch reviews (Sandhya Sharma's Bias Audit framework, NIST AI Risk Management Framework), diverse training data, RLHF or constitutional AI targeting fairness, and post-deployment monitoring for disparate outcomes.
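As a rough illustration of post-deployment monitoring for disparate outcomes, the sketch below computes a per-group selection rate and an impact ratio (each group's rate divided by the highest group's rate). The function names, the sample data, and the 0.8 cutoff are assumptions made for this example. The cutoff follows the common "four-fifths" rule of thumb and is not a threshold stated by any framework named above.

```python
from collections import defaultdict


def selection_rates(records):
    """Return {group: share of positive outcomes} from (group, selected) pairs."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, was_selected in records:
        totals[group] += 1
        if was_selected:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def impact_ratios(rates):
    """Divide each group's selection rate by the highest group's rate."""
    best = max(rates.values())
    if best == 0:
        raise ValueError("no group has any positive outcomes")
    return {group: rate / best for group, rate in rates.items()}


def flag_disparities(records, threshold=0.8):
    """List groups whose impact ratio falls below the (assumed) threshold."""
    ratios = impact_ratios(selection_rates(records))
    return sorted(group for group, ratio in ratios.items() if ratio < threshold)


# Hypothetical logged decisions: (group label, was the candidate selected?)
sample_log = (
    [("group_a", True)] * 50 + [("group_a", False)] * 50
    + [("group_b", True)] * 30 + [("group_b", False)] * 70
)

if __name__ == "__main__":
    print(selection_rates(sample_log))
    print(flag_disparities(sample_log))
```

A flagged group is a prompt for investigation, not proof of bias on its own: small samples and legitimate differences in the underlying population can both move the ratio.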
For agent builders in regulated industries (hiring, lending, healthcare), bias audits are increasingly mandatory. NYC Local Law 144 requires a bias audit for any Automated Employment Decision Tool. The EU AI Act classifies hiring AI as high-risk and requires documentation of bias mitigation.
Frequently asked
Is AI bias inevitable?
In practice, yes. Bias-free AI is not possible with current training methods, because models reflect the biases in their training data. But bias can be measured, mitigated, and monitored. The goal is a level of bias that is acceptable for the deployment context, not zero bias.
How do I audit my AI for bias?
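Define the protected groups that matter in your context (gender, race, age, etc.). Create paired test cases where the only variable is group membership. Run them through the system and measure how the outputs differ across groups, for example in scores or selection rates. NIST AI RMF and Sandhya Sharma's framework are starting points.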
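The paired-test idea can be sketched as follows. This is a minimal example under stated assumptions: `counterfactual_gaps`, the templates, and the group terms are all hypothetical, and `toy_score` is a deliberately biased stand-in for whatever model or agent is being audited.

```python
def counterfactual_gaps(score_fn, templates, group_terms):
    """Score each template once per group and return the max-min score gap.

    score_fn:    callable mapping a text to a numeric score (the system under test)
    templates:   strings containing a "{term}" placeholder
    group_terms: {group name: term substituted into the placeholder}
    """
    gaps = {}
    for template in templates:
        scores = {
            group: score_fn(template.format(term=term))
            for group, term in group_terms.items()
        }
        gaps[template] = max(scores.values()) - min(scores.values())
    return gaps


def toy_score(text):
    """Hypothetical stand-in model: rewards experience, penalizes one pronoun."""
    score = 0.5
    if "10 years" in text:
        score += 0.25
    if text.startswith("She"):
        score -= 0.25  # the injected bias this audit should surface
    return score


# Hypothetical test inputs; only the pronoun differs within each pair.
templates = [
    "{term} has 10 years of experience as an engineer.",
    "{term} recently graduated with a nursing degree.",
]
group_terms = {"female": "She", "male": "He"}

if __name__ == "__main__":
    for template, gap in counterfactual_gaps(toy_score, templates, group_terms).items():
        print(f"{gap:.2f}  {template}")
```

Because everything except the group term is held fixed, any nonzero gap is attributable to group membership. A real audit would use many more templates and a tolerance suited to the deployment context.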