Hand Isaac Newton a tablet for a physics exam today, and he would bomb it: his brilliance obscured by an inability to navigate a simulation. We educators commit this malpractice against neurodivergent students daily, diagnosing their minds as deficient rather than our tests as broken.
Myth vs. Science
Pamela Cantor, M.D., notes that 20th-century education rests on an assessment architecture designed to sort students through a deficit lens, anchored by myths that genes dictate destiny, talent is scarce, and capability can be cleanly ranked and sorted on a bell curve. Yet contemporary learning sciences prove human potential is profoundly malleable, abundantly distributed, and dynamically shaped by the environments we provide. Every child is wired to learn and grow.
Why do our systems tell a different story? Flawed measurement. When neurodivergent learners are forced into rigid standardized tests, those tests too often measure barriers, not cognitive ability. This can breed “pathological explanations” that falsely blame learning differences rather than recognizing the limitations of the tests themselves.
Untangling the Signal from the Noise
In psychometrics, a poorly designed test that obscures a brilliant mind creates construct-irrelevant variance. In Architecture of Ability, we contend that quality measurement captures the “Signal”—the focal ability—and filters out the “Noise” of barriers and time limits. To address that noise, we must embrace three foundational propositions.
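The signal-and-noise framing can be sketched as a toy measurement model. Assuming a simple additive model (a standard textbook illustration, not a formal psychometric specification from this article), an observed score mixes the focal ability with a construct-irrelevant barrier effect and random error:

```python
import random

def observed_score(true_ability, barrier_penalty, noise_sd=5.0, seed=None):
    """Toy additive model: observed = Signal (true ability)
    minus a construct-irrelevant barrier, plus random error."""
    rng = random.Random(seed)
    return true_ability - barrier_penalty + rng.gauss(0, noise_sd)

# Two students with identical ability (the Signal). One faces a
# format barrier (e.g., a forced response mode); the other does not.
fluent = observed_score(true_ability=90, barrier_penalty=0, seed=1)
blocked = observed_score(true_ability=90, barrier_penalty=30, seed=1)

# Identical ability, different scores: the entire gap is Noise,
# not a difference in what either student knows.
gap = fluent - blocked
```

The point of the sketch is that the score gap is generated entirely by the `barrier_penalty` term; any inference that attributes it to ability is, in the article’s terms, a pathological explanation.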
Proposition 1: Fairness does not mean identical.
For a century, the testing industry assumed fairness meant standardization—forcing every student to complete tasks under identical conditions. This makes little sense for neurodivergent learners. Indeed, unidentified “nonequivalent surface conditions provide nonequivalent evidence about learners” (Mislevy et al., 2013).
Consider the spelling bee. The “Signal” is orthographic knowledge, but the format rigidly demands speaking audibly in front of an audience. For a student with a severe speech impairment, the noise of that barrier drowns out their spelling ability. The contest stops measuring what they know and instead measures the construct-irrelevant hurdle of their impairment. Allowing that student to type on a keyboard alters the surface delivery without diluting the contest’s rigor. Intentionally varying how learners interact with a task strips away irrelevant barriers, enhancing validity and giving each learner an unclouded lens through which to demonstrate capability.
Proposition 2: Equivalent surface conditions may not provide equivalent evidence.
Standardized testing assumes that identical interfaces ensure accurate data. They do not. Consider an eye exam versus an alphabet test. If an optometrist is measuring visual acuity, standardizing a twenty-foot distance to the eye chart is prudent; visual distance is part of the measured construct. But if an educator is evaluating alphabet knowledge, forcing a visually impaired child to read the board from twenty feet is measurement malpractice: the visual demand introduces massive construct-irrelevant variance and becomes the primary “alternative explanation” for failure. The exam stops measuring the alphabet and measures eyesight instead. Identical surface conditions, in other words, can produce profoundly nonequivalent evidence.
Proposition 3: Principled variation can provide equivalent evidence.
Assessment design and development should embrace a third insight: intentionally adjusting surface conditions (e.g., task delivery) in evidence-based ways for different learners can provide equivalent evidence. To illustrate why this principled variation can be essential, consider a modern educational computer simulation designed to assess physics, such as Newton’s Playground. For a tech-savvy student, this digital environment offers an engaging medium to demonstrate mastery of Newtonian mechanics. But if Isaac Newton himself were subjected to this assessment, he would almost certainly fail. Despite having authored the theorems, Newton would lack the “Additional KSAs” (Knowledge, Skills, and Abilities) required to operate a computer. His physics genius (the Signal) would be obscured by the format (the Noise), yielding a false negative. By providing principled alternatives for how a student accesses and interacts with a task, we ensure that we are measuring their underlying brilliance rather than their fluency with our chosen delivery mechanism.
The AI Fork in the Road
For decades, construct-irrelevant “noise” was a tragic flaw hidden in Scantron bubbles. Today, AI gives us unprecedented computational power to break the mold. Because AI can dynamically adjust a task’s delivery in real time (e.g., simplifying syntax or offering on-demand audio scaffolding), we finally have a scalable tool for personalizing assessment and driving out construct irrelevance.
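What “dynamically adjusting a task’s delivery” might look like can be sketched in code. This is a minimal, hypothetical illustration (the profile flags, function names, and `simplify` placeholder are assumptions, not an API from any real system): surface conditions vary per learner while the item content, the focal construct, stays fixed.

```python
from dataclasses import dataclass

@dataclass
class LearnerProfile:
    # Hypothetical access-needs flags; a real system would derive these
    # from evidence-based learner data, not static labels.
    needs_audio: bool = False
    needs_simplified_syntax: bool = False
    speech_barrier: bool = False

def simplify(text: str) -> str:
    """Placeholder for an AI syntax-simplification pass (assumed to
    preserve the construct being measured)."""
    return text

def deliver_task(item_text: str, profile: LearnerProfile) -> dict:
    """Vary surface conditions (delivery, response mode) while holding
    the focal construct (the item content) constant."""
    delivery = {"content": item_text, "modalities": ["text"], "response": "any"}
    if profile.needs_audio:
        delivery["modalities"].append("audio")    # on-demand read-aloud
    if profile.needs_simplified_syntax:
        delivery["content"] = simplify(item_text)
    if profile.speech_barrier:
        delivery["response"] = "typed"            # never force an oral response
    return delivery
```

The design point is the separation of concerns: the construct lives in the item, while delivery is a per-learner policy, which is exactly the principled variation Proposition 3 calls for.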
But AI is, at its core, an inference engine. It will simply turbocharge whatever inferences we ask it to make. We are at a critical fork in the road, facing two divergent futures.
Path B: The Scaling of Harm
If we blindly feed broken testing approaches into algorithms, AI will automate subpar measurement. It will power “pathological inferences,” sorting neurodivergent learners into failure faster than ever and cementing incomplete narratives with algorithmic authority.
Path A: Meeting the Promise
If we tune AI to strip away the noise and isolate the signal, learners, educators, and families receive the individualized feedback required to pursue continuous improvement and human thriving.
If a test requires overcoming a barrier unrelated to the subject, it is a poorly designed test. By changing our psychometric blueprints, we can shift from a deficit mindset that pathologizes learners to a design orientation that empowers them. We must stop asking, “What is wrong with this student?” and start demanding, “What about this assessment must be improved?”
Mislevy, R. J., Haertel, G., Cheng, B. H., Rutstein, D., Vendlinski, T., Murray, E., Rose, D., Gravel, J., & Mitman Colker, A. (2013). Conditional inferences related to focal and additional knowledge, skills, and abilities (Assessment for Students with Disabilities Technical Report 5). SRI International.
