Exact Path’s Diagnostic Differential Item Analysis: A Fair Assessment for All
Two independently conducted differential item functioning studies indicate that Exact Path diagnostic items function fairly across diverse groups of students.
Ensuring Fairness in Every Assessment
Independent research confirms that Exact Path diagnostic assessments measure student achievement fairly across diverse groups of learners. In two third-party research studies analyzing millions of student item responses across subjects, no items were flagged for large differential item functioning (DIF), and the large majority showed negligible DIF. This means that students’ performance on Exact Path items reflects their knowledge and skills rather than their gender, race/ethnicity, socioeconomic status, language background, or other demographic factors.
Understanding DIF and Why It Matters
In a fair assessment, students with similar achievement levels should have an equal chance of answering the same question correctly. When this does not happen—when students from different groups perform differently on a question, even after accounting for differences in overall achievement—the item is said to exhibit DIF.
DIF analysis is a standard way to evaluate whether test items are fair and unbiased. If an item shows DIF, it may (or may not) contain elements that disadvantage certain groups of students. Because of this, items flagged for DIF are carefully reviewed and sometimes removed from the item bank to ensure fairness.
How DIF Is Evaluated
One of the most widely used approaches for detecting DIF is the Mantel-Haenszel (MH) procedure, along with the Educational Testing Service (ETS) classification rules (Holland & Thayer, 1986).
- The MH procedure calculates an odds ratio (the MH alpha statistic), which is used in the ETS classification to create the MH-DIF statistic.
- The MH chi-square statistic tests whether the difference between groups is statistically significant.
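For reference, these statistics are commonly written as follows. The notation is an assumption introduced here for illustration and is not taken from the Exact Path studies: students are first matched into groups k on overall score; within each group, A_k and B_k count reference-group students who answered the item correctly and incorrectly, C_k and D_k count the same for the focal group, and N_k is the group total. The MH-DIF statistic is often written MH D-DIF.

```latex
\[
\alpha_{MH} = \frac{\sum_k A_k D_k / N_k}{\sum_k B_k C_k / N_k},
\qquad
\text{MH D-DIF} = -2.35 \,\ln \alpha_{MH}
\]
\[
\chi^2_{MH} =
\frac{\left( \left| \sum_k A_k - \sum_k E(A_k) \right| - \tfrac{1}{2} \right)^2}
     {\sum_k \operatorname{Var}(A_k)},
\quad
E(A_k) = \frac{(A_k + B_k)(A_k + C_k)}{N_k},
\quad
\operatorname{Var}(A_k) =
\frac{(A_k + B_k)(C_k + D_k)(A_k + C_k)(B_k + D_k)}{N_k^2 \,(N_k - 1)}
\]
```

An odds ratio of 1 (MH D-DIF of 0) means matched students in both groups have the same odds of answering correctly; negative MH D-DIF values indicate an item that is harder for the focal group.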
Based on these analyses, each item is placed into one of three categories (Kamata & Vaughn, 2004):
- A – Negligible DIF (no meaningful concern)
- B – Possible DIF (should be reviewed)
- C – Large DIF (not recommended for use)
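The procedure and classification described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in either Exact Path study: the function names and the sample counts are hypothetical, the A/B/C thresholds (1.0 and 1.5 on the MH D-DIF scale, tested at the 0.05 level) follow the commonly cited ETS rule, and the standard error uses the Robins-Breslow-Greenland estimator, which is one common choice.

```python
import math

def mantel_haenszel_dif(strata):
    """Compute MH statistics for one item (illustrative sketch).

    strata: list of (ref_correct, ref_incorrect, focal_correct, focal_incorrect)
    tuples, one per group of students matched on overall score.
    Assumes both odds-ratio sums are non-zero (no guard for empty cells).
    """
    num = den = 0.0                  # sums behind the common odds ratio
    sum_a = sum_exp = sum_var = 0.0  # pieces of the MH chi-square
    v1 = v2 = v3 = 0.0               # pieces of the variance of ln(alpha)
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue                 # a stratum this small carries no information
        r, s = a * d / n, b * c / n
        p, q = (a + d) / n, (b + c) / n
        num += r
        den += s
        v1 += p * r
        v2 += p * s + q * r
        v3 += q * s
        n_ref, n_foc = a + b, c + d
        m1, m0 = a + c, b + d        # total correct / incorrect in the stratum
        sum_a += a
        sum_exp += n_ref * m1 / n
        sum_var += n_ref * n_foc * m1 * m0 / (n * n * (n - 1))

    alpha = num / den                               # MH common odds ratio
    d_dif = -2.35 * math.log(alpha)                 # ETS delta scale
    # Continuity-corrected chi-square, floored at zero (a small safeguard)
    chi_sq = max(0.0, abs(sum_a - sum_exp) - 0.5) ** 2 / sum_var
    p_value = math.erfc(math.sqrt(chi_sq / 2))      # chi-square tail, 1 df
    var_ln_alpha = v1 / (2 * num**2) + v2 / (2 * num * den) + v3 / (2 * den**2)
    se = 2.35 * math.sqrt(var_ln_alpha)             # standard error of MH D-DIF
    return {"alpha": alpha, "d_dif": d_dif, "chi_sq": chi_sq,
            "p_value": p_value, "se": se}

def ets_category(stats):
    """Assign the A/B/C label from the statistics above (assumed thresholds)."""
    abs_d = abs(stats["d_dif"])
    if abs_d < 1.0 or stats["p_value"] >= 0.05:
        return "A"                   # negligible DIF
    if abs_d >= 1.5 and (abs_d - 1.0) / stats["se"] > 1.645:
        return "C"                   # large DIF: |D| significantly above 1
    return "B"                       # possible DIF

# Made-up counts for three score groups; not data from the studies.
no_dif = [(40, 60, 20, 30), (60, 40, 30, 20), (80, 20, 40, 10)]
large_dif = [(60, 40, 15, 35), (75, 25, 25, 25), (90, 10, 35, 15)]

for label, data in [("no DIF", no_dif), ("large DIF", large_dif)]:
    result = mantel_haenszel_dif(data)
    print(label, round(result["d_dif"], 2), ets_category(result))
```

In the first made-up item, matched students in both groups have identical odds of a correct answer, so it falls in category A. In the second, the focal group's odds are much lower at every score level, so it falls in category C.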
Exact Path DIF Studies
Edmentum has worked with two independent research organizations to conduct large-scale DIF studies of the Exact Path diagnostic assessments. These studies used data from millions of students and analyzed the fairness of thousands of assessment items.

In spring 2022, EdMetric, a third-party research firm, conducted a DIF study on the Exact Path diagnostic item bank, including all K–12 items for language arts, mathematics, and reading. This study compared student performance by gender, district-level minority representation, district-level poverty rate, and whether the response was given before or after the start of COVID-19–related school closures.
The analysis covered data from the 2018–19, 2019–20, and 2020–21 school years. No items were flagged for B- or C-level DIF, meaning no items showed concerning differences across groups. This provides strong evidence that the items in Edmentum’s assessments measure achievement fairly for students from different backgrounds.
Comparison Groups:
- Gender (student level) – Male/Female
- Race (group level) – Majority district (50% or more white)/Minority district (less than 50% white)
- Socioeconomic status (group level) – High-poverty district (above the median: 17% or more of students below the poverty line)/Low-poverty district (below the median: less than 17% of students below the poverty line)
- Pandemic effect (group level) – Pre-pandemic (student response data from before March 2020)/Post-pandemic (student response data from after March 2020)
In 2025, eMetric conducted a second DIF study, expanding on the previous DIF study by including data from the 2021–22, 2022–23, and 2023–24 school years. It also compared groups based on student-level characteristics rather than district-level characteristics, offering a more precise look at fairness.
Even with more data and more comparison groups, this study also found little evidence of DIF. No items were classified as having large DIF (category C), and the vast majority were classified as negligible (category A).
Comparison Groups:
- Gender (student level) – Male/Female
- Race/Ethnicity (student level) – African American/Hispanic/Other/White
- Gifted (student level) – Yes/No
- Economically disadvantaged (student level) – Yes/No
- Student with disabilities (student level) – Yes/No
- English language learner (student level) – Yes/No
- Geographic region (student level) – Northeast/Midwest/South/West
- Grade (student level) – Above (item’s grade level was higher than the student’s enrolled grade level)/On (item’s grade level was the same as the student’s enrolled grade level)/Below (item’s grade level was lower than the student’s enrolled grade level)
Together, these studies provide strong evidence that the items used in Exact Path diagnostic assessments are fair and accurate measures of student achievement across diverse groups of learners.
References:
Holland, P. W., & Thayer, D. T. (1986, April 16–20). Differential item performance and the Mantel-Haenszel procedure [Paper presentation]. Sixty-seventh annual meeting of the American Educational Research Association, San Francisco, CA, United States.
Kamata, A., & Vaughn, B. K. (2004). An introduction to differential item functioning analysis. Learning Disabilities: A Contemporary Journal, 2(2), 49–69.