Abstract
Undergraduate STEM programs face the daunting challenge of managing instruction and assessment for classes that enroll thousands of students per year, and the bulk of student assessment is often determined by multiple-choice tests. Instructors try to monitor reliability metrics and item-quality diagnostics, but rarely is there a more formal evaluation of the psychometric properties of these assessments. College assessment strategies seem to be dominated by a common-sense view of testing that is generally unconcerned with the precision of measurement. We see an opportunity to improve undergraduate science instruction by incorporating more rigorous measurement models into testing and using them to support instructional goals and assessment. We apply item response theory to analyze tests from two undergraduate STEM classes: a resident-instruction physics class and a Massive Open Online Course (MOOC) in geography. We evaluate whether the tests are equally informative across levels of student proficiency, and we demonstrate how precision could be improved with adaptive testing. We find that the measurement precision of multiple-choice tests appears to be greatest in the lower half of the class distribution, a property that has consequences for assessment of mastery and for evaluating testing interventions.
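The abstract's claim about where precision concentrates can be read through the test information function of item response theory: the standard error of a proficiency estimate is inversely related to the information the test provides at that proficiency level. The paper's actual models and item parameters are not given here, so the following Python sketch is illustrative only; it uses hypothetical two-parameter logistic (2PL) items with difficulties clustered below the mean, the kind of profile that would make a test most informative in the lower half of the class distribution.

```python
import numpy as np

def item_information_2pl(theta, a, b):
    """Fisher information of a 2PL item at proficiency theta.

    a: discrimination, b: difficulty. Under the 2PL model,
    P(theta) = 1 / (1 + exp(-a * (theta - b))) and
    I(theta) = a**2 * P * (1 - P).
    """
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1.0 - p)

# Hypothetical item parameters (not from the paper): difficulties
# clustered below the population mean, as on a relatively easy exam.
discriminations = np.array([1.2, 0.9, 1.5, 1.1, 0.8])
difficulties = np.array([-1.5, -1.0, -0.8, -0.3, 0.2])

# Test information is the sum of item information across items.
theta_grid = np.linspace(-3, 3, 121)
test_information = np.array([
    item_information_2pl(theta, discriminations, difficulties).sum()
    for theta in theta_grid
])

# The standard error of the proficiency estimate is 1 / sqrt(I(theta)),
# so measurement is most precise where test information peaks.
peak_theta = theta_grid[np.argmax(test_information)]
print(f"Test information peaks near theta = {peak_theta:.2f}")
```

With difficulties concentrated below zero, the information curve peaks below the mean proficiency, which is the mechanism behind the finding that such tests measure lower-performing students more precisely than higher-performing ones; adaptive testing improves precision by selecting items whose information peaks near each examinee's current proficiency estimate.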
| Original language | English (US) |
| --- | --- |
| Journal | ASEE Annual Conference and Exposition, Conference Proceedings |
| Volume | 122nd ASEE Annual Conference and Exposition: Making Value for Society |
| Issue number | 122nd ASEE Annual Conference and Exposition: Making Value for Society |
| State | Published - 2015 |
| Event | 2015 122nd ASEE Annual Conference and Exposition - Seattle, United States. Duration: Jun 14 2015 → Jun 17 2015 |
All Science Journal Classification (ASJC) codes
- General Engineering