Abstract
The proliferation of large language model (LLM) generative artificial intelligence (AI) tools like ChatGPT raises an inevitable question of how they should impact student assessments in aerospace engineering. To evaluate this, a large sample of multiple-choice questions from undergraduate aerodynamics and aeronautics courses was input into ChatGPT-4 and Gemini, and their accuracy was evaluated. The cognitive level of each question was coded using Bloom’s taxonomy based on consensus of the authors. It was found that generative AI performs increasingly poorly as the cognitive level increases. Chi-square analyses of the data show a very strong association for ChatGPT and a strong association for Gemini for these trends. Cursory analysis of questions where both tools gave different wrong answers is consistent with the pattern-matching aspects of LLMs. Based on the authors’ observations, recommendations are offered for writing multiple-choice questions that actually assess human understanding.
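The chi-square analysis of association mentioned above can be sketched as follows. This is a minimal illustration, not the authors' actual analysis: the contingency table counts are invented, and the paper's real data by Bloom level and tool is not reproduced here.

```python
# Hypothetical sketch of a Pearson chi-square test of association between
# Bloom's taxonomy level (rows) and answer correctness (columns) for one
# AI tool. All counts below are invented for illustration only.

def chi_square_statistic(table):
    """Compute the Pearson chi-square statistic for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the null hypothesis of independence
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

# Invented (correct, incorrect) counts per Bloom level
table = [
    [40, 10],  # Remember
    [30, 20],  # Understand
    [20, 30],  # Apply
    [10, 40],  # Analyze
]
chi2 = chi_square_statistic(table)
dof = (len(table) - 1) * (len(table[0]) - 1)
print(f"chi2 = {chi2:.2f}, dof = {dof}")  # compare chi2 against the critical value
```

A large statistic relative to the critical value for the given degrees of freedom would indicate an association between cognitive level and accuracy, the pattern the abstract reports.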
| Original language | English (US) |
|---|---|
| Journal | ASEE Annual Conference and Exposition, Conference Proceedings |
| DOIs | |
| State | Published - 2025 |
| Event | ASEE Annual Conference and Exposition, 2025 - Montreal, Canada |
| Duration | Jun 22 2025 → Jun 25 2025 |
All Science Journal Classification (ASJC) codes
- General Engineering
Fingerprint
Dive into the research topics of 'Aerospace Engineering Education in the Era of Generative AI'.