
Automated Scoring of Question Complexity with Transformer Language Models

Research output: Contribution to journal › Article › peer-review

Abstract

Question-asking, an essential yet understudied activity, holds significant implications for learning, creativity, and cognitive development. Research shows that asking complex, open-ended questions benefits learning more than asking closed ones. Previous research has explored open-ended question complexity through Bloom’s taxonomy, but the measurement of complexity remains challenging. Recent advancements in natural language processing have enabled automated scoring of psychological tasks that aligns with human ratings. However, automatic assessment of open-ended questions remains understudied. We address this gap by fine-tuning transformer language models to predict human ratings of open-ended question complexity and comparing them to existing baseline measures (i.e., word count and semantic distance). Using previously collected human-rated responses and Bloom ratings from a creative question-asking task, we trained an encoder model (RoBERTa) and a large language model (Llama-2-7B). Our results reveal that RoBERTa predictions correlated strongly with human ratings of complexity (r = .73), exceeding baseline measures and offering an efficient, lightweight solution suitable for broad adoption. Our fine-tuned Llama 2 model achieved stronger performance (r = .84), establishing a new benchmark for predictive accuracy. Thus, we demonstrate how language models can be used to automatically score the complexity of open-ended questions. Importantly, Llama 2 demonstrates higher accuracy, while RoBERTa provides a replicable, accessible, and cost-effective option for everyday educational and psychological applications. Our work paves the way for automatic assessment of open-ended questions, which are critical across a wide range of cognitive domains.
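As a rough illustration of the evaluation described in the abstract, the sketch below computes a Pearson correlation between a word-count baseline and human complexity ratings. It is not the authors' code: the example questions, the ratings, and the helper names `pearson_r` and `word_count` are all hypothetical, and the numbers are made up purely to show the calculation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def word_count(question):
    """Baseline complexity score: number of whitespace-separated tokens."""
    return len(question.split())


# Hypothetical questions and made-up human ratings (illustration only).
questions = [
    "What color is it?",
    "How is it made?",
    "Why might people in different cultures use it differently?",
    "What would change about daily life if it had never been invented?",
]
human_ratings = [1.0, 2.0, 4.0, 5.0]

baseline_scores = [word_count(q) for q in questions]
r = pearson_r(baseline_scores, human_ratings)
print(f"word-count baseline vs. human ratings: r = {r:.2f}")
```

Under the same assumptions, a fine-tuned model's predicted scores could be passed to `pearson_r` in place of `baseline_scores`, which would put the model and the baseline on the same footing for comparison.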

Original language: English (US)
Article number: 102090
Journal: Thinking Skills and Creativity
Volume: 60
State: Published - Jun 2026

All Science Journal Classification (ASJC) codes

  • Education
