Mining student-generated textual data in MOOCs and quantifying their effects on student performance and learning outcomes

Conrad Tucker, Barton K. Pursel, Anna Divinsky

Research output: Contribution to journal › Article › peer-review

46 Scopus citations


Massive Open Online Courses (MOOCs) are freely available courses offered online for distance-based learners who have access to the internet. The tremendous success of MOOCs can, in part, be attributed to their global availability, which enables anyone in the world to sign up for or drop courses at any time during the course offerings. Course enrollment in MOOCs often ranges from 10,000 to 200,000 students, thereby providing a potentially rich source of large-scale digital data (e.g., student course comments and temporal and geo-location data). However, despite the abundance of digital data generated through MOOCs, research into how student interactions in MOOCs translate to student performance and learning outcomes is limited. The objective of this research is to mine student-generated textual data (e.g., online discussion forums) existing in MOOCs in order to quantify their impact on student performance and learning outcomes. Student performance is quantified based on grades attained in course homework assignments, quizzes, and examinations. Similar to in-class learning environments, students enrolled in MOOCs often self-organize and form learning groups, where course topics and assignments can be discussed. One of the major benefits of MOOC data is that student networks and the discussions therein are digitally stored and readily available for data mining and statistical analysis. The proposed methodology employs natural language processing techniques and data mining algorithms to quantify temporal changes in student sentiments relating to course topics and instructor clarity. The researchers aim to determine whether textual content (e.g., quality vs. quantity of student forum discussions) expressed through MOOCs can serve as a leading indicator of student performance in MOOCs. A case study involving the Introduction to Art: Concepts and Techniques course, offered by Penn State University through the Coursera platform, is used to validate the proposed methodology.
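
As a rough illustration of what "quantifying temporal changes in student sentiment" could look like, the following minimal sketch averages a lexicon-based sentiment score over forum posts for each course week. It is not the authors' method: the toy lexicon, the function names (post_sentiment, weekly_sentiment), and the sample posts are all hypothetical, and a real study would rely on a validated sentiment resource and far richer language processing.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy lexicon (assumption); not taken from the paper.
LEXICON = {
    "clear": 1, "helpful": 1, "great": 1,
    "confusing": -1, "unclear": -1, "hard": -1,
}


def post_sentiment(text):
    """Sum lexicon scores over the lowercased tokens of one forum post."""
    tokens = (tok.strip(".,!?;:") for tok in text.lower().split())
    return sum(LEXICON.get(tok, 0) for tok in tokens)


def weekly_sentiment(posts):
    """Map each course week to the mean sentiment of the posts made in it.

    `posts` is an iterable of (week_number, post_text) pairs.
    """
    by_week = defaultdict(list)
    for week, text in posts:
        by_week[week].append(post_sentiment(text))
    return {week: mean(scores) for week, scores in sorted(by_week.items())}


# Made-up sample posts, for illustration only.
posts = [
    (1, "The lecture was clear and helpful."),
    (1, "Great first week!"),
    (2, "The assignment is confusing and hard."),
    (2, "Instructions were unclear, but the video was helpful."),
]

print(weekly_sentiment(posts))
```

A per-week series like this could then be compared against quiz or assignment grades from later weeks to examine whether sentiment behaves as a leading indicator, which is the question the abstract raises.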

Original language: English (US)
Pages (from-to): 84-95
Number of pages: 12
Journal: Computers in Education Journal
Issue number: 4
State: Published - 2014

All Science Journal Classification (ASJC) codes

  • General Computer Science
  • Education

