Challenges and solutions to the student dropout prediction problem in online courses

Image credit: [Lorenzo Madeddu]


Online courses and e-degrees, although present since the mid-1990s, have received enormous attention only in the last decade. Moreover, the outbreak of the new coronavirus disease (COVID-19) forced many nations (e.g. Italy and the US) to massively push their education systems towards an online environment. Academics are now also looking at the crisis as an opportunity for universities to adopt digital technologies for teaching more broadly. But they will have to understand which ways of teaching and evaluating are effective in this new scenario. This situation, together with the utility and ubiquitous accessibility of online course platforms, leads to a vast number of enrolments. Nevertheless, a high enrolment rate usually translates into a significant dropout (or withdrawal) rate (40-80% of online students drop out). Student dropout prediction (SDP) consists of modelling and forecasting student behaviour when interacting with e-learning platforms. Dropout is a significant phenomenon with repercussions for online institutions and for the students and professors involved. Early approaches tended to rely on manual analytic examinations to devise retention strategies. Recent research has adopted automated policies to thoroughly exploit student activities (hereafter e-tivities) on the e-platforms and identify at-risk students. These approaches include machine learning and deep learning techniques to predict the student dropout status. Therefore, being able to cope in real time with shifting trends in student interactions with course platforms has become of paramount importance. In this tutorial, we give a comprehensive overview of the SDP problem in the literature.
We provide a mathematical formalisation of the different definitions proposed, and we introduce simple and complex predictive methods organised along the following aspects: student dropout definition, input modelling, underlying machine and deep learning techniques, evaluation measures, datasets, and privacy concerns.

In Proceedings of the 29th ACM International Conference on Information & Knowledge Management
Lorenzo Madeddu
Senior data scientist, PhD

Lorenzo Madeddu is a senior data scientist (R&D) in the Knowledge Graph Insights team at AstraZeneca.