Statistical modeling of school dropout in engineering students based on process mining
DOI:
https://doi.org/10.31637/epsir-2025-974
Keywords:
School dropout, mathematical training, process mining, conformance checking, Petri nets, curriculum analytics, statistical modeling, predictive models
Abstract
Introduction: This work describes the development of a statistical model that estimates the probability of dropout among engineering students at the University of the Caribbean by applying process mining techniques. Methodology: A mathematical training process was defined as a reference against which the academic trajectories of students from four cohorts were evaluated. Each student's conformance with this reference process during the first three semesters was used as a predictor of dropout probability in a statistical model. Results: The statistical model was fitted using data from the first three cohorts and validated on the most recent cohort by comparing its predictions with the observed dropout outcomes. Discussion: The study demonstrates that process mining techniques can generate relevant academic information from students' academic trajectories, which is useful for decisions aimed at mitigating the risk of school dropout. Conclusions: Future directions are suggested, such as implementing monitoring systems and including other critical processes in the analysis, in order to increase the effectiveness of educational interventions.
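The pipeline summarized in the abstract (score each trajectory against a reference process, then use the per-semester scores as predictors of dropout) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the course names, the reference schedule, the toy data, and the scoring rule, which is a simplified stand-in for conformance checking on a Petri net and is not the authors' procedure. The logistic regression is written in plain Python so the example stays self-contained; the abstract does not state which statistical model was used.

```python
import math

# Hypothetical reference process: the semester by which each
# mathematics course should have been passed (illustrative names).
REFERENCE = {"Algebra": 1, "Calculus I": 2, "Calculus II": 3}

def conformance_by_semester(trajectory, reference=REFERENCE, n_semesters=3):
    """Return one score in [0, 1] per semester: the fraction of courses
    scheduled up to that semester that the student has passed by then.
    `trajectory` maps a course name to the semester in which it was passed."""
    scores = []
    for s in range(1, n_semesters + 1):
        due = [c for c, sem in reference.items() if sem <= s]
        done = [c for c in due if trajectory.get(c, math.inf) <= s]
        scores.append(len(done) / len(due))
    return scores

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, x):
    """Estimated dropout probability; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit a logistic regression by batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Toy training data standing in for the earlier cohorts:
# (trajectory, dropped_out) pairs. These are invented values.
train = [
    ({"Algebra": 1, "Calculus I": 2, "Calculus II": 3}, 0),
    ({"Algebra": 1, "Calculus I": 2}, 0),
    ({"Algebra": 1, "Calculus I": 3}, 0),
    ({"Algebra": 2}, 1),
    ({}, 1),
    ({"Algebra": 3}, 1),
]
X = [conformance_by_semester(t) for t, _ in train]
y = [label for _, label in train]
w = fit_logistic(X, y)

# Score two hypothetical students, as one would for a held-out cohort.
on_track = {"Algebra": 1, "Calculus I": 2, "Calculus II": 3}
lagging = {"Algebra": 3}
p_on_track = predict_proba(w, conformance_by_semester(on_track))
p_lagging = predict_proba(w, conformance_by_semester(lagging))
print(f"on track: {p_on_track:.2f}, lagging: {p_lagging:.2f}")
```

In a setting closer to the study, the scoring function would be replaced by a conformance measure computed against the reference process model, and the fitted model would be evaluated on a cohort that was not used for fitting.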
License
Copyright (c) 2024 Héctor Fernando Gómez García, Jessica Mendiola Fuentes, Víctor Manuel Romero Medina
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0), which allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).