Statistical modeling of school dropout in engineering students based on process mining

DOI:

https://doi.org/10.31637/epsir-2025-974

Keywords:

School dropout, mathematical training, process mining, conformance checking, Petri nets, curriculum analytics, statistical modeling, predictive models

Abstract

Introduction: This work describes the development of a statistical model that estimates the probability of dropout among engineering students at the Universidad del Caribe using process mining techniques. Methodology: A mathematical training process was defined as a reference against which to evaluate the academic trajectories of students from four cohorts. Each student's conformance with this reference process during the first three semesters was used as a predictor of dropout probability in a statistical model. Results: The statistical model was fitted using data from the first three cohorts and validated on the most recent cohort by comparing its predictions with the observed dropout outcomes. Discussion: The study demonstrates the effectiveness of process mining techniques in generating relevant academic information from students' academic trajectories, which is useful for decisions aimed at mitigating the risk of school dropout. Conclusions: Future directions are suggested, such as implementing monitoring systems and including other critical processes in the analysis, with the aim of increasing the effectiveness of educational interventions.
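The pipeline summarized in the abstract (score each trajectory against a reference process, use the score as a dropout predictor, fit on earlier cohorts, validate on the latest one) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the course names, the semester deadlines, the toy cohort data, and the helper names `conformance`, `fit_logistic` and `predict` are hypothetical. The keywords point to conformance checking on Petri nets; the sketch swaps in a much simpler on-time sequence score and a one-predictor logistic regression fitted by gradient descent, and it is not presented as the authors' model.

```python
import math

# Hypothetical reference process: each required mathematics course and the
# latest semester by which it should be passed (illustrative names only).
REFERENCE = [("Algebra", 1), ("Calculus I", 2), ("Calculus II", 3)]


def conformance(trace):
    """Share of reference steps passed in order and on time (0.0 to 1.0).

    `trace` lists (course, semester) pass events from the first three semesters.
    """
    passed = dict(trace)
    on_track, last_sem = 0, 0
    for course, deadline in REFERENCE:
        sem = passed.get(course)
        # A step conforms if it was passed after the previous conforming
        # step and no later than its deadline.
        if sem is not None and last_sem < sem <= deadline:
            on_track += 1
            last_sem = sem
    return on_track / len(REFERENCE)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=5000):
    """Fit P(dropout) = sigmoid(b0 + b1 * x) by gradient descent on log-loss."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def predict(b0, b1, x):
    return sigmoid(b0 + b1 * x)


# Toy trajectories (assumed data, not taken from the study).
t_full = [("Algebra", 1), ("Calculus I", 2), ("Calculus II", 3)]
t_two = [("Algebra", 1), ("Calculus I", 2)]
t_one = [("Algebra", 1)]
t_late = [("Algebra", 2), ("Calculus I", 3)]  # every course passed too late
t_none = []

# "Earlier cohorts": (trace, dropped_out) pairs used to fit the model.
train = [
    (t_full, 0), (t_full, 0), (t_full, 0), (t_full, 1),
    (t_two, 0), (t_two, 0), (t_two, 1),
    (t_one, 1), (t_one, 0), (t_one, 1),
    (t_none, 1), (t_none, 1), (t_late, 1), (t_late, 0),
]
xs = [conformance(trace) for trace, _ in train]
ys = [label for _, label in train]
b0, b1 = fit_logistic(xs, ys)

# "Most recent cohort": held out and used only for validation.
holdout = [(t_full, 0), (t_two, 0), (t_one, 1), (t_late, 1)]
preds = [predict(b0, b1, conformance(trace)) for trace, _ in holdout]
# Brier score: mean squared gap between predicted probability and outcome.
brier = sum((p - y) ** 2 for p, (_, y) in zip(preds, holdout)) / len(holdout)
print(f"b0={b0:.3f}, b1={b1:.3f}, Brier score on held-out cohort={brier:.3f}")
```

In this toy setting a negative `b1` means that lower conformance with the reference process is associated with a higher estimated dropout probability. A real analysis would replace the hand-written score with a proper conformance-checking measure and would report validation metrics on actual cohort data.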

Author Biographies

Héctor Fernando Gómez García, Universidad del Caribe

Full-time Research Professor attached to the Data Engineering and Organisational Intelligence programme. PhD in Computer Science from the Centro de Investigación en Matemáticas. He has developed several research projects in digital image processing, computer vision and data science, with special interest in the applications of artificial intelligence in education and in the statistical modelling of environmental systems for the development of renewable energies.

Jessica Mendiola Fuentes, Universidad del Caribe

Degree in Applied Mathematics from the Universidad Juárez del Estado de Durango. Master's and Doctorate in Control and Dynamic Systems, both from the Instituto de Investigación Científica y Tecnológica A. C. (IPICyT). Postdoctoral stay at the Institute of Physics of the Autonomous University of San Luis Potosí (UASLP). Winner of the Sofia Kovalévskaya prize in 2017. Leader of the HPC department at the National Supercomputing Centre (CNS) from 2020 to 2022. Her main areas of interest are control theory, dynamical systems and their applications. Member of the National System of Researchers (SNI). She is currently a full-time research professor in the Department of Basic Sciences and Engineering at the Universidad del Caribe.

Víctor Manuel Romero Medina, Universidad del Caribe

Full-time Research Professor attached to the Environmental Engineering educational programme. Aeronautical Engineer with a Master's degree in Mechanical Engineering focused on computational fluid dynamics and a PhD in Materials Science and Engineering oriented to the development of mathematical models for the analysis of crack growth and plastic deformation of composite materials. He has participated in basic and applied research projects, mainly in computational fluid dynamics, focused on the use of renewable energies and the development of prototypes. He has participated in several conferences and published several articles. He is an active member of the National Solar Energy Association, the International Solar Energy Society, the Ocean Thermal Energy Association and the Mexican Centre for Ocean Energy Research.

Published

2024-12-24

How to Cite

Gómez García, H. F., Mendiola Fuentes, J., & Romero Medina, V. M. (2024). Statistical modeling of school dropout in engineering students based on process mining. European Public & Social Innovation Review, 10, 1–22. https://doi.org/10.31637/epsir-2025-974

Section

Cover articles