Predicting Employee Intent-to-Stay from Engagement Survey Data: An Interpretable, Class Imbalanced Machine Learning Case Study

https://doi.org/10.58291/ijec.v5i1.583

Authors

  • Bagus Satrio Diharjo, Master’s Program in Human Resource Development, Postgraduate School, Universitas Airlangga, Surabaya, Indonesia
  • Ira Puspitasari, Department of Information and Library Science, Faculty of Social and Political Sciences, Universitas Airlangga, Surabaya, Indonesia
  • Agustinus Titis Iswara, Department of Legal and Human Capital, Division of Human Capital Strategy, PT PLN (Persero), Jakarta, Indonesia

Keywords:

employee retention, intent-to-stay, human resource analytics, machine learning, CRISP-DM

Abstract

Employee retention is a strategic need for capital-intensive firms, such as state-owned power enterprises, whose service continuity across the upstream-to-downstream value chain relies on a stable and engaged workforce. Most predictive HR studies focus on attrition; a proactive approach, however, requires identifying employees whose commitment to stay is not firmly established, so that retention initiatives can begin before disengagement escalates. This research develops an interpretable machine learning framework to predict employee intent to stay, following the CRISP-DM approach. The data come from the organization-wide Employee Engagement Survey, comprising 32,907 respondents and 102 engineered predictors across 53 engagement items and demographic attributes. The workflow covers preprocessing, exploratory analysis, dimensionality reduction (PCA and t-SNE), K-Means clustering, supervised classification, multi-metric evaluation, and permutation-based interpretability. The target is highly imbalanced: only 6% of employees reported less than full commitment to stay. Evaluation therefore focuses on ROC AUC, recall, and precision-recall AUC (PR AUC) rather than accuracy. Six algorithms were evaluated. Logistic Regression achieved the best balance (ROC AUC = 0.853, recall = 0.758, PR AUC = 0.318), correctly identifying about 76% of employees who were not fully committed. The interpretability analysis identified proximity to retirement age, confidence in the company's future, feeling of vitality at work, and achievement of career goals as the most significant determinants. The contribution is not a novel algorithm but the insight revealed by this analysis: proximity to retirement is the predominant factor, causing a simplistic model to disproportionately flag senior employees. This illustrates that proactive intent-to-stay prediction should be regarded as interpretable decision support rather than solely an accuracy-driven endeavor.
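
The choice of ROC AUC, recall, and PR AUC over accuracy can be illustrated with simple arithmetic on the figures reported in the abstract. The sketch below is not the authors' code. It assumes the rounded 6% positive rate applies to all 32,907 respondents, and it applies the reported recall to that full sample purely for illustration, since the abstract does not describe the evaluation split. The function name `trivial_baseline` is hypothetical. The comparison of PR AUC against the positive rate rests on the general fact that a classifier ranking at random has an expected PR AUC close to the share of positives.

```python
# Illustrative arithmetic only: inputs are the rounded figures from the abstract.
N_RESPONDENTS = 32_907   # survey respondents (from the abstract)
POSITIVE_RATE = 0.06     # share reporting less than full commitment (rounded)
RECALL = 0.758           # reported recall for Logistic Regression
PR_AUC = 0.318           # reported PR AUC for Logistic Regression


def trivial_baseline(n, positive_rate):
    """Metrics of a classifier that labels every employee 'fully committed'.

    Hypothetical helper for illustration; returns the estimated number of
    positives, the baseline accuracy, and the baseline recall.
    """
    positives = round(n * positive_rate)
    negatives = n - positives
    accuracy = negatives / n   # every negative is correct, every positive missed
    recall = 0.0               # no positive is ever flagged
    return positives, accuracy, recall


positives, base_accuracy, base_recall = trivial_baseline(N_RESPONDENTS, POSITIVE_RATE)

# Assumption: reported recall applied to the whole sample, not a held-out split.
flagged_true = round(positives * RECALL)

# A random ranking has an expected PR AUC near the positive rate,
# so this ratio expresses the improvement over chance.
pr_lift = PR_AUC / POSITIVE_RATE

print(f"Estimated not-fully-committed employees: {positives}")
print(f"Trivial baseline accuracy: {base_accuracy:.1%}, recall: {base_recall:.1%}")
print(f"Employees correctly flagged at recall {RECALL}: {flagged_true}")
print(f"PR AUC relative to chance level: {pr_lift:.1f}x")
```

Under these assumptions, a model that flags nobody reaches roughly 94% accuracy while identifying none of the employees of interest, which is why accuracy is uninformative here. A PR AUC of 0.318 looks modest in absolute terms but sits several times above the chance level implied by the 6% positive rate.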

Published

2026-06-21

How to Cite

Diharjo, B. S., Puspitasari, I., & Titis Iswara, A. (2026). Predicting Employee Intent-to-Stay from Engagement Survey Data: An Interpretable, Class Imbalanced Machine Learning Case Study. International Journal of Engineering Continuity, 5(1), 304–322. https://doi.org/10.58291/ijec.v5i1.583

Section

Articles