Our task was to uncover the factors that lead to employee attrition and to explore the potential relationships between different demographic and descriptive factors. Ultimately, we aimed to discover insights and generate recommendations for corporations about talent retention and management, based on a fictional data set created by IBM data scientists. The dataset is available on Kaggle.
To better understand this dataset, we started with a descriptive analysis.
Once we had explored the characteristics of all 35 independent variables, we decided to focus our research on 15 categorical independent variables and 10 numeric variables to see which factors most easily trigger an employee to leave the company (our research question).
Results and Findings
We tested various models, including Linear and Logistic Regression, Logistic Stepwise, Ridge, Lasso, GAM, Boosted Tree, and Random Forest, and calculated the AUC for each to find the most accurate and suitable model. We also identified the Power Predictor Variables, the Weak Predictor Variables, and the correlations between independent variables to understand the relationships further.
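To make the AUC comparison concrete, here is a minimal sketch in plain Python. It is not the code we ran for the project: the labels, the model names in the dictionary, and the predicted scores are all made-up values for illustration, and the helper names `auc_score` and `rank_models` are hypothetical. The AUC is computed as the share of (leaver, stayer) pairs in which the leaver receives the higher predicted score, with ties counted as half.

```python
from typing import Dict, List, Sequence, Tuple


def auc_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 = employee left, 0 = employee stayed.
    scores: predicted attrition scores (higher means more likely to leave).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # leaver ranked above stayer
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


def rank_models(labels: Sequence[int],
                model_scores: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Return (model name, AUC) pairs sorted from highest to lowest AUC."""
    aucs = [(name, auc_score(labels, s)) for name, s in model_scores.items()]
    return sorted(aucs, key=lambda item: item[1], reverse=True)


# Hypothetical hold-out labels and scores, for illustration only.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
hypothetical_scores = {
    "model_a": [0.9, 0.2, 0.4, 0.7, 0.1, 0.3, 0.6, 0.5],
    "model_b": [0.6, 0.5, 0.7, 0.4, 0.2, 0.3, 0.8, 0.1],
}

for name, auc in rank_models(y_true, hypothetical_scores):
    print(f"{name}: AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of rows, which is fine for a data set of this size; a rank-based formula would be the usual choice for larger data.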
Insights and Recommendations
Based on our findings about the significant predictors of Employee Attrition (DV) and the correlations between those predictors (IVs), we grouped our recommendations into three categories: Employee Development, Employee Benefits, and Company Culture.
Starting with Employee Development, Total Working Years is a power predictor and is strongly correlated with Job Level and Monthly Income.
For job level, we recommend providing multiple paths for promotion and offering continuing education opportunities so that employees can advance their positions.
In terms of monthly income, it is not surprising that higher pay lowers attrition rates. We suggest that the company keep salaries above industry levels and add exceptional healthcare and retirement packages to maximize after-tax income.
Moving on to Employee Benefits, we have stock options and non-salary benefits as two power predictors. We suggest that the company provide more stock options to entry-level employees, since the attrition rate at stock option level 1 is significantly different from that at level 0 (there are four levels in total). Free investment advice would also help employees make better decisions.
For non-salary benefits, we suggest offerings such as corporate discounts, extended paid parental leave, and, of course, a high-quality office cafeteria.
Last but not least, from the perspective of company culture, we have power predictors such as overtime work, job involvement, business travel, and peer relationship satisfaction.
The data show that all three departments have similar overtime ratios, which suggests that overtime is a problem across the whole company rather than in any single department.
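One common way to check whether the attrition rates at two stock option levels differ significantly is a two-proportion z-test. The sketch below is an illustration under assumptions, not a record of the exact test we ran: the function name `two_proportion_z_test` is hypothetical and the head counts are invented, so they should be replaced with the real counts for level 0 and level 1.

```python
import math
from typing import Tuple


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, float]:
    """Two-sided z-test for the difference between two proportions.

    x1, n1: number of leavers and total employees in group 1.
    x2, n2: number of leavers and total employees in group 2.
    Returns (z statistic, two-sided p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # attrition rate under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("standard error is zero; both groups are all-leave or all-stay")
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, for illustration only (not taken from the data set).
z, p = two_proportion_z_test(x1=150, n1=600, x2=60, n2=600)
print(f"z = {z:.2f}, p-value = {p:.3g}")
```

A small p-value here would only say that the two rates differ; it would not by itself show that granting stock options causes employees to stay.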
Our research has certain limitations. The unbalanced distribution of the attrition variable (DV) and the confounding relationships among the independent variables might lead to inaccurate and biased results.
In future research, we would like to apply Factor Analysis to categorize the independent variables and explore the effects of these groupings on employee attrition rates.
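One common mitigation for an unbalanced target, which we did not necessarily apply in this project, is to weight each class inversely to its frequency when fitting a model. The sketch below uses the usual "balanced" heuristic, weight = n_samples / (n_classes * class_count). The helper name `balanced_class_weights` is hypothetical, and the 84/16 split is an invented example rather than the actual distribution in the data set.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def balanced_class_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get weights above 1 and common classes get weights below 1,
    so every class contributes the same total weight to a weighted loss.
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical split, for illustration only: 84 stayers (0) and 16 leavers (1).
example_labels = [0] * 84 + [1] * 16
weights = balanced_class_weights(example_labels)
for cls in sorted(weights):
    print(f"class {cls}: weight = {weights[cls]:.3f}")
```

Reweighting changes how errors on leavers are penalized during training; it does not address the confounding among predictors, which needs separate treatment.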
The appendix is listed below.
Thank you for watching 🙂