This project aims to understand the factors that contribute to employee turnover and to build a predictive model that identifies employees at risk of leaving the company.
-
Obtaining the Data:
- Downloaded the dataset from Kaggle.
- Imported the data into the working environment.
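The import step can be sketched as below. This is only an illustration: the source does not give the file name or the column names, so the inline sample and the helper `load_data` are hypothetical stand-ins for the downloaded Kaggle CSV.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded CSV. The real file name and
# column names are not given in this summary and may differ.
SAMPLE_CSV = """satisfaction_level,last_evaluation,number_project,department,salary,left
0.38,0.53,2,sales,low,1
0.80,0.86,5,technical,medium,0
0.11,0.88,7,support,low,1
0.72,0.87,5,sales,high,0
"""


def load_data(source):
    """Read the dataset from a file path or file-like object into a DataFrame."""
    return pd.read_csv(source)


# With the real download, pass the path of the CSV file instead of the sample.
df = load_data(io.StringIO(SAMPLE_CSV))
print(df.shape)
print(df.head())
```

With the actual download, the same call would take the path of the CSV file on disk.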
-
Scrubbing the Data:
- Checked for missing values (none were found; the dataset was clean).
- Examined the dataset for readability and appropriate feature names.
- Converted categorical features (department, salary) to numeric types.
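The scrubbing steps above might look roughly like this in pandas. The column names, the renames, and the encoding scheme are assumptions made for illustration; the summary does not say which names were changed or how the categorical features were converted.

```python
import pandas as pd

# Small hypothetical sample; the column names are assumed, not from the source.
df = pd.DataFrame({
    "satisfaction_level": [0.38, 0.80, 0.11, 0.72],
    "sales": ["sales", "technical", "support", "sales"],  # assumed unclear name
    "salary": ["low", "medium", "low", "high"],
    "left": [1, 0, 1, 0],
})

# 1. Count missing values per column (all zeros means the data is clean).
missing = df.isnull().sum()
print(missing)

# 2. Rename columns so that the feature names are easier to read.
df = df.rename(columns={"sales": "department", "left": "turnover"})

# 3. Convert the categorical features to numeric types.
# Salary has a natural order, so an explicit mapping keeps that order.
salary_order = {"low": 0, "medium": 1, "high": 2}
df["salary"] = df["salary"].map(salary_order)

# Department has no natural order; integer codes are one simple option.
df["department"] = df["department"].astype("category").cat.codes

print(df.dtypes)
```

Integer codes imply an ordering between departments that does not really exist. One-hot encoding with `pd.get_dummies` is a common alternative when a model is sensitive to that.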
-
Exploratory Data Analysis (EDA):
- Produced a statistical overview and summary of the features.
- Explored correlations among features using a correlation matrix and heatmap.
- Analyzed turnover patterns in relation to department, salary, promotion, years at the company, project count, evaluation, average monthly hours, etc.
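A minimal sketch of these EDA steps, using a small made-up sample rather than the real data. The column names are assumed, and the numbers carry no meaning beyond showing the calls.

```python
import pandas as pd

# Hypothetical, already-numeric sample; values are invented for illustration.
df = pd.DataFrame({
    "satisfaction": [0.38, 0.80, 0.11, 0.72, 0.45, 0.90],
    "project_count": [2, 5, 7, 5, 2, 4],
    "salary": [0, 1, 0, 2, 0, 1],  # 0 = low, 1 = medium, 2 = high
    "turnover": [1, 0, 1, 0, 1, 0],
})

# Statistical overview: count, mean, std, min, quartiles, max per column.
summary = df.describe()
print(summary)

# Pairwise Pearson correlations between all features.
corr = df.corr()
print(corr)

# Turnover rate per salary level: the mean of a 0/1 column is a proportion.
turnover_by_salary = df.groupby("salary")["turnover"].mean()
print(turnover_by_salary)

# A heatmap of the matrix could be drawn with seaborn, if it is installed:
#   import seaborn as sns
#   sns.heatmap(corr, annot=True)
```

The same `groupby` pattern applies to the other features listed above, such as department, promotion, or years at the company.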
-
Modeling the Data:
- Split the data into training and testing sets.
- Implemented various machine learning models (Logistic Regression, SVM, kNN, Random Forest).
- Evaluated each model's performance by comparing its training and testing scores.
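The modeling steps can be sketched with scikit-learn as follows. The data here is synthetic (`make_classification`), and the split ratio and hyperparameters are assumptions, since the summary does not state the ones actually used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the prepared feature matrix and turnover labels.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=0
)

# Hold out 25% of the rows for testing (assumed ratio), keeping class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The four model families named above, with assumed settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model and record its accuracy on the training and testing sets.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"{name}: train={scores[name][0]:.3f}, test={scores[name][1]:.3f}")
```

A large gap between the training and testing scores usually points to overfitting. SVM and kNN are also sensitive to feature scale, so scaling the features first may help; the summary does not say whether that was done.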
-
Interpreting the Data:
- Summarized findings from EDA.
- Highlighted trends in turnover related to satisfaction, salary, promotion, working hours, project count, and evaluations.
- Highlighted correlations between turnover, satisfaction, and salary.
- Raised questions for further consideration about the impact of losing employees and the factors affecting satisfaction and turnover.
Note: The code sections may need to be reformatted and executed in a Python environment for full functionality.