Predicting whether a patient with heart-failure will die or survive based on some risk factors

williamagyapong/heart-failure

Predicting Mortality by Heart Failure

According to the World Health Organization, cardiovascular diseases (CVDs) are the number one cause of death globally, taking an estimated 17.9 million lives each year, which accounts for 31% of all deaths worldwide. Of these deaths, 85% are due to heart attack and stroke. Heart failure is a common event caused by cardiovascular diseases.

The early detection of people with cardiovascular diseases or who are at high cardiovascular risk due to the presence of one or more risk factors is paramount to reducing deaths arising from heart failures. As a result, predictive models become indispensable. The dataset explored in this project contains 12 features that can be used to predict mortality by heart failure. The goal of this project, therefore, is to identify an appropriate classification model that can accurately predict the mortality of patients with heart failure.

Ten different classification models were fitted to the heart failure data set. Three of them emerged as the best, each with a predictive accuracy of 78%: Linear Discriminant Analysis (LDA), Logistic Regression, and Random Forest. The LDA model was dropped because, with 11 predictors, it was the most complex of the three. The other two models each used only 4 predictors, and the Random Forest model was chosen as the overall best because, unlike Logistic Regression, it is a non-linear classifier that does not require the data to be linearly separable. It was therefore concluded that a Random Forest model with Ejection Fraction, Serum Creatinine, Anaemia, and High Blood Pressure as predictors is the best model for predicting the death event of a patient with heart failure.
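The selection reasoning above (highest accuracy first, then fewest predictors) can be sketched in a few lines. This is only an illustration in Python, not the project's R code: the `Candidate` class and `shortlist` function are hypothetical names, and the numbers are the ones quoted in this summary. The other seven models are omitted because their results are not listed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float     # predictive accuracy as quoted in the summary
    n_predictors: int   # number of predictors the model is based on


# Figures taken from the summary above; nothing else is assumed.
candidates = [
    Candidate("Linear Discriminant Analysis", 0.78, 11),
    Candidate("Logistic Regression", 0.78, 4),
    Candidate("Random Forest", 0.78, 4),
]


def shortlist(models):
    """Keep the most accurate models, then the simplest among those."""
    best_accuracy = max(m.accuracy for m in models)
    top = [m for m in models if m.accuracy == best_accuracy]
    fewest = min(m.n_predictors for m in top)
    return [m for m in top if m.n_predictors == fewest]


finalists = shortlist(candidates)
print([m.name for m in finalists])
# ['Logistic Regression', 'Random Forest']
```

The two finalists are still tied on both criteria, so the final choice of Random Forest rests on the qualitative argument about non-linearity rather than on anything this rule can compute.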

I consider this a preliminary analysis, since more can be done to improve predictive power. For instance, addressing class imbalance and handling outliers could enhance model performance. In addition, a formal procedure such as PCA could be used to identify the most significant features.
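To illustrate why class imbalance matters when accuracy is the headline metric, here is a minimal Python sketch. It is not taken from the project's RMarkdown code: the 70/30 label split is made up for the example, and the helper names (`balanced_accuracy`, `oversample_minority`) are hypothetical. It shows that a classifier which always predicts the majority class can look reasonable on plain accuracy, and how simple random oversampling evens out the classes.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def oversample_minority(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, lab in zip(rows, labels) if lab == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Made-up labels for illustration: 70 survivors (0) and 30 deaths (1).
y = [0] * 70 + [1] * 30
always_survive = [0] * len(y)  # trivial majority-class "model"

print(accuracy(y, always_survive))           # 0.7
print(balanced_accuracy(y, always_survive))  # 0.5

X = [[i] for i in range(len(y))]  # placeholder feature rows
X_bal, y_bal = oversample_minority(X, y)
print(sorted(Counter(y_bal).items()))        # [(0, 70), (1, 70)]
```

If oversampling is used, it would normally be applied to the training split only, so that duplicated rows do not leak into the data used to measure performance.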

Please see the full report here. You can also access the code in the RMarkdown file.
