Project Overview
This repository develops a machine learning model that predicts diabetes risk from Behavioral Risk Factor Surveillance System (BRFSS) data. The BRFSS is an annual, large-scale telephone survey conducted by the Centers for Disease Control and Prevention (CDC) that collects information on health behaviors, chronic conditions, and use of preventive services in the United States.
Data Source
The project uses a 2015 BRFSS dataset obtained from Kaggle: https://www.kaggle.com/datasets/alexteboul/diabetes-health-indicators-dataset. The dataset contains responses from 441,455 individuals and 330 features, including both direct survey questions and calculated variables derived from participant responses.
Objectives
This project aims to:
- Implement machine learning algorithms to build a model that effectively predicts diabetes risk.
- Evaluate the performance of the model using various metrics to assess its accuracy and generalizability.
- (Optional) Interpret the trained model to identify the factors most strongly associated with diabetes risk.
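As an illustration of the evaluation objective, the sketch below computes accuracy, precision, recall, and F1 from binary predictions using only the Python standard library. It is not the project's actual evaluation code: the function name `binary_classification_metrics`, the label encoding (1 = diabetes, 0 = no diabetes), and the toy labels are all assumptions made for this example. It compares a hypothetical model against a baseline that always predicts "no diabetes", which shows why accuracy alone can mislead when the positive class is rare.

```python
def binary_classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for binary labels.

    Assumes 1 marks the positive class (diabetes) and 0 the negative class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels for illustration only (not taken from the BRFSS data).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_model = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]  # hypothetical model output
y_baseline = [0] * len(y_true)            # always predicts "no diabetes"

model_metrics = binary_classification_metrics(y_true, y_model)
baseline_metrics = binary_classification_metrics(y_true, y_baseline)

print("model:   ", model_metrics)
print("baseline:", baseline_metrics)
```

On these toy labels both predictors reach the same accuracy, but the baseline has zero recall because it never flags a positive case. Reporting precision, recall, and F1 alongside accuracy makes that difference visible.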
The model developed in this project is intended for educational purposes only and should not be used for diagnosing or treating diabetes. Please consult a healthcare professional for any medical concerns.
author: Tiago Russomanno
Note: work in progress