The goal of the project is to predict heart disease from the given data. The target variable is 'cardio'. The evaluation metric for the project is ROC-AUC.
The dataset provides the following features for each patient:
- id;
- age;
- gender;
- height;
- weight;
- ap_hi;
- ap_lo;
- cholesterol;
- gluc;
- smoke;
- alco;
- active.
- Importing Modules and Opening Files
- Data Preprocessing and Exploratory Data Analysis (EDA)
- Development of ML Models
- Conclusion
- Created an ML model (RandomForestClassifier) for the binary classification task, which determines the presence of heart disease. ROC-AUC of the model: 0.739. The hyperparameters of the model are the following: criterion = 'entropy', n_estimators = 350, max_depth = 10.
- Created a web application in which a person can estimate their risk of heart disease (this is only a model's estimate; for any health concerns, consult a specialist).
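
The model setup described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic (generated with `make_classification`), the use of 11 features assumes the `id` column is dropped, and the train/test split and `random_state` values are assumptions. Only the three hyperparameters come from the results above. The pickle round-trip at the end shows one plausible way a web application could reuse the trained model.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset (assumption): 11 features,
# i.e. the listed columns without 'id'; y plays the role of 'cardio'.
X, y = make_classification(
    n_samples=2000, n_features=11, n_informative=6, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Hyperparameters as reported in the results; random_state is an assumption.
model = RandomForestClassifier(
    criterion="entropy", n_estimators=350, max_depth=10, random_state=42
)
model.fit(X_train, y_train)

# ROC-AUC is computed from the predicted probability of the positive class,
# not from hard 0/1 predictions.
proba = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, proba)
print(f"ROC-AUC on synthetic data: {auc:.3f}")

# Serialize the model and restore it, as a web application might do at startup.
blob = pickle.dumps(model)
restored = pickle.loads(blob)

# Estimated risk (probability of the positive class) for a single record.
risk = restored.predict_proba(X_test[:1])[0, 1]
print(f"Estimated risk for one record: {risk:.2f}")
```

The score printed here reflects the synthetic data only and says nothing about the reported 0.739, which was obtained on the real dataset.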
matplotlib, numpy, pandas, phik, pickle, seaborn, sklearn