Machine learning pipeline for predicting ICU readmission risk using patient data.
Hospital readmissions are costly and can indicate issues in patient care.
This project uses machine learning to predict whether a patient is likely to be readmitted to the ICU.
The goal is to demonstrate:
End‑to‑end ML workflow
Model training and evaluation
Interactive deployment using Streamlit
Responsible sharing of models while protecting repository limits and data security
The model was trained using Scikit‑Learn with the dataset included in the repository.
Typical training workflow:
Load dataset
Clean and preprocess data
Train ML classifier
Evaluate performance
Save trained model
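The workflow above can be sketched roughly as follows. This is a minimal illustration, not the project's actual training script: the function name `train_and_save`, the choice of a random forest, the imputation strategies, and the ROC AUC metric are all assumptions made for the example.

```python
import joblib
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder


def train_and_save(df, target, model_path, seed=42):
    """Train a classifier on a DataFrame, save it, and return held-out ROC AUC.

    Hypothetical helper: the classifier and preprocessing choices are
    assumptions, not the project's confirmed configuration.
    """
    X = df.drop(columns=[target])
    y = df[target]

    # Clean and preprocess: impute missing values, one-hot encode categoricals.
    numeric = X.select_dtypes(include="number").columns.tolist()
    categorical = [c for c in X.columns if c not in numeric]
    preprocess = ColumnTransformer([
        ("num", SimpleImputer(strategy="median"), numeric),
        ("cat", Pipeline([
            ("impute", SimpleImputer(strategy="most_frequent")),
            ("encode", OneHotEncoder(handle_unknown="ignore")),
        ]), categorical),
    ])

    # Bundle preprocessing and classifier so they are saved as one object.
    model = Pipeline([
        ("preprocess", preprocess),
        ("clf", RandomForestClassifier(n_estimators=100, random_state=seed)),
    ])

    # Hold out 20% of the rows for evaluation, keeping class balance.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed)
    model.fit(X_train, y_train)

    # Evaluate on the held-out rows, then save the fitted pipeline.
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    joblib.dump(model, model_path)
    return auc
```

With a dataset loaded through `pandas.read_csv`, a call might look like `train_and_save(df, "readmitted", "model.joblib")`, where the column name and output path are placeholders. Saving the whole pipeline means the app can apply the same preprocessing at prediction time without duplicating it.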
The trained model file is not stored directly in the repository because GitHub limits files to 100 MB.
Instead, it is hosted externally on Google Drive, which keeps the repository lightweight.
When the app starts, it automatically downloads the model if it is not present.
⚠️ Note:
Only the file ID is used in the application code. No personal Google Drive credentials or sensitive data are exposed.
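A download-if-missing check like the one described above could look like this minimal sketch. The names `MODEL_FILE_ID`, `MODEL_PATH`, `drive_url`, and `ensure_model` are hypothetical, and the file ID is a placeholder. The sketch assumes the file is shared publicly and uses only the standard library. For large files, Google Drive may return a confirmation page rather than the file itself, so a real app might rely on a helper such as the `gdown` package instead.

```python
import os
import urllib.request

# Hypothetical placeholders: substitute the real file ID and local path.
MODEL_FILE_ID = "YOUR_FILE_ID"
MODEL_PATH = "model.joblib"


def drive_url(file_id):
    """Build a direct-download URL from a public Google Drive file ID."""
    return f"https://drive.google.com/uc?export=download&id={file_id}"


def ensure_model(path=MODEL_PATH, file_id=MODEL_FILE_ID,
                 fetch=urllib.request.urlretrieve):
    """Download the model to `path` only if it is not already there.

    Returns True if a download happened, False if the file was present.
    `fetch` is injectable so the logic can be exercised without a network.
    """
    if os.path.exists(path):
        return False  # already present, nothing downloaded
    tmp = path + ".part"
    fetch(drive_url(file_id), tmp)
    os.replace(tmp, path)  # rename only after the download has finished
    return True
```

Writing to a temporary `.part` file first means an interrupted download is not mistaken for a complete model on the next start.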
The prediction model uses the following patient features:
Age
Gender
Time in hospital
Number of lab procedures
Diabetes status
Hypertension status
Number of medications
Number of procedures
Glucose level
Blood pressure
Diagnosis code
BMI
Smoking status
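One way to represent these inputs in code is a small typed record, sketched below. The class name `PatientFeatures`, the field names, the types, and the units are assumptions for illustration. The columns the trained model actually expects may differ.

```python
from dataclasses import dataclass, asdict


@dataclass
class PatientFeatures:
    # Field names, types, and units are assumptions for illustration.
    age: int
    gender: str
    time_in_hospital: int      # assumed to be in days
    num_lab_procedures: int
    diabetes: bool
    hypertension: bool
    num_medications: int
    num_procedures: int
    glucose_level: float
    blood_pressure: float      # assumed to be a single reading
    diagnosis_code: str
    bmi: float
    smoking_status: str

    def to_row(self):
        """Return the features as a dict, in declaration order."""
        return asdict(self)
```

A single patient could then be turned into a one-row table with `pandas.DataFrame([patient.to_row()])` before being passed to the model, provided the column names match those used in training.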
The project is built with the following tools:
Python
Pandas
NumPy
Scikit‑Learn
Streamlit
Joblib
Google Drive (for model hosting)