A machine learning approach to predict hospital readmissions within 30 days
10 years (1999-2008) of clinical care data from 130 US hospitals
Handled missing values, encoded categorical variables, and standardized numerical features across 101,766 patient records.
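The three preprocessing steps named above (imputation, categorical encoding, standardization) can be sketched without dependencies as follows. This is a minimal illustration, not the project's actual pipeline: the column names, the toy records, and the choice of median/mode imputation and one-hot encoding are all assumptions made for the example.

```python
from collections import Counter
from statistics import mean, median, pstdev

# Hypothetical toy records; column names are illustrative, not the project's schema.
records = [
    {"age": 65.0, "num_medications": 12.0, "gender": "Female"},
    {"age": None, "num_medications": 8.0, "gender": "Male"},
    {"age": 50.0, "num_medications": None, "gender": None},
    {"age": 80.0, "num_medications": 20.0, "gender": "Female"},
]
NUMERIC = ["age", "num_medications"]
CATEGORICAL = ["gender"]


def preprocess(rows, numeric, categorical):
    """Impute, standardize, and one-hot encode; returns (feature_names, matrix)."""
    # Assumed strategy: median for numeric gaps, most frequent value for categorical gaps.
    fill = {}
    for col in numeric:
        fill[col] = median(r[col] for r in rows if r[col] is not None)
    for col in categorical:
        counts = Counter(r[col] for r in rows if r[col] is not None)
        fill[col] = counts.most_common(1)[0][0]
    filled = [
        {c: (r[c] if r[c] is not None else fill[c]) for c in numeric + categorical}
        for r in rows
    ]

    # Mean and population standard deviation per numeric column
    # (fall back to 1.0 so a constant column does not divide by zero).
    stats = {}
    for col in numeric:
        vals = [r[col] for r in filled]
        stats[col] = (mean(vals), pstdev(vals) or 1.0)

    # One indicator column per observed category level.
    levels = {c: sorted({r[c] for r in filled}) for c in categorical}
    names = list(numeric) + [f"{c}={lv}" for c in categorical for lv in levels[c]]

    matrix = []
    for r in filled:
        row = [(r[c] - stats[c][0]) / stats[c][1] for c in numeric]
        row += [1.0 if r[c] == lv else 0.0 for c in categorical for lv in levels[c]]
        matrix.append(row)
    return names, matrix


names, X = preprocess(records, NUMERIC, CATEGORICAL)
```

On real data the imputation values and scaling statistics would normally be computed on the training split only and then reused on held-out data, so that no information leaks from the test set.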
Created 151 features from raw data, including patient demographics, medical history, and hospital outcomes.
Implemented and compared multiple machine learning models, addressing class imbalance through sampling techniques.
Addressed class imbalance (roughly 11% of records in the positive class) using SMOTE and random under-sampling.
Implemented imputation strategies for missing values in key features.
Selected and engineered features to maximize model performance.
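SMOTE (Synthetic Minority Over-sampling Technique): generated synthetic minority-class samples to balance the classes.
Random under-sampling: reduced the majority class to match the minority class in size.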
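To make the two resampling ideas concrete, here is a small dependency-free sketch. It is a simplified, SMOTE-style interpolation rather than a reference implementation, and the function names, the toy data, and the parameters `k` and `seed` are hypothetical. It assumes labels are 0 (majority) and 1 (minority), that the minority class is the smaller one, and that it contains at least two rows.

```python
import random


def random_undersample(X, y, seed=0):
    """Randomly drop majority-class (label 0) rows until both classes are the same size."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    keep = sorted(pos + rng.sample(neg, len(pos)))
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_like_oversample(X, y, k=3, seed=0):
    """Append synthetic minority (label 1) rows until both classes are the same size."""
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == 1]
    n_needed = sum(1 for label in y if label == 0) - len(minority)
    X_out, y_out = list(X), list(y)
    for _ in range(max(n_needed, 0)):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to the base row.
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        # The synthetic row lies on the segment between the base row and the neighbour.
        t = rng.random()
        X_out.append([a + t * (b - a) for a, b in zip(base, neighbour)])
        y_out.append(1)
    return X_out, y_out


# Toy data: 24 negatives and 3 positives, i.e. about 11% positive.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(24)]
X += [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(3)]
y = [0] * 24 + [1] * 3

X_us, y_us = random_undersample(X, y)     # 3 negatives + 3 positives
X_os, y_os = smote_like_oversample(X, y)  # 24 negatives + 24 positives
```

Either resampler is normally applied to the training split only, so that evaluation still reflects the original class balance.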
This model can help hospitals identify high-risk patients and implement preventive measures to reduce readmission rates.