Submission Type

Empirical Research


Outcomes Assessment; Data Analysis Method; Methods


Background: Machine learning is used to analyze big data, often for the purposes of prediction. Analyzing a patient’s healthcare utilization pattern may provide more precise estimates of risk for adverse events (AE) or death. We sought to characterize healthcare utilization prior to surgery using machine learning for the purposes of risk prediction.

Methods: Patients from the MarketScan Commercial Claims and Encounters Database who underwent elective surgery from 2007 to 2012 and had ≥1 comorbidity were included. All available healthcare claims occurring within six months prior to surgery were assessed. More than 300 predictors were defined by considering all combinations of conditions, encounter types, and timing, along with sociodemographic factors. We used a supervised Naïve Bayes algorithm to predict risk of AE or death within 90 days of surgery. We compared the model’s performance to that of the Charlson comorbidity index, a commonly used risk prediction tool.
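
As a rough illustration of the approach described in the Methods, the sketch below builds one binary predictor per combination of condition, encounter type, and timing window, then fits a Naïve Bayes classifier on those predictors. The category lists, the two timing windows, the Bernoulli variant with Laplace smoothing, and the toy patients are all assumptions made for illustration; the abstract does not specify them, and this is not the study's implementation.

```python
import math
from itertools import product

# Hypothetical vocabularies; the abstract does not list the actual categories.
CONDITIONS = ["cancer", "kidney_disease", "peripheral_vascular_disease"]
ENCOUNTERS = ["inpatient", "outpatient", "emergency"]
WINDOWS = ["0-3mo", "3-6mo"]  # assumed split of the six months before surgery

# One binary predictor per (condition, encounter type, timing window) combination.
FEATURES = list(product(CONDITIONS, ENCOUNTERS, WINDOWS))


def encode(claims):
    """Map a patient's (condition, encounter, window) claims to a 0/1 vector."""
    present = set(claims)
    return [1 if f in present else 0 for f in FEATURES]


def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate class priors and per-feature probabilities with Laplace smoothing."""
    model = {}
    n = len(y)
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = (len(rows) + alpha) / (n + 2 * alpha)
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(len(FEATURES))
        ]
        model[c] = (prior, probs)
    return model


def predict_proba(model, x):
    """Return the posterior probability of class 1 (AE or death) for one patient."""
    log_post = {}
    for c, (prior, probs) in model.items():
        lp = math.log(prior)
        for xj, pj in zip(x, probs):
            # Bernoulli likelihood: p if the claim is present, 1 - p if absent.
            lp += math.log(pj if xj else 1.0 - pj)
        log_post[c] = lp
    # Normalize in log space for numerical stability.
    m = max(log_post.values())
    exp = {c: math.exp(v - m) for c, v in log_post.items()}
    return exp[1] / (exp[0] + exp[1])


# Toy, made-up patients purely for illustration (not MarketScan data).
patients = [
    ([("cancer", "inpatient", "0-3mo")], 1),
    ([("kidney_disease", "emergency", "0-3mo"), ("cancer", "outpatient", "3-6mo")], 1),
    ([("kidney_disease", "outpatient", "3-6mo")], 0),
    ([], 0),
    ([("peripheral_vascular_disease", "outpatient", "3-6mo")], 0),
]
X = [encode(claims) for claims, _ in patients]
y = [label for _, label in patients]
model = fit_bernoulli_nb(X, y)

high = predict_proba(model, encode([("cancer", "inpatient", "0-3mo")]))
low = predict_proba(model, encode([]))
```

In this toy setup, a patient with a recent inpatient cancer claim receives a higher predicted risk than a patient with no claims. With a rare outcome such as the one reported here, the threshold applied to these probabilities would need to be chosen with care.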

Results: Among 410,521 patients (mean age 52 ± 9.4 years, 56% female), 4.7% had an AE and 0.01% died. The Charlson comorbidity index predicted 57% of AEs and 59% of deaths. The Naïve Bayes algorithm predicted 79% of AEs and 78% of deaths. Claims for cancer, kidney disease, and peripheral vascular disease were the primary drivers of AE or death following surgery.

Conclusions: The use of machine learning algorithms improves upon one commonly used risk estimator. Precisely quantifying the risk of an AE following surgery may better inform patient-centered decision-making and direct targeted quality improvement interventions while supporting activities of accountable care organizations that rely on accurate estimates of population risk.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.