Domain

Clinical Informatics

Type

Conceptual or Process Model/Framework

Theme

effectiveness; quality

Start Date

7-6-2014 10:25 AM

End Date

7-6-2014 11:45 AM

Structured Abstract

Objectives

To identify the characteristics of an optimal predictive modeling framework for personalized prediction of adverse drug events.

Methods

We selected 3 well-established adverse drug events: hyperkalemia due to ACE inhibitors (frequent), hyponatremia due to SSRIs (infrequent), and rhabdomyolysis due to statins (rare). After developing phenotypes for each outcome based on diagnostic codes (ICD-9) and laboratory codes (LOINC), we used a large de-identified database of 2.2 million patients (stored in OMOP Common Data Model format) to identify new users of these medications. To determine which method performed best in predicting the target outcome within 30, 90, and 365 days of drug initiation, we then compared three predictive modeling methods: multiple logistic regression, classification and regression trees (CART), and Random Forest (RF). The derived models were then applied to cohorts who were treated for the same medical condition but who did not receive the target medications.
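The outcome labeling implied by the three prediction horizons can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the function name, the data layout, and the choice to exclude events on the day of drug initiation are all assumptions made for the example.

```python
from datetime import date

# Prediction horizons named in the Methods (days after drug initiation).
WINDOWS_DAYS = (30, 90, 365)


def label_outcome_windows(index_date, event_dates, windows=WINDOWS_DAYS):
    """Flag, for each window, whether any outcome event falls inside it.

    index_date  -- date of drug initiation for a new user (hypothetical field)
    event_dates -- dates on which the outcome phenotype was met (hypothetical)

    Assumption: an event counts only if it occurs strictly after the index
    date and no more than `window` days later.
    """
    labels = {}
    for window in windows:
        labels[window] = any(
            0 < (event - index_date).days <= window for event in event_dates
        )
    return labels


# One hypothetical patient: drug started 1 Jan 2012, outcome 45 days later.
example = label_outcome_windows(date(2012, 1, 1), [date(2012, 2, 15)])
print(example)
```

Each window yields its own binary target, so a separate model can be fitted per horizon from the same cohort.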

Findings

Across all drug-outcome pairs, Random Forest performed best (mean AUC 80.3%), followed by logistic regression (mean AUC 75.7%) and CART (mean AUC 71.4%). Models of the more common adverse events (hyperkalemia and hyponatremia) performed better (AUC 80-85% for RF) than models of the rare event rhabdomyolysis (AUC 70-77% for RF). We observed similar performance when applying our models to identify unexposed patients at risk of these outcomes (mean RF AUC 84% for hyperkalemia, 82% for hyponatremia, and 76% for rhabdomyolysis). This suggests that adverse event models may be generalized by outcome rather than built on a per-drug basis. Despite good discrimination, the PPVs of even the best models were mediocre (35% for hyponatremia, 28% for hyperkalemia, and 2% for rhabdomyolysis) given the infrequency of events. Restricting attention to the 10% of cases at highest risk improved these only modestly, to 38%, 37%, and 4%, respectively.
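The two metrics reported here, AUC and PPV among the highest-risk cases, can be computed as in the sketch below. It uses only the standard library and a toy set of scores; the function names and data are illustrative assumptions, not the study's code or results.

```python
def auc(scores, labels):
    """Rank-based AUC: the probability that a randomly chosen positive case
    is scored above a randomly chosen negative case (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ppv_top_fraction(scores, labels, fraction=0.10):
    """PPV among the `fraction` of patients with the highest predicted risk."""
    k = max(1, round(len(scores) * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(y for _, y in ranked[:k]) / k


# Toy data (hypothetical): 10 patients, 2 of whom had the outcome.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

print("AUC:", auc(toy_scores, toy_labels))
print("PPV, top 10%:", ppv_top_fraction(toy_scores, toy_labels, 0.10))
print("PPV, top 20%:", ppv_top_fraction(toy_scores, toy_labels, 0.20))
```

The pairwise AUC is quadratic in cohort size, which is fine for illustration; on a database of millions of patients a sort-based implementation would be the practical choice.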

Discussion

Our work suggests that random forest is the optimal method, among those compared, for modeling adverse event risk, regardless of event frequency (frequent, infrequent, or rare). Additionally, this study indicates that risk models may be developed on a per-outcome basis rather than as a unique model for each drug-outcome pair. This greatly lowers the barriers to developing personalized clinical decision support systems because it reduces the number of models that must be created. A second important finding was that even models with high AUC have low positive predictive values when events are rare. Because the likelihood of alert fatigue is high when warnings have low specificity, it is advisable to raise the alert threshold for rare events so that alerts fire only for patients at the highest risk levels (e.g., top 10%).
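The link between event rarity and low PPV follows from Bayes' rule: PPV = (sensitivity x prevalence) / (sensitivity x prevalence + (1 - specificity) x (1 - prevalence)). The sketch below evaluates this identity at a fixed operating point for several prevalences. The sensitivity, specificity, and prevalence values are illustrative assumptions, not figures from the study.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("no patients are flagged at these settings")
    return true_pos / (true_pos + false_pos)


# Same hypothetical classifier (80% sensitivity, 80% specificity) applied
# to outcomes of decreasing frequency.
for prevalence in (0.10, 0.01, 0.001):
    value = ppv(0.80, 0.80, prevalence)
    print(f"prevalence {prevalence:.3f}: PPV {value:.3f}")
```

With the operating point held fixed, PPV falls as the outcome becomes rarer, which is why raising the alert threshold (trading sensitivity for specificity) is the main lever for rare events.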

Conclusion

An optimal, practical framework for personalized clinical decision support regarding adverse drug events will feature: 1) use of random forest for model creation; 2) reliance on outcome-based models rather than individual models for each drug-outcome pair; and 3) application of models with higher PPV (even if their AUC is slightly lower than that of the optimized model) to reduce alert fatigue.

Acknowledgements

This research was supported through a collaboration with Merck (Merck Sharp & Dohme, Inc.).

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.

Jun 7th, 10:25 AM Jun 7th, 11:45 AM

Characterizing an optimal predictive modeling framework for prediction of adverse drug events
