Submission Type

Empirical Research


Keywords

algorithm, electronic health record, cancer


Introduction/Objective: The objective of this study was to develop an algorithm to identify Kaiser Permanente Colorado (KPCO) members with a history of cancer.

Background: Tumor registries identify incident cancer with high precision but are not designed to capture prevalent cancer within a population. Because we sought to identify a cohort of adults with no history of cancer, we could not rely solely on the tumor registry.

Methods: We included all KPCO members aged 40 to 75 years who were continuously enrolled during 2013 (N=201,787). Data from the tumor registry, chemotherapy files, and inpatient and outpatient claims were used to create an algorithm to identify members with a high likelihood of cancer. We validated the algorithm against chart review and calculated sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for occurrence of cancer.
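As an illustration of the validation step, the four measures named above can be computed from a 2x2 table that compares the algorithm's flag with the chart-review result. The following is a minimal Python sketch, not the study's code: the class and function names are hypothetical, and the counts are made-up illustrative values, not study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of algorithm flag vs. chart review (the reference standard)."""
    tp: int  # algorithm flags cancer, chart review confirms cancer
    fp: int  # algorithm flags cancer, chart review finds no cancer
    fn: int  # algorithm does not flag, chart review finds cancer
    tn: int  # algorithm does not flag, chart review finds no cancer


def validation_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV, and NPV as proportions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of true cancers flagged
        "specificity": c.tn / (c.tn + c.fp),  # share of non-cancers not flagged
        "ppv": c.tp / (c.tp + c.fp),          # share of flagged members with cancer
        "npv": c.tn / (c.tn + c.fn),          # share of unflagged members without cancer
    }


# Hypothetical counts for illustration only; these are NOT the study's results.
example = ConfusionCounts(tp=40, fp=10, fn=5, tn=45)
metrics = validation_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

PPV and NPV depend on how common cancer is in the reviewed sample, so they apply to the target population only if charts were sampled in a way that preserves that prevalence.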

Findings: The final version of the algorithm achieved a sensitivity of 100% and a specificity of 84.6% for identifying cancer. Had we relied on the tumor registry alone, 47% of members with a history of cancer would have been missed.

Discussion: Using the tumor registry alone to identify a cohort of patients with prior cancer is not sufficient. In the final version of the algorithm, the sensitivity and PPV were improved when a diagnosis code for cancer was required to accompany oncology visits or chemotherapy administration.
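The refinement described above, requiring a cancer diagnosis code alongside oncology visits or chemotherapy administration, can be pictured as a change to a flagging rule. The Python sketch below is an assumption-laden illustration, not the published algorithm: the field names are hypothetical, the evidence is simplified to member-level booleans, and the abstract does not specify how the real algorithm combines its data sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemberEvidence:
    """Hypothetical member-level indicators derived from the data sources."""
    in_tumor_registry: bool
    has_oncology_visit: bool
    has_chemotherapy: bool
    has_cancer_dx_code: bool


def flag_without_dx_requirement(m: MemberEvidence) -> bool:
    """Assumed earlier rule: any oncology visit or chemotherapy flags the member."""
    return m.in_tumor_registry or m.has_oncology_visit or m.has_chemotherapy


def flag_with_dx_requirement(m: MemberEvidence) -> bool:
    """Assumed final rule: oncology visits or chemotherapy count only
    when accompanied by a cancer diagnosis code."""
    treated = m.has_oncology_visit or m.has_chemotherapy
    return m.in_tumor_registry or (treated and m.has_cancer_dx_code)


# Illustrative member: chemotherapy administered, but no cancer diagnosis code
# (for example, a chemotherapy agent given for a non-cancer condition).
chemo_no_dx = MemberEvidence(
    in_tumor_registry=False,
    has_oncology_visit=False,
    has_chemotherapy=True,
    has_cancer_dx_code=False,
)
print(flag_without_dx_requirement(chemo_no_dx))
print(flag_with_dx_requirement(chemo_no_dx))
```

In this sketch, the stricter rule stops flagging members whose only evidence is treatment or a visit without a cancer diagnosis, while members in the tumor registry are always flagged.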

Conclusion: Electronic medical record (EMR) data can be used effectively in combination with tumor registry data to identify health plan members with a history of cancer.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.