The SPAN research team and the Kaiser Permanente Colorado Human Subjects Protections team collaborated to develop a comprehensive review process for the collection, storage, and future research use of data stored in data repositories. As with research that involves biorepositories, the prospective research conducted with data repositories presented minimal risk to study subjects because the data were retrospective and patients’ identities were protected through the assignment of a randomly generated study identifier. The file linking study identifiers to patient identities was never shared, thus preventing reidentification of individual patients.
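The identifier scheme described above can be illustrated with a minimal sketch. It is not the SPAN implementation: the function names, the `patient_id` and `study_id` field names, and the 16-character identifier length are all assumptions made for illustration. The point it shows is that the linking file and the shareable data are kept as separate objects, and only the latter leaves the site.

```python
import secrets


def assign_study_ids(patient_ids):
    """Build a linking table {patient_id: random study_id}.

    Hypothetical helper: the identifiers are random, so they carry no
    information about the patient. The returned table is the "linking
    file" and is meant to stay with the data holder.
    """
    linking = {}
    used = set()
    for pid in patient_ids:
        if pid in linking:
            continue  # a repeat patient keeps the same study identifier
        sid = secrets.token_hex(8)  # 16 hex characters (assumed length)
        while sid in used:  # regenerate on the unlikely event of a collision
            sid = secrets.token_hex(8)
        used.add(sid)
        linking[pid] = sid
    return linking


def deidentify(records, linking, id_field="patient_id"):
    """Return copies of the records with the patient ID replaced by the study ID."""
    shared = []
    for rec in records:
        row = {k: v for k, v in rec.items() if k != id_field}
        row["study_id"] = linking[rec[id_field]]
        shared.append(row)
    return shared


# Illustrative data only; these are not real records.
records = [
    {"patient_id": "MRN001", "hba1c": 7.2},
    {"patient_id": "MRN002", "hba1c": 6.1},
    {"patient_id": "MRN001", "hba1c": 7.0},
]
linking = assign_study_ids(r["patient_id"] for r in records)
shared = deidentify(records, linking)
```

In this sketch only `shared` would be passed to analysts, while `linking` stays behind. Removing the direct identifier is one step and does not by itself address other potentially identifying fields.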

The teams collaborated to develop specific application procedures for future research utilizing these repositories. These procedures enabled the study teams to highlight similarities and differences in the proposed “substudies” and efficiently and effectively emphasize areas that could increase the risk of the research.

To assist in the administrative oversight of these studies, the teams developed naming and numbering conventions (see, for example, the SPAN Modifications and Sub-Studies Tracking Sheet) to track modification decisions, substudy lead PI, participating sites, duration of substudy, and subcontract changes for both the main study (research repository) and substudies.
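For sites that prefer to keep such a tracking sheet in a structured form, the tracked items could be modeled roughly as follows. This is a sketch under stated assumptions: the class and field names are hypothetical, and the `SPAN-SS01` identifier pattern is an invented example of a numbering convention, not the convention used on the actual SPAN tracking sheet.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Modification:
    """One modification request and the decision made on it."""
    number: int
    description: str
    decision: str  # assumed values: "pending", "approved", "not approved"
    decided_on: Optional[date] = None


@dataclass
class SubstudyRecord:
    """One row of a hypothetical tracking sheet."""
    substudy_id: str
    lead_pi: str
    participating_sites: List[str]
    start: date
    end: date
    subcontract_changes: List[str] = field(default_factory=list)
    modifications: List[Modification] = field(default_factory=list)

    def duration_months(self) -> int:
        """Whole calendar months between the start and end dates."""
        return (self.end.year - self.start.year) * 12 + (self.end.month - self.start.month)

    def pending(self) -> List[Modification]:
        """Modifications still awaiting a decision."""
        return [m for m in self.modifications if m.decision == "pending"]


def make_substudy_id(main_study: str, seq: int) -> str:
    """Hypothetical numbering convention: main study name plus a two-digit sequence."""
    return f"{main_study}-SS{seq:02d}"


# Illustrative entry; the names, sites, and dates are placeholders.
rec = SubstudyRecord(
    substudy_id=make_substudy_id("SPAN", 1),
    lead_pi="Dr. Example",
    participating_sites=["Site A", "Site B"],
    start=date(2011, 1, 1),
    end=date(2012, 7, 1),
)
rec.modifications.append(Modification(1, "Add Site C", "pending"))
```

Keeping modifications as a list on each record lets one row carry the full decision history for a substudy, which mirrors the purpose of the tracking sheet without assuming its exact layout.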

Publication Date


Type of Governance Resource

Process Guidance

Other Data Type


Generalizability to Other Settings

The SPAN example can be modified to track modification decisions, substudy lead PI, participating sites, duration of substudy, and subcontract changes for prospective research conducted with data repositories.

Network Description

The objectives of the Scalable PArtnering Network for Comparative Effectiveness Research, or SPAN Network, were to develop a distributed research network that 1) was interoperable across a range of health-care systems; 2) permitted menu-driven querying of data; and 3) utilized patient-level data for analyses. The architecture for the SPAN Network is based on an existing distributed research network (the DEcIDE DRN2 network). The SPAN Network included 11 data partners: 9 from an existing research network and 2 community partners with different delivery systems, data structures, and patient populations.

Geographic scope type


Acknowledgement of Funders

Support was provided by Grant R01HS019912 from the Agency for Healthcare Research and Quality. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Agency for Healthcare Research and Quality.

Additional Information

Because the hypotheses were not defined until the data infrastructure development was well underway, and only 36 months were available to complete all study activities, creating a process to quickly review proposed substudies was key. Using this process helped the study team comply with IRB requirements and expedite study activities. It was important to keep careful track of the status of these substudies (submission, approval, site participation, etc.), and this form was a useful tool for doing so. Collaborating with your regulatory and compliance staff at the outset and throughout studies like this one is an essential step to achieving project goals.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.