Thursday, October 24, 2019
Data Warehouse Case Study Essay
History of the CDR
When the project began in 1995–96, the CDR, initially referred to as the "clinical research database," was intended to support and enhance clinical research at the University of Virginia by providing clinicians, students, and researchers with direct, rapid access to retrospective clinical and administrative patient data. Reflecting this intent, the system was funded by the School of Medicine and housed in the Academic Computing Health Sciences group, which is distinct from the medical center's IT group. With considerable assistance and cooperation from data owners and stewards, legacy data from several different sources were loaded into a single relational database and periodically updated. Authorized users accessed the CDR through a standard Web browser and viewed or downloaded data to their personal computers for further analysis. Initially, emphasis was placed on getting the CDR running as quickly as possible and with a minimum of resources; consequently, extensive transformation of data to an enterprise data model was not performed.
The CDR project team consists of 2.5–3.0 FTEs (full-time equivalents): one developer, one developer-database administrator, and portions of analyst, clinician, and administrative FTEs. To date, the costs of developing and operating the CDR have been approximately $200,000 per year, underwritten by the School of Medicine. Over the course of the project, there have been significant enhancements to the user interface, incorporation of additional data sources, and the development of an integrated data model.
There has also been increasing interest in using the CDR to serve a broader audience than researchers and to support management and administrative functions: "to meet the challenge of providing a way for anyone with a need to know, at every level of the organization, access to accurate and timely data necessary to support effective decision making, clinical research, and process improvement." In the area of education, the CDR has become a core teaching resource for the Department of Health Evaluation Science's master's program and for the School of Nursing. Students use the CDR to understand and master informatics issues such as data capture, vocabularies, and coding, as well as to perform exploratory analyses of healthcare questions. Starting in Spring 2001, the CDR will also be introduced into the university's undergraduate medical curriculum.
Case Study: A Data Warehouse for an Academic Medical Center
System Description
Following is a brief overview of the CDR application as it exists at the University of Virginia.
System Architecture. The CDR is a relational data warehouse that resides on a Dell PowerEdge 1300 (dual Intel 400MHz processors, 512MB RAM) running the Linux operating system and the Sybase 11.9.1 relational database management system. For storage, the system uses a Dell PowerVault 201S 236GB RAID disk array. As of October 2000, the database contained 23GB of information about 5.4 million patient visits (16GB visit data, 7GB laboratory results). Data loading into Sybase is achieved using custom Practical Extraction and Report Language (Perl) programs.
CDR Contents. The CDR currently draws data from four independent systems (see Table 1). In addition, a number of derived values (for example, number of days to next inpatient visit, number of times a diagnostic code is used in various settings) are computed to provide summary information for selected data elements.
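As an illustration of one such derived value, the following is a minimal Python sketch of computing the number of days from each visit to the same patient's next inpatient visit. It is not the CDR's actual Perl or Sybase code. The record layout (`patient_id`, `visit_id`, `visit_type`, `admit_date`) is hypothetical, and the choice to measure from admit date to admit date is an assumption, since the source does not say which dates are used.

```python
from collections import defaultdict
from datetime import date


def days_to_next_inpatient_visit(visits):
    """Map each visit_id to the days until the patient's next inpatient visit.

    `visits` is a list of dicts with hypothetical keys: patient_id, visit_id,
    visit_type ("inpatient" or "outpatient"), and admit_date (datetime.date).
    Visits with no later inpatient visit map to None.
    """
    by_patient = defaultdict(list)
    for visit in visits:
        by_patient[visit["patient_id"]].append(visit)

    result = {}
    for patient_visits in by_patient.values():
        patient_visits.sort(key=lambda v: v["admit_date"])
        next_inpatient = None  # admit date of the nearest later inpatient visit
        # Walk backward in time so the "next" inpatient date is always known.
        for visit in reversed(patient_visits):
            if next_inpatient is None:
                result[visit["visit_id"]] = None
            else:
                result[visit["visit_id"]] = (next_inpatient - visit["admit_date"]).days
            if visit["visit_type"] == "inpatient":
                next_inpatient = visit["admit_date"]
    return result


# Made-up example records, for illustration only.
visits = [
    {"patient_id": 1, "visit_id": "A", "visit_type": "inpatient", "admit_date": date(2000, 1, 1)},
    {"patient_id": 1, "visit_id": "B", "visit_type": "outpatient", "admit_date": date(2000, 1, 10)},
    {"patient_id": 1, "visit_id": "C", "visit_type": "inpatient", "admit_date": date(2000, 1, 31)},
    {"patient_id": 2, "visit_id": "D", "visit_type": "outpatient", "admit_date": date(2000, 3, 1)},
]
derived = days_to_next_inpatient_visit(visits)
print(derived)
```

In this made-up data, visit A is followed 30 days later by inpatient visit C, so A maps to 30 and B to 21, while C and D have no later inpatient visit and map to None. Precomputing a value like this at load time spares users from writing a self-join for a common question such as readmission intervals.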
Data from each of these source systems are integrated into the CDR's data model.
Table 1. Contents of the CDR
Type of Data | Source of Data | Description | Available Dates
-------------|----------------|-------------|----------------
Inpatient, outpatient visits | Shared Medical Systems | Patient registration and demographic data, diagnoses, procedures, unit and census information, billing transactions, including medications, costs, charges, reimbursement, insurance information | Jul 1993–Jun 2000
Professional billing | IDX billing system | Physician billing transactions from inpatient and outpatient visits, diagnoses, and procedures | Oct 1992–Jun 2000
Laboratory results | HL-7 messages from SunQuest Lab System | Laboratory test results | Jan 1996–Jun 2000
Cardiac surgery | Cardiac surgery outcomes data (defined by Society of Thoracic Surgeons) | Clinical details for thoracic surgery cases | Jul 1993–Jun 2000
Einbinder, Scully, Pates, Schubart, Reynolds
In addition to the current contents listed in Table 1, users and the CDR project team have identified additional data elements that might be incorporated into the CDR, including microbiology results, discharge summaries (and other narrative data), outpatient prescribing information, order entry details, and tumor registry information. As of October 2000, we have just finished incorporating death registry data from the Virginia Department of Health into the CDR. These data will provide our users with direct access to more comprehensive mortality outcomes data than are contained in local information systems, which generally are restricted to an in-hospital death indicator.
User Interface. The user interface runs in a standard Web browser and consists of a data dictionary, a collection of common gateway interface (CGI) programs implemented using the "C" programming language, and JavaScript-enabled HTML pages.
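To make the data-dictionary-driven design more concrete, here is a minimal Python sketch of how a user's point-and-click conditions could be turned into a parameterized SQL statement. It is an illustration only: the CDR itself uses C CGI programs against Sybase, and the dictionary entries, the `visit_summary` table, the column names, and the `?` placeholder style are all hypothetical.

```python
# Hypothetical data dictionary: each user-facing field maps to a column
# and to the comparison operators the interface allows for it.
DATA_DICTIONARY = {
    "visit_date": {"column": "admit_date", "ops": {"between"}},
    "diagnosis_code": {"column": "diagnosis_code", "ops": {"=", "in"}},
    "lab_test_code": {"column": "lab_test_code", "ops": {"=", "in"}},
}


def build_query(conditions):
    """Build (sql, params) from a list of (field, operator, value) conditions.

    Only fields and operators listed in DATA_DICTIONARY are accepted, and
    values are passed as bound parameters rather than pasted into the SQL.
    """
    clauses, params = [], []
    for field, op, value in conditions:
        if field not in DATA_DICTIONARY:
            raise ValueError(f"unknown field: {field}")
        entry = DATA_DICTIONARY[field]
        if op not in entry["ops"]:
            raise ValueError(f"operator {op!r} not allowed for {field}")
        column = entry["column"]
        if op == "between":
            low, high = value
            clauses.append(f"{column} BETWEEN ? AND ?")
            params.extend([low, high])
        elif op == "in":
            placeholders = ", ".join("?" for _ in value)
            clauses.append(f"{column} IN ({placeholders})")
            params.extend(value)
        else:  # simple equality
            clauses.append(f"{column} = ?")
            params.append(value)

    sql = "SELECT visit_id FROM visit_summary"  # hypothetical table
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


# Example: a population defined by a date range and two diagnostic codes.
sql, params = build_query([
    ("visit_date", "between", ("1999-01-01", "1999-12-31")),
    ("diagnosis_code", "in", ["410.0", "410.1"]),
])
print(sql)
print(params)
```

Keeping field names, columns, and permitted operators in one dictionary means that adding a data element is a matter of adding an entry rather than changing the query-building code, which is the general idea behind a dictionary-controlled interface.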
Structured query language (SQL) statements are generated automatically in response to point-and-click actions by the user, enabling submission of ad hoc queries without prior knowledge of SQL. The SQL queries are sent to the CGI programs that query the database and return results in dynamically created HTML pages. The entire process is controlled by the contents of the data dictionary, which is used to format SQL results, set up HTML links for data drill-down, and provide on-line help. Data may be downloaded immediately into Microsoft Excel or another analysis tool on the user's workstation.
Query Formulation. Most CDR users use the Guided Query function to retrieve data. This process involves three steps:
1. Define a population of interest by setting conditions, for example, date ranges, diagnostic codes, physician identifiers, service locations, and lab test codes or values.
2. Submit the query, specifying how much data the CDR should return (all matching data or a specified number of rows).
3. After the CDR returns the population of interest, use the Report Menu to explore various attributes of the population on a case-by-case or group level. Custom reports can also be defined, and the results of any report can be downloaded into Microsoft Excel, Access, or another analysis tool.
Generally, the query process requires several iterations to modify the population conditions or report options. In addition, "browsing" the data may help the user generate ideas for additional queries. We believe that it is helpful for end users to go through this query process themselves, to directly engage the data. However, many users, especially those with a pressing need for data for a meeting, report, or grant, prefer to use CDR team members as intermediaries or analysts. To date, we have attempted to meet this preference, but as query volume increases, our ability to provide data in a timely manner may fall off.
Security.
A steering committee of clinicians guided the initial development of the CDR and established policies for its utilization and access. Only authorized users may log onto the CDR. To protect confidentiality, all patient and physician identifying information has been partitioned into a "secure" database. Translation between disguised identifiers and actual identifiers, in either direction, is possible but requires a written request and appropriate approval (for example, from a supervisor or the human investigations committee). All data transmitted from the database server to the user's browser are encrypted using the secure Netscape Web server, and all accesses to the database are logged. In addition, CDR access is restricted to personal computers that are part of the "Virginia.edu" domain or that are authenticated by the university's proxy server.
Evaluation
Understanding user needs is the basis for improving the CDR to enable users to retrieve the data independently and to increase usage of the CDR at our institution. Thus, assessing the value of the CDR (how well we meet our users' needs and how we might increase our user base) has been an important activity that has helped guide planning for changes and enhancements and for allocation of our limited resources. Efforts to evaluate the CDR have included several approaches:
• Monitoring user population and usage patterns
• Administering a CDR user survey
• Tracking queries submitted to the CDR and performing follow-up telephone interviews
Usage Statistics. Voluntary usage of an IS resource is an important measure of its value and of user satisfaction [5]. However, usage of a data warehouse is likely to be quite different from that of other types of information resources, such as clinical information systems. A clinical system is likely to be used many times per day; a data warehouse may be used sporadically.
Thus, although we monitor system usage as a measure of the CDR's value, we believe that frequency of usage cannot be viewed in isolation in assessing the success of a data warehouse. Since the CDR went "live," more than 300 individuals have requested and obtained logon IDs. As of September 30, 2000, 213 individuals had logged on and submitted at least one query. This number does not include usage by CDR project team members and does not reflect analyses performed by team members for end users. Figure 1 shows the cumulative number of active users (those who submitted a query) and demonstrates a linear growth pattern.
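The "cumulative number of active users" measure can be stated precisely with a short sketch. The Python below is an assumption-laden illustration, not the authors' method: the log format (pairs of user ID and query date), the monthly grouping, and the sample entries are all hypothetical. It counts each user once, in the month of that user's first query, and excludes project team members as the text describes.

```python
from datetime import date


def cumulative_active_users(query_log, excluded_users=frozenset()):
    """Return [(YYYY-MM, cumulative active users)] from (user_id, date) pairs.

    A user becomes "active" in the month of their first submitted query.
    Users in `excluded_users` (for example, project team members) are skipped.
    Months in which no new user appears are omitted from the result.
    """
    first_query = {}
    for user_id, when in query_log:
        if user_id in excluded_users:
            continue
        if user_id not in first_query or when < first_query[user_id]:
            first_query[user_id] = when

    new_per_month = {}
    for when in first_query.values():
        key = (when.year, when.month)
        new_per_month[key] = new_per_month.get(key, 0) + 1

    total, series = 0, []
    for year, month in sorted(new_per_month):
        total += new_per_month[(year, month)]
        series.append((f"{year}-{month:02d}", total))
    return series


# Made-up log entries, for illustration only.
log = [
    ("u1", date(2000, 1, 5)),
    ("u2", date(2000, 1, 20)),
    ("u1", date(2000, 2, 1)),    # repeat user, not counted again
    ("team", date(2000, 2, 2)),  # project team member, excluded
    ("u3", date(2000, 3, 3)),
]
series = cumulative_active_users(log, excluded_users={"team"})
print(series)
```

Because repeat queries never add to the count, a series like this measures how far the system has reached into its potential user base rather than how often it is used, which fits the point that frequency of use alone is a poor gauge of a data warehouse.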