Corresponding author: Michelle Kho, McMaster University, Program in Health Research Methodology, Department of Clinical Epidemiology and Biostatistics, 1200 Main Street West, MDCL 3200, Hamilton, ON, Canada, L8N 3Z5 (e-mail: khome{at}mcmaster.ca).
| Abstract |
|---|
Objective To determine reliability of APACHE II scores calculated by a clinical information system and by health care personnel before and after a multifaceted quality improvement intervention.
Methods APACHE II scores of 37 consecutive patients admitted to a closed, 15-bed, university-affiliated intensive care unit were collected by a research coordinator, a database clerk, and a clinical information system. After a quality improvement intervention focused on health care personnel and the clinical information system, the same methods were used to collect data on 32 consecutive patients. The research coordinator and the clerk did not know each other's scores or the information system's score. The data analyst did not know the source of the scores until analysis was complete.
Results APACHE II scores obtained by the clerk and the research coordinator were highly reliable (intraclass correlation coefficient, 0.88 before vs 0.80 after intervention; P = .25). No significant changes were detected after the intervention; however, compared with scores of the research coordinator, the overall reliability of APACHE II scores calculated by the clinical information system improved (intraclass correlation coefficient, 0.24 before vs 0.91 after intervention; P < .001).
Conclusions After completion of a quality improvement intervention, health care personnel and a computerized clinical information system calculated sufficiently reliable APACHE II scores for clinical, research, and administrative purposes.
In a previous study,9 we documented that baseline APACHE II scores collected by 2 research clerks and an ICU research coordinator had excellent reliability (intraclass correlation coefficient [ICC], 0.90). However, 2 APACHE II components, the chronic health index (CHI) and the verbal component (GCS-V) of the Glasgow Coma Scale (GCS) score, were less reliable (ICC, 0.65 and 0.40, respectively). Polderman et al7 improved the reliability of APACHE II scores from 0.71 to 0.85 through standardized data collection and specific training sessions. Using principles similar to those applied by Polderman et al,7 we sought to improve the less reliable components of the APACHE II score in our ICU.
This prospective before-and-after study had 3 objectives: (1) document the reliability of APACHE II scores recorded by a clinical information system, a database clerk, and a research coordinator, (2) implement a multifaceted, multidisciplinary quality improvement intervention to improve the reliability of APACHE II scores, and (3) reevaluate the reliability of APACHE II scores after the intervention.
| Materials and Methods |
|---|
| The APACHE II score consists of the acute physiology score, age, and the chronic health index.
|
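The composition stated above can be sketched as a simple sum. This is an illustrative sketch only, not the CareVue calculation: the function names are hypothetical, the age bands and the permitted chronic health values (0, 2, or 5 points) are taken from the standard published APACHE II definition rather than from this article, and the acute physiology score is assumed to have been computed already.

```python
def age_points(age_years: int) -> int:
    """Age points, assuming the standard published APACHE II age bands."""
    if age_years < 0:
        raise ValueError("age_years must be non-negative")
    if age_years <= 44:
        return 0
    if age_years <= 54:
        return 2
    if age_years <= 64:
        return 3
    if age_years <= 74:
        return 5
    return 6


def apache_ii_total(aps: int, age_years: int, chi_points: int) -> int:
    """Total score = acute physiology score + age points + chronic health points.

    `aps` is a precomputed acute physiology score; `chi_points` is assumed
    to be 0, 2, or 5, as in the standard APACHE II definition.
    """
    if aps < 0:
        raise ValueError("aps must be non-negative")
    if chi_points not in (0, 2, 5):
        raise ValueError("chi_points must be 0, 2, or 5")
    return aps + age_points(age_years) + chi_points


# Hypothetical patient: APS 15, age 67 years, 5 chronic health points.
print(apache_ii_total(aps=15, age_years=67, chi_points=5))  # 25
```

Because the total is a plain sum, an undocumented chronic health item lowers the total directly, which is one reason incomplete charting can affect the reliability of the overall score.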
Baseline Data Collection (2 Months)
We previously reported the data collection methods for the baseline phase of the study.9 Briefly, we recorded APACHE II scores calculated for consecutive patients admitted to the ICU by a database clerk and a research coordinator. We excluded patients if their ICU stay was less than 24 hours. In addition, we collected APACHE II scores from our bedside clinical information system, CareVue Classic (CareVue, Philips, Andover, Massachusetts), which provides new baseline data for this report. CareVue is an electronic medical record system for critically ill patients that collects data on vital signs, ventilation settings, intravenous infusions, nursing and medical assessments, and laboratory values. Data are uploaded hourly unless otherwise specified by bedside nurses. APACHE II data elements were set to autocalculate daily.
Quality Improvement Intervention (5 Months)
The focus of the intervention was improving data collected by health care personnel and the clinical information system. We divided the intervention into 6 categories (Table 1), including reconfiguration of the CareVue system and 5 interventions10 specifically targeted to nursing staff, who are responsible for ensuring that APACHE II data are available in the clinical information system as part of the nurses' routine assessment and charting. In the CareVue reconfiguration, we made the CHI questions of the APACHE II visible on the initial default screen, on the basis of input from our ICU working group, and reset the timing of the score to run 24 hours after the patient was admitted to the ICU. As a second part of the CareVue reconfiguration, we modified the calculation variables. Once we identified significant differences in reliability between the CareVue scores and the research coordinator's scores, we systematically examined each of the APACHE II data points contributing to the calculation in CareVue. Through this process, we identified 7 specific data elements that required modifications in CareVue (Table 1).
We implemented 5 different multimodal interventions aimed at nursing staff to improve documentation of the CHI and the GCS-V score: education, point-of-care electronic reminders, prompts from local opinion leaders, provision of audit and feedback, and policy dissemination.10 Our nurse educator and nurse informatician conducted in-service training sessions on how to document CHI components and GCS-V scores in intubated patients. Further, we posted information sheets and electronic resources at each computer workstation to reinforce the training sessions. We programmed point-of-care electronic reminders to be sent twice daily to reinforce CHI documentation. For each new admission, local nurse opinion leaders and charge nurse champions prompted bedside nurses to complete CHI documentation, and the nurse informatician provided individual audit and feedback to each bedside nurse. Finally, our ICU working group codified the documentation requirements for the APACHE II score as a formal policy through the hospital's intranet.
| Interventions targeted at chronic health and GCS items did not result in significant changes.
|
After 5 months, we gradually decreased the frequency of all interventions until they ceased. The APACHE II autocalculation continued to provide point-of-care scores, per our written ICU policy. During the intervention, APACHE II scores were calculated as usual by CareVue and by the database clerk; however, the research coordinator collected APACHE II scores only for patients enrolled in clinical studies. For the purposes of this study, while the intervention was occurring, we did not analyze any APACHE II scores.
Reevaluation (3 Months)
The data collection methods during the reevaluation phase were the same as those used before the intervention. A different data clerk, who was blinded to the APACHE II score calculations and source, entered information from the 3 different raters into a database. The database clerk and research coordinator had no knowledge of each other's scores or of the CareVue scores before and after the intervention. Because human performance can improve when people are aware that their behavior is being observed (the Hawthorne effect) or evaluated (the sentinel effect), both before and after the intervention, the bedside nurses were unaware of the conduct of the study. However, because the purpose of the intervention was to improve APACHE II documentation, we explicitly exposed the bedside nurses to the 6 components of the quality improvement intervention.
Patient care was at the discretion of the ICU team throughout the study. This study was approved by the St Joseph's Health Care Research Ethics Board, which waived the need for informed consent because the study did not affect patient care.
Sample Size Calculation and Data Analysis
We calculated interrater reliability by using the ICC, and we calculated ICCs for the APACHE II (total score, APS, age, and CHI) and GCS score (total, verbal, motor, and eyes) components. For each phase, we calculated a sample size of 32 patients to test whether an obtained reliability of 0.90 exceeded a reliability of 0.80, given 3 raters, a 1-tailed α = .05, and a power of 80%.11 To ensure we had sufficient observations, we enrolled an additional 5 patients. Reliability was classified as follows: slight, 0.0–0.20; fair, 0.21–0.40; moderate, 0.41–0.60; substantial, 0.61–0.80; and almost perfect, 0.81–1.00.12
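To make the reliability measure concrete, the sketch below computes an ICC from a patients-by-raters table and maps it onto the classification bands listed above. It is a minimal illustration under stated assumptions, not the SPSS procedure used in the study: the article does not say which ICC form was used, so a two-way random-effects, absolute-agreement, single-rater ICC is assumed; the function names are hypothetical; and values falling between two bands (for example, 0.205) are assigned to the lower band.

```python
from typing import Sequence


def icc_two_way_random(ratings: Sequence[Sequence[float]]) -> float:
    """ICC for absolute agreement, single rater (assumed form).

    `ratings[i][j]` is the score given to patient i by rater j.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 patients and >= 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: patients, raters, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_rows - ms_error) / denom


def classify_reliability(icc: float) -> str:
    """Map an ICC onto the bands quoted in the text (upper bounds inclusive)."""
    if icc < 0 or icc > 1:
        return "not classified"  # the quoted scale covers 0.0 to 1.00 only
    if icc <= 0.20:
        return "slight"
    if icc <= 0.40:
        return "fair"
    if icc <= 0.60:
        return "moderate"
    if icc <= 0.80:
        return "substantial"
    return "almost perfect"


# Made-up scores for 3 patients and 2 raters; the second rater always
# scores 1 point higher, which absolute agreement penalizes.
toy_scores = [[1, 2], [2, 3], [3, 4]]
print(round(icc_two_way_random(toy_scores), 3))
print(classify_reliability(0.24), "/", classify_reliability(0.91))
```

The toy table shows why the choice of ICC form matters: the two raters rank patients identically, yet a constant offset between them still lowers an absolute-agreement ICC.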
We compared ICCs between each pair of raters before and after the intervention.13 We explored differences in ICC from before to after the intervention by using the Bonferroni correction (for 10 comparisons, our critical P value was .005). All tests were 2 sided. We calculated 95% confidence intervals where appropriate.
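The critical P value quoted above follows from dividing the familywise significance level by the number of comparisons. The short sketch below reproduces that arithmetic; the familywise level of .05 is an assumption inferred from the stated threshold of .005 for 10 comparisons, and the function names are hypothetical.

```python
def bonferroni_threshold(familywise_alpha: float, n_comparisons: int) -> float:
    """Per-comparison critical P value under the Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return familywise_alpha / n_comparisons


def is_significant(p_value: float, familywise_alpha: float = 0.05,
                   n_comparisons: int = 10) -> bool:
    """True when a comparison's P value falls below the corrected threshold."""
    return p_value < bonferroni_threshold(familywise_alpha, n_comparisons)


# Assumed familywise alpha of .05 spread over the 10 comparisons in the text.
print(bonferroni_threshold(0.05, 10))
print(is_significant(0.25))   # P value reported for clerk vs coordinator
```

Under this threshold, a P value of .25 is not significant, whereas any P value below .001 is.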
We calculated descriptive statistics and used t tests and Wilcoxon rank-sum tests to compare continuous data and χ² tests to compare proportions. We used SPSS (version 14, SPSS Inc, Chicago, Illinois) for all analyses. The data analyst had no knowledge of the source of the scores until analyses were complete.
| Results |
|---|
The remaining major subscales of the APACHE II score (age, APS, and GCS) had no significant changes in reliability between the database clerk and the research coordinator from before to after the intervention (Table 3). Reliability of age scores was almost perfect, and although the APS and GCS reliability scores were somewhat lower after the intervention, this difference was not significant. Compared with before the intervention, the eye subcomponent of the GCS score had significantly worse reliability after the intervention (ICC, 0.51; 95% confidence interval, 0.20–0.73); however, the reliability was still moderate. Following data collection, we examined the distribution of the CHI and each of the GCS components across patients and found little variability in scores, a situation that might decrease the ability to detect change over time.
| Discussion |
|---|
Numerous strategies to change behavior have been suggested to improve the quality of health care.10 We selected interventions that were most likely to address the problems we observed, building on previous work on behavior change in our ICU as well as the published literature on practice improvement, as summarized in several systematic reviews, and adapting them to our limited budget and setting. Our quality improvement intervention focused primarily on bedside nurses, because they directly influence patient care and are heavily involved in documenting patients' illness. Key components of successful quality improvement projects are leaders and champions.14 In our study, leadership was provided by a nurse informatician, nurse manager, and nurse educator; champions were charge nurses who encouraged and modeled accurate and timely documentation of the CHI component of the APACHE II score, which required input from bedside nurses. Our multifaceted approach included educational meetings and materials, point-of-care electronic reminders, local opinion leaders, prompts, auditing, and feedback. The goals of the project were encoded in a formal unit policy that was posted and endorsed by the multidisciplinary CareVue quality team and ICU working group.
Our study has limitations. In any multifaceted intervention of this type, it is difficult to determine which component was responsible for the greatest change in behavior. We hypothesize that the reconfiguration of CareVue had the greatest impact, and of the other components, we think that the reminders, prompts from peer leaders, auditing, and feedback had the most important role in increasing the completion of the CHI questions. Certainly, changing the CareVue system calculations through automation was important. In this study we did not use a clinical decision support system, a powerful method of changing behavior,15 because we were not using an information system to support clinical decision making for patient care. Neither the CHI nor the GCS-V subcomponents of the APACHE II score were designed to discriminate among patients; thus, documenting significant improvements in the reliability of these subcomponents may not be possible because of the minimal variation across patients. Because we used a computerized clinical information system, our results are not applicable to paper-based bedside records of measures of illness severity, and the reliability of APACHE II scores would most likely be lower among newly hired personnel. Finally, although our results are generalizable to similar medical-surgical ICUs with a wide variety of admission diagnoses, they may not necessarily be generalizable to exclusively neurosurgery, cardiac surgery, or trauma ICUs.
Strengths of our project include the consistent team that participated in all 3 phases of the research program. We involved professionals from many disciplines, including staff from nursing informatics, management, physicians, and research personnel, thereby ensuring that we incorporated diverse suggestions from a broad range of perspectives in designing our intervention. We minimized selection bias by enrolling consecutive patients who met entry criteria before and after the intervention. We conducted this study prospectively, thus avoiding errors and incomplete records associated with retrospective chart review. No data were missing. We used blinded data abstraction, entry, and analysis. The implementation strategies we used ranged from simple to complex and were readily available, well accepted, and easily applied in the usual practice setting, thereby enhancing the feasibility of these interventions elsewhere for similar initiatives related to quality of care.
Today, many members of the health care team depend on computerized devices and systems that collect, transform, display, and analyze data for multiple purposes. As computerized charting systems are now integral to health care institutions, the functions that they perform must be reliable. We found that the clinical information system initially generated APACHE II scores that were insufficiently reliable. After a multifaceted intervention designed to promote accurate and complete charting by health care personnel on a reconfigured CareVue system, APACHE II scores generated by the clinical information system became sufficiently reliable for clinical, research, and administrative purposes, compared with values obtained by health care personnel. Thus, time that data clerks and research personnel would usually spend calculating APACHE II scores could be freed for other activities.
| Conclusion |
|---|
| ACKNOWLEDGMENTS |
|---|
FINANCIAL DISCLOSURES
This study was supported by the Father Sean O'Sullivan Research Center. L. Donahoe was funded by a McMaster University Bachelor of Health Sciences Research Scholarship. M. Kho is funded by a Canadian Institutes of Health Research Fellowship Award through the Clinical Research Initiative. D. Cook is a research chair of the Canadian Institutes of Health Research.
Now that you've read the article, create or contribute to an online discussion on this topic. Visit www.ajcconline.org and click "Respond to This Article" in either the full-text or PDF view of the article.
| REFERENCES |
|---|