Experimental Design: One Observation Out-of-Specification Limits system versus SPC methods for Patient Vital Sign Management
James Fackler, MD, Assistant Professor Anesthesiology, Johns Hopkins University, School of Medicine
Christine Tsien, Medical Student, Harvard Medical School
Warren Beatty, Ph.D., Associate Professor Management, University of South Alabama
Steven M. Zimmerman, Ph.D., Professor of Quality and System Management, University of South Alabama
ABSTRACT
A traditional procedure for process control in manufacturing engineering, and for patient management in clinical practice, is to act on a single observation beyond the upper or lower specification limit, where the limits are established using engineering or clinical judgment. Industrial history has taught us that the one point out-of-specification limits inspection procedure provided poor quality output and poor process control. Current medical practice does not recognize statistical process control (SPC); our objective is to demonstrate the advantage of SPC in the clinical environment. We critically examine one experimental design used to demonstrate the advantages and disadvantages of SPC relative to the one-out-of-limits inspection (raw-data) system in the clinical environment.
INTRODUCTION
Our objective is to demonstrate to caregivers the advantage of statistical process control (SPC) methods over the one-out-of-limits inspection system currently used for vital sign monitoring. Our problems include:
lack of recording time, statistical training, and understanding of variation and natural patient limits
resistance to change
split decision making and responsibility between physician, nurses, and other caregivers
standard bedside monitors provide only raw data sampling systems
there is not a one-to-one relationship between the current raw data system and SPC
Even with the best current clinical documentation worksheets, it is impossible to replicate historical patient reaction and caregiver behavior. During periods of peak demand, data recording has the lowest priority. Standard bedside monitors are raw data measuring (inspection) devices that sound an alarm when an observation falls outside the specification limits. Clinical specification limits are typically set by the bedside monitor but may be overridden by the bedside caregiver based on value judgment and past research. The relationship between the current system and SPC depends on the value of the upper and lower specification limits relative to the natural patient limits. Current clinical practice is the legally required method. This paper reviews an experimental design aimed at comparing the current inspection (raw data) system with SPC by: 1. collecting vital sign data using computer communication and recording; 2. manually recording the type of each alarm as true positive-clinically relevant, true positive-clinically irrelevant, or false positive; 3. running computerized control charts after the fact with a given set of default operating rules; and 4. comparing alarms with control chart behavior.
CURRENT PRACTICE
The critical care environment presents testing problems not typically encountered in other domains. The human body cannot be shut down and restarted for experimental and data collection purposes. The life-death nature of the clinical environment means we must work within the current system while testing the use of SPC. Theoretically, current clinical practice is to have:
1) the alarm sounds when one vital sign observation is beyond specification limits
2) caregivers examine the patient after every alarm
3) caregivers take action as required
4) caregivers record the event in the patient record
The reality of the situation is: 1) the alarm sounds when one vital sign observation is beyond the specification limits; 2) the caregiver is rarely provided an indication of the alarm’s significance; 3) the caregiver takes action as indicated; 4) the caregiver may or may not record results. Caregivers selectively react to alarms as a function of patient condition and work load. If a patient is in a constant alarm condition, there may be nothing that can be done, no matter how often an alarm sounds. That is, the caregiver knows the patient is in an "alarm" state. There are no clinical actions necessary and/or available, except to let time pass and hope the patient will become more physiologically stable. In this type of situation, the caregiver does not react to alarms. An example of such a situation would be alarms sounding during administration of cardiopulmonary resuscitation: the monitor correctly alarms for low heart rate and respiration, but caregivers know the problem and are already continuously responding.
When the alarm sounds for a patient who is not in a constant alarm condition, the caregiver reacts to the alarm by looking at the patient. Causal research for any clinical problem may be as simple as reconnecting the monitoring device, or as complex as a response requiring many months or even years of research. The alarm and results are recorded if the caregiver feels it is important and/or has the time to do so.
Clinical specifications are usually not related to the natural patient limits. If the specifications are inside the natural patient limits, over-adjusting (tampering) occurs. If the specifications are outside the natural patient limits, statistical methods may provide an early warning of patient changes.
OXYGEN SATURATION
A widely used vital sign measurement for patients with respiratory problems is blood oxygen saturation. Monitors are available that generate oxygen saturation estimates using non-invasive (finger) probes. Standard monitoring practice is to provide whole number estimates of oxygen saturation from 0 to 100 percent. As patients move towards homeostasis (health), their oxygen saturation measurements become more and more auto-correlated and the saturation approaches 100 percent. Traditional statistical process control methods do not work well when data are auto-correlated and rounded to whole numbers. Our SPC system includes adjustment for auto-correlation and poor data precision.
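The degree of auto-correlation in a stream of readings can be checked before a charting method is chosen. The sketch below is an illustration only and is not the adjustment built into the SPC software described here: it estimates the lag-1 auto-correlation of whole-number saturation values. The function name and the sample readings are hypothetical.

```python
from statistics import fmean


def lag1_autocorrelation(values):
    """Return the lag-1 sample auto-correlation of a sequence of readings."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two readings")
    mean = fmean(values)
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        # Constant series, e.g. a long run of identical rounded readings.
        return 0.0
    numer = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    return numer / denom


# Hypothetical whole-number SaO2 readings drifting slowly upward.
readings = [94, 94, 95, 95, 95, 96, 96, 97, 97, 97, 98, 98]
r1 = lag1_autocorrelation(readings)
```

A value of `r1` near zero suggests that successive readings behave independently; a value near one indicates the strong serial dependence that makes traditional control limits too narrow.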
CONTROL LIMITS
In SPC methods, control limits are numbers defined by the behavior of the process (manufacturing or patient behavior) during a base period. For research purposes, an automatic system for creating and operating control charts, run with a selected set of configuration options, was chosen for comparison with the current method. Figure 1 illustrates a patient's process control chart pair, produced by our selected SPC software system, for central tendency (averages) and for variation (sigma, the standard deviation, SD). The top control chart is for averages while the bottom is for the standard deviation. Built into the software is a procedure to change the background screen color:
1) Blue when observations are beyond the graphing limits
2) White when there are no identifiable changes, a stable system (clear background in figure 1)
3) Red when values are single digits, usually meaning bad communication
4) Yellow when the system is resetting (light shaded background in figure 1)
5) White when the system is changing just prior to resetting* (clear background in figure 1)
There is no way to identify in real time that changes are occurring until the system is reset, so the screen background will be white up until the time when control limits are recalculated. An outlier is a point outside the control limits. The software counts the number of outliers to determine when the control chart resets. During the reset cycle the overall average, the variation average, and the control limits are calculated. The software is programmed to identify a sustained change. Just before the system displays yellow there is a period of instability (the color background is white). When this change has sustained itself for a period of time specified by the caregiver, the reset calculations begin.
Figure 1 Average and standard deviation process control chart
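For readers unfamiliar with the calculation, the following sketch shows one conventional way to compute limits for an average and standard deviation chart pair from base-period subgroups, and to count outliers against those limits. It is a textbook calculation under stated assumptions, not the algorithm of the software used in this study; the function names, the subgroup size, and the sample values are hypothetical.

```python
import math
from statistics import fmean, stdev


def c4(n):
    """Bias-correction constant for the sample standard deviation of n values."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2.0) / math.gamma((n - 1) / 2.0)


def xbar_s_limits(subgroups, sigma_level=3.0):
    """Return (lower, center, upper) limits for the average and SD charts."""
    n = len(subgroups[0])
    if n < 2 or any(len(g) != n for g in subgroups):
        raise ValueError("subgroups must share one size of at least 2")
    grand_mean = fmean(fmean(g) for g in subgroups)
    s_bar = fmean(stdev(g) for g in subgroups)
    c = c4(n)
    # Half-width of the averages chart: sigma_level * (s_bar / c4) / sqrt(n).
    half_width = sigma_level * s_bar / (c * math.sqrt(n))
    # Spread of the SD chart; the lower limit is floored at zero.
    s_spread = sigma_level * s_bar * math.sqrt(1.0 - c * c) / c
    return {
        "xbar": (grand_mean - half_width, grand_mean, grand_mean + half_width),
        "s": (max(0.0, s_bar - s_spread), s_bar, s_bar + s_spread),
    }


def count_outliers(points, lower, upper):
    """Count plotted points that fall outside the control limits."""
    return sum(1 for p in points if p < lower or p > upper)


# Hypothetical base period: three subgroups of five SaO2 readings each.
base = [[96, 97, 96, 98, 97], [97, 97, 98, 96, 97], [96, 98, 97, 97, 96]]
limits = xbar_s_limits(base)
```

The `sigma_level` argument stands in for the choice among two, three, and four sigma limits; the number of outliers that triggers a reset is left to the caller because the source does not state it.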
GLOBAL SPECIFICATION LIMITS
Control limits are patient specific while clinical out-of-specification limits are global limits for all patients. Figure 2 illustrates a Pareto analysis of the frequency of alarms for 12 patients with 336 annotated events (alarms). The annotator documented alarm events and classified the events into three categories: true positive-clinically relevant, true positive-clinically irrelevant, and false positive.
Based on the weak assumption that we could identify alarms as they occurred, the Pareto analysis indicated that 88.7 percent of the alarms were false positives, 7.2 percent were true positive-clinically irrelevant, and 4.1 percent were true positive-clinically relevant. Analysis of the annotation demonstrated that there were recording problems specifically associated with the identification of the reasons for an alarm. We concluded that the task of using such a classification was interfering with the data collection. The event true positive-clinically relevant (4.1 percent occurrence) was of interest to the caregiver. A detailed data analysis of all of these occurrences indicated that in half the cases there were no data behavior changes in the neighborhood of the alarm. After review, we concluded that the events were false recordings. In essence, we were not able to identify the reasons for the alarms.
Figure 2 Pareto Analysis of Alarms
CONTROL LIMITS
Because the clinical decision making environment varies from data rich (an observation per heart beat) to data poor (one or two observations per year), the software has many options, including the capability to change among two, three, and four sigma limit levels. Most options may be controlled with a series of configuration files. The settings for our study were:
The results of our control chart analysis for the patient set are shown in Figure 3, ordered from high to low.
Figure 3 Pareto Analysis of Control Charts
Experimental Design
The objective of our experimental design is to compare the use of control charts to current practice. For this preliminary work, we limited ourselves to running our control charts after the fact. The steps in our testing procedure were:
Terms and Alarms Identification
The control charts tell us that a number of alarms could not have happened: the data in the neighborhood of the alarm were stable and there was no reason for the monitors to indicate a problem with the patient. In addition, many alarms were missed: data were beyond the specification limits and the medical monitor failed to annotate the event. A review of these results indicated that the problems could be due to:
Retrospective Control Charts and Communication
Figure 4 illustrates after-the-fact control chart pairs for oxygen saturation and heart rate with annotation indicating the timing of alarms. The line (|) is graphed near the occurrence of the alarm and the number next to the line is the identification number of the alarm. The screen was selected because there is only a single alarm, number 728. Alarm 728 was recorded as a heart rate alarm on 07/20/95 at 11:54:00. The cause was identified as a probe problem. There was a jump in the heart rate in the neighborhood of the alarm. The control chart also tells us that an SaO2 (oxygen saturation) alarm should have sounded at about the same time. The blue area on the top chart indicates that the value of SaO2 was out of the graphing range.
Figure 4 Control Charts with annotation
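The comparison of annotated alarms with control chart behavior can be sketched as a time-window match. The code below is a minimal illustration, assuming that "in the neighborhood of the alarm" means within a fixed window; the one-minute window, the function name, and the chart signal times are hypothetical. Only the time of alarm 728 comes from the annotation described above.

```python
from datetime import datetime, timedelta


def match_alarms(alarm_times, signal_times, window=timedelta(minutes=1)):
    """Pair recorded alarms with control chart signals that occur nearby.

    Returns three lists: alarms with a nearby chart signal, alarms with
    none, and chart signals that no recorded alarm accounts for.
    """
    matched, unmatched = [], []
    used = set()
    for alarm in alarm_times:
        near = [i for i, s in enumerate(signal_times) if abs(s - alarm) <= window]
        if near:
            matched.append(alarm)
            used.update(near)
        else:
            unmatched.append(alarm)
    signals_without_alarm = [s for i, s in enumerate(signal_times) if i not in used]
    return matched, unmatched, signals_without_alarm


# Alarm 728 as recorded; the two chart signal times are made up for the example.
alarm_728 = datetime(1995, 7, 20, 11, 54, 0)
signals = [datetime(1995, 7, 20, 11, 53, 40), datetime(1995, 7, 20, 12, 30, 0)]
matched, unmatched, signals_without_alarm = match_alarms([alarm_728], signals)
```

Alarms left in `unmatched` correspond to alarms recorded while the data were stable, and `signals_without_alarm` corresponds to chart changes for which no alarm was annotated.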
For our study, a Nellcor monitor was used to collect oxygen saturation. The Nellcor data were transmitted to a SpaceLabs monitor that collected heart rate and then transmitted both oxygen saturation and heart rate to the computer for electronic recording. The double transmission of oxygen saturation was one of the weaknesses of our study. Figure 5 illustrates a control chart screen for patient number 7, file xx070717.bqc. A red background dominates the oxygen saturation control chart, meaning that most values received were single digits, that is, values less than 10. Since the heart rate chart is stable and individual probes were being used for oxygen saturation and heart rate, it is likely that the problem was an oxygen saturation communication problem. The annotations for alarms 490 through 503 indicate either a probe problem or bad data format/bad connection.
Figure 5 Communication Problem
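The single-digit rule behind the red background can be expressed as a simple screen of the received values. This is a sketch under the assumption that any reading from 0 through 9 is treated as a suspected transmission fault; the function name and the sample readings are hypothetical and are not taken from the patient file.

```python
def single_digit_fraction(readings):
    """Return the fraction of readings that are single digits (0 through 9)."""
    if not readings:
        raise ValueError("no readings supplied")
    flagged = sum(1 for r in readings if 0 <= r < 10)
    return flagged / len(readings)


# Hypothetical SaO2 stream in which most values arrive as single digits.
sao2 = [3, 0, 97, 5]
fraction = single_digit_fraction(sao2)
```

A high fraction on one channel, alongside a stable chart on a channel fed by a separate probe, points to communication rather than to the patient.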
CONTROL LIMITS, NATURAL PATIENT LIMITS, AND SPECIFICATION LIMITS
Control limits refer to the average or standard deviation of a subgroup. Natural patient limits (NPL) refer to individual observations and thus are related to the clinical specification limits. The relationship between one point out-of-specification limits and control limits is a function of the NPL. NPL define the range of output from a process, assuming a stable system of chance causes. A stable system of chance causes assumes that no outside pressures, such as changes in agents and treatments, are acting on the patient (the process). The upper natural patient limit (UNPL) is the maximum value of an individual observation while the lower natural patient limit (LNPL) is the minimum value of an individual observation. The three possible relationships between the specification limits and the NPL are: 1. Tampering; 2. Equal (which seldom occurs); and 3. Early warning.
Tampering means the process is being over-adjusted. There are three ways tampering can occur: 1. the patient is unstable, and the standard deviation of the process is so wide that there is a probability that observations will occur outside both the lower and upper specification limits; 2. the patient is stable, but the average is near the lower specification limit; and 3. the patient is stable, but the average is near the upper specification limit. We found a number of patients in all three tampering classifications. It was not unusual for a patient to have a small standard deviation for a period of time and then for the standard deviation to explode such that a probability of being both high and low existed at the same time. Following a one point out-of-specification decision making rule, the patient would be constantly adjusted with very little result.
Early warning is when the specification limits are wide relative to the natural patient limits. Vital signs that fit into this classification included skin temperature, oxygen saturation, and heart rate. Under early warning, the patient vital signs can wander through a wide range before the one point out-of-specification system will indicate a change. Control charts pick up these changes as early warnings of change. An early warning gives the physician time to react before the patient becomes critical.
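The three relationships between specification limits and natural patient limits can be made concrete with a short classification sketch. It assumes, as an illustration only, that the natural patient limits are estimated as the mean plus and minus three standard deviations of individual observations from a stable period; the function names, the heart rate readings, and the specification limits are hypothetical.

```python
from statistics import fmean, stdev


def natural_patient_limits(observations, sigma_level=3.0):
    """Estimate (LNPL, UNPL) as mean -/+ sigma_level standard deviations."""
    mean = fmean(observations)
    spread = sigma_level * stdev(observations)
    return mean - spread, mean + spread


def classify_limits(lnpl, unpl, lsl, usl):
    """Relate natural patient limits to lower/upper specification limits."""
    low_out = lnpl < lsl    # patient can naturally fall below the low spec
    high_out = unpl > usl   # patient can naturally rise above the high spec
    if low_out and high_out:
        return "tampering: both limits"
    if low_out:
        return "tampering: lower limit"
    if high_out:
        return "tampering: upper limit"
    if lnpl == lsl and unpl == usl:
        return "equal"
    return "early warning"


# Hypothetical stable heart rate readings and global specification limits.
heart_rate = [118, 122, 120, 119, 121, 123, 117, 120]
lnpl, unpl = natural_patient_limits(heart_rate)
relationship = classify_limits(lnpl, unpl, lsl=100, usl=160)
```

In this example the specification limits are wide relative to the natural patient limits, so the one point out-of-specification rule would stay silent while the patient drifted, which is the early warning case.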
RESULTS-NEW FOCUS
The failure of our original research design (focused on the inspection process) tells us we must redirect the study towards control and away from inspection. Annotation must be simple, in real time, and directed towards caregiver actions such as: taking blood; checking temperature; performing a physical exam; adjusting the FiO2; patient movement; alarms high and low; and touching. Each ICU, ER, and OR requires a different selection of caregiver actions.
The objective, to "evaluate the effectiveness of SPC in the area of patient alarm generation," was not accomplished. The experimental design assumed that there was a one-to-one relationship between alarms and control charts. The objective of the current raw data system is to alarm when observations are beyond physician-set limits.
The objective of process control is to control the process, that is, real-time patient monitoring. SPC goes beyond a simple alarm system, and the comparison is not a simple analysis of alarms. In addition, the identification of the cause of alarms on the fly proved to be imprecise. As a result, alarm identification and annotation provided only limited insight into what was actually happening in the clinical environment, and future studies will depend on annotating the actions of caregivers as well as evaluating alarms.