MATTHEW D. KELLER & BRANDON HARRISON-SMITH: Pulse-oximetry errors affect patient outcomes
The pulse oximeter is a device that estimates a person’s oxygen saturation level, a measure of the oxygen concentration in their blood, by shining light through their tissue, typically a fingertip or an earlobe (Fig. 1). As highlighted by the COVID-19 pandemic, accurate pulse-oximeter readings can be crucial for clinical decisions, especially when arterial blood-gas tests — the gold standard for determining oxygen saturation levels — are not available. But these devices give readings that are often less accurate for people who have dark skin, and this shortcoming has led to medical practices that only exacerbate the problem, making pulse oximetry emblematic of the broader issue of racial bias in medicine. The first step towards a solution must involve an orchestrated effort from those who design, use and regulate these devices.
Driven by clinical experiences early in the pandemic, Sjoding et al.1 published a retrospective report showing that pulse oximeters overestimate the true oxygen saturation of Black people. This inaccuracy means that diagnoses of hypoxaemia, the condition of having low levels of oxygen in one’s blood, are approximately three times more likely to be missed in Black patients than in white patients. Misdiagnosed patients are said to have occult hypoxaemia when arterial blood-gas tests indicate oxygen saturation levels of less than 88% (signalling hypoxaemia), despite pulse oximeters measuring a healthy oxygenation of more than 92%.
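The definition of occult hypoxaemia given above amounts to a simple check on a paired measurement. The following Python sketch is illustrative only: the function name and the example readings are hypothetical, and the two thresholds (arterial saturation below 88%, pulse-oximeter reading above 92%) are taken directly from the definition in the text.

```python
def is_occult_hypoxaemia(sao2_arterial: float, spo2_oximeter: float) -> bool:
    """Return True when a paired reading meets the definition used in the text:
    arterial blood-gas saturation below 88% while the pulse oximeter reads above 92%."""
    return sao2_arterial < 88.0 and spo2_oximeter > 92.0


# Hypothetical paired readings: (arterial blood gas %, pulse oximeter %).
pairs = [(86.0, 94.0), (90.0, 95.0), (85.0, 89.0)]
flags = [is_occult_hypoxaemia(arterial, oximeter) for arterial, oximeter in pairs]
print(flags)  # [True, False, False]
```

Only the first pair counts as occult hypoxaemia: in the second the arterial value is not low, and in the third the oximeter itself already signals a low reading.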
Since Sjoding and colleagues’ report, several large retrospective studies have confirmed that darker-skinned people (those self-identifying as Black, Asian, Hispanic or a combination of these) are more likely than white people to experience occult hypoxaemia2–5. In one study of people with COVID-19, 35% of those self-identifying as Black had their eligibility for oxygen treatment delayed, or even missed altogether, compared with just 20% of the white people documented2. In another study, Black people received less therapeutic oxygen than did white people who had equivalent arterial blood-gas values3. A more comprehensive analysis showed that, even when baseline health conditions are taken into account, people with occult hypoxaemia are prone to organ dysfunction and in-hospital mortality, and that Black people in this group have the worst organ dysfunction5.
Although clinical reports of skin-colour bias in pulse oximetry were not widespread until the COVID-19 pandemic, evidence for this issue has been accumulating for decades6,7. A comparison reported in February found that pulse-oximeter readings from nine devices were consistently less accurate for darker-skinned people than for lighter-skinned people8. But the study also found that testing healthy individuals under carefully controlled laboratory conditions resulted in fewer cases of occult hypoxaemia than are measured in hospitals. In fact, none of the 491 people who were tested by the authors had readings consistent with occult hypoxaemia, whereas Sjoding and colleagues tallied 187 cases out of 3,527 measurements from a cohort of 1,609 hospitalized individuals. This discrepancy highlights the need to understand how pulse-oximetry errors are exacerbated in real-world use.
All of these findings echo a long history of the health-care system using fixed racial offsets for certain instruments and risk formulas that are now recognized as potentially contributing to health inequities, rather than alleviating them9. For example, an algorithm that is commonly used to assess the risk of heart failure (see go.nature.com/3mw3zda) was originally designed to systematically increase the score (and thus the perceived risk) for people who are not Black. This offset came under scrutiny for raising the threshold for treating Black people, and is now an optional feature of the calculator.
In the case of pulse oximetry, the idea that race-based adjustments (rather than effective device design and calibration) could rectify the overestimation error also seems inappropriate. And although this overestimation is not solely responsible for patient-outcome disparities, such as those experienced during the COVID-19 pandemic, efforts to correct it are crucial. That’s because it is increasingly clear that reports of bias in medical devices could aggravate the already-complex historical relationship between the Black community and medicine.
Sjoding and co-workers’ findings prompted the US Food and Drug Administration (FDA) to issue a safety communication in February 2021 highlighting the limitations of pulse oximeters (see go.nature.com/3wkgket); it is likely that few health-care workers, and even fewer patients, had appreciated these drawbacks. And last month, the FDA announced that the Medical Devices Advisory Committee would convene in November to gather all available evidence on the issue and to determine ways of improving the accuracy of pulse oximeters. Addressing the problem appropriately will require a coordinated effort from researchers, health-care workers, device manufacturers and the FDA.
Once the mechanism of the oximetry overestimation is clearly understood, it should be possible to make this crucial piece of health-care equipment work equitably for all. This might involve altering pulse-oximeter calibration and clinical-study procedures by adopting objective metrics for skin tone, instead of using self-identified ethnicities or subjective assessments of pigmentation. An ideal solution might involve a new generation of devices that can objectively sense and account for a patient’s skin tone — as well as any other factors that could affect pulse-oximetry measurements.
CHETAN PATIL & MOHAMMED SHAHRIAR AREFIN: The basis of bias in pulse oximetry
The modern finger-clip pulse oximeter was developed in the 1970s and, over the past 50 years, has revolutionized patient monitoring by enabling rapid identification of acute respiratory distress. However, both the device itself and the way it is calibrated are subject to biases that are linked to a person’s skin pigmentation. The combined consequence of these factors is an apparent racial bias in oximetry measurements that was no doubt unintended by its inventors. Overcoming these technical problems is a multifaceted challenge that requires careful analysis and rigorous scrutiny of the way in which clinical trials are designed.
The device works by measuring the time-varying optical signal that is produced by the interaction of red and infrared light with tissue perfused with blood10 (Fig. 1a). As light passes through the tissue, photons are either absorbed or scattered by molecules such as haemoglobin, melanin, lipids and water11.
Pulse oximetry is possible because oxygenated haemoglobin absorbs infrared light more efficiently than it does red light, whereas the opposite is true for deoxygenated haemoglobin. The device shines red and infrared light through a person’s skin and the detected light produces an oscillating signal, because the amount of blood in the tissue fluctuates with each heartbeat. The average value of this oscillating signal is conventionally used to indicate the total absorbance from all the biomolecules in the tissue, whereas its amplitude quantifies fluctuations in the concentration of oxygenated haemoglobin throughout the cardiac cycle.
By calculating the ratio of this amplitude to the average for red light, and normalizing it by the same ratio for infrared light, one arrives at an oximetry parameter that is linearly related to measurements of arterial blood oxygen saturation. Precise determination of this relationship for specific devices is performed through calibration studies that compare oximetry parameter values with oxygen levels in blood samples that are measured with a gas analyser.
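The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular device: the function names are hypothetical, the signals are synthetic, taking the amplitude as the peak-to-peak value is one convention among several, and the calibration slope and intercept are left as inputs because they come from device-specific calibration studies.

```python
import math
from statistics import mean


def pulsatile_ratio(signal):
    """Amplitude of the oscillating signal divided by its average value.
    The amplitude is taken here as peak-to-peak (an assumed convention)."""
    amplitude = max(signal) - min(signal)
    return amplitude / mean(signal)


def oximetry_parameter(red, infrared):
    """Red-light ratio normalized by the same ratio for infrared light."""
    return pulsatile_ratio(red) / pulsatile_ratio(infrared)


def estimate_saturation(parameter, slope, intercept):
    """Linear calibration; slope and intercept must come from a calibration study."""
    return intercept + slope * parameter


# Synthetic signals covering two full cardiac cycles (hypothetical values).
samples = range(100)
red = [1.0 + 0.01 * math.sin(2 * math.pi * t / 50) for t in samples]
infrared = [2.0 + 0.04 * math.sin(2 * math.pi * t / 50) for t in samples]

r = oximetry_parameter(red, infrared)
print(round(r, 3))  # about 0.5 for these synthetic signals
```

Because each wavelength's amplitude is divided by its own average, an overall change in detected intensity cancels out of each ratio in this simplified picture.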
A long-standing misconception in oximetry is that variation in the biomolecular composition of an individual (including, for example, their melanin levels) is accounted for, because the oximetry parameter is normalized by the average values of the detected red and infrared signals. This idea was supported by the results of limited theoretical analysis12, which considered the finger to be a homogeneous absorbing material, and did not account for the fact that light scatters differently depending on its wavelength. Such scattering effects are substantial in tissues with a multilayered anatomical structure, such as those in the finger (Fig. 1b).
Computational modelling of how light interacts with tissue offers a robust theoretical framework with which to revisit the assumptions associated with the simplified conceptual frameworks used in oximetry13. Such studies incorporate scattering, as well as geometric factors related to tissue anatomy and sensor configuration. Simulations of pulse oximetry have shown that increased pigmentation reduces the overall intensity of optical signals, which can result in a degraded signal-to-noise ratio, and thus explain observations of increased measurement variability in dark-skinned people14. Other simulations have contradicted the conventional belief that the widely used calibration parameter is not affected by pigmentation, supporting empirical findings indicating that increased pigmentation decreases the normalized ratio15.
Currently, enrolment guidance from the FDA for testing oximeter accuracy suggests that studies should involve a minimum of ten people, at least two of whom should be “darkly pigmented” (see go.nature.com/3rc1whx). However, the shade of a person’s pigmentation is an inherently subjective criterion and can contribute to inconsistency in study design. Given the evidence, both from measurements and from simulations, that the parameter used for conventional oximeters is pigmentation dependent, there is reason to question the FDA guidance that only 20% of people tested in these studies must have dark skin to achieve equitable calibration. Computational studies simulating the expected outcome of calibration studies in which 20% of people are ‘darkly pigmented’ support the findings of retrospective clinical studies that reveal an overestimation bias in oxygen saturation measurements in Black Americans16.
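A toy calculation can illustrate why the make-up of a calibration cohort matters if the oximetry parameter depends on pigmentation. The Python sketch below is not a reproduction of the computational studies cited above. Every number in it is a hypothetical assumption chosen for illustration: the linear relation for lighter-skinned participants, the size of the pigmentation-related reduction in the parameter, and the saturation levels sampled. Only the cohort split (ten participants, two of them darkly pigmented) and the direction of the shift (increased pigmentation decreases the normalized ratio) follow the text.

```python
from statistics import mean

# Hypothetical values, for illustration only.
SLOPE, INTERCEPT = -25.0, 110.0          # assumed relation for lighter-skinned participants
SHIFT = 0.05                             # assumed reduction of the parameter with dark pigmentation
LEVELS = [70, 75, 80, 85, 90, 95, 100]   # arterial saturation levels sampled (%)


def parameter(sao2, dark):
    """Oximetry parameter that a participant would produce at a given true saturation."""
    r = (sao2 - INTERCEPT) / SLOPE
    return r - SHIFT if dark else r


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


def mean_bias(slope, intercept, dark):
    """Average of (estimated - true) saturation over the sampled levels."""
    return mean(intercept + slope * parameter(s, dark) - s for s in LEVELS)


# Calibration cohort: ten participants, two of them darkly pigmented.
cohort = [False] * 8 + [True] * 2
xs = [parameter(s, dark) for dark in cohort for s in LEVELS]
ys = [float(s) for dark in cohort for s in LEVELS]
slope, intercept = fit_line(xs, ys)

bias_light = mean_bias(slope, intercept, dark=False)
bias_dark = mean_bias(slope, intercept, dark=True)
print(f"lighter skin: {bias_light:+.2f} points, darker skin: {bias_dark:+.2f} points")
```

Under these assumptions the pooled calibration curve sits closer to the majority group, so the estimated saturation for the darker-skinned group is biased upwards and by a larger margin than the small downward bias for the lighter-skinned group. The size of the effect depends entirely on the assumed shift.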
A combination of theoretical analyses and clinical findings will ultimately strengthen our understanding of challenging issues posed by pulse oximetry; these include the effects of pigmentation, as well as those of low perfusion of blood through a person’s tissue, carbon monoxide poisoning and anaemia17. The success of pulse oximetry as a real-time low-cost tool for monitoring a person’s cardiorespiratory status has led to its widespread use, and the technique’s prevalence has, in turn, highlighted situations in which inaccuracies occur. Clearly, clinical findings from the past few years provide an imperative for developing and validating oximeters without a fundamental dependence on pigmentation. These studies also highlight the importance of carefully reconsidering the enrolment criteria suggested for calibration studies, so that the skin pigmentation of test participants is evenly balanced, and determined using objective measures.