How IQVIA Addresses Biases in Healthcare AI
Alice Joules, MSc, Healthcare Data Science Manager, AI for Healthcare & MedTech, IQVIA Real World Solutions
Irene Brusini, Ph.D., Senior Data Scientist, AI for Healthcare & MedTech, IQVIA Real World Solutions
Nov 04, 2024

As artificial intelligence (AI) becomes a cornerstone in healthcare, particularly in the development of clinical decision support tools (CDSTs), there is a growing concern about the potential for AI algorithms to perpetuate biases, exacerbating existing health inequalities instead of mitigating them.

A 2019 study published in Science serves as a cautionary example of such bias. This research analyzed a commercial algorithm used in US hospitals to identify patients needing additional medical care [1]. The algorithm was found to exhibit significant bias against patients who self-identified as Black: for a given predicted risk level, Black patients were sicker, had more chronic conditions, and incurred higher costs for emergency care visits but lower costs for inpatient and outpatient specialist care than their White counterparts, who had better access to healthcare. This disparity resulted from the algorithm’s reliance on healthcare costs as a proxy for medical need, which inadvertently favored White patients due to their higher healthcare utilization. The case highlights how biased data can produce flawed AI insights that further entrench health inequalities.

To promote health equity, it is essential to identify and mitigate sources of bias in healthcare AI. By doing so, we can build fair algorithms that benefit all patients, regardless of their background or demographics, and support healthcare professionals in providing more equitable care.

Bias in disease detection algorithms

IQVIA develops CDSTs that use AI algorithms to help clinicians identify patients likely to be diagnosed with a disease in the future. This allows healthcare providers to review patient records and determine whether diagnostic assessments are warranted to aid earlier diagnosis.

An unfair or biased algorithm in this context would be one that, when presented with two otherwise identical medical records, determines a greater need for diagnostic testing for one patient based solely on non-medical demographic factors such as gender, ethnicity, socio-economic status, or disability. Bias can stem from data that misrepresents certain populations, such as underdiagnosed subgroups or those with limited access to healthcare. Additionally, minority populations may be underrepresented in the dataset, which makes it challenging to build a truly equitable AI model.

Disentangling the exact sources of data bias is not straightforward, as bias can stem from multiple factors. For example, a primary care dataset may be biased if certain subgroups of the population experience higher rates of underdiagnosis than others. This leads to “mislabelled” medical records during the AI algorithm’s training process, where individuals who should have been diagnosed are instead labelled as healthy. Additionally, some subgroups may access healthcare less frequently due to socioeconomic or systemic barriers. As a result, the dataset may not accurately reflect the full range of health issues present in these populations. Furthermore, minority groups may naturally constitute a smaller proportion of the overall population, meaning that fewer data points are available for these groups. This lack of sufficient data exacerbates the challenges of developing an equitable AI model as predictions for these subgroups become less reliable.

IQVIA’s strategy to address health inequalities

To deploy AI algorithms in the real world, IQVIA follows a research-led 3-step process to identify, report and mitigate bias.

  1. Identifying potential sources of bias in the data

    The first step is to ensure that the training data accurately represents the population the CDST will be used on. This involves identifying the protected characteristics (such as age, gender, and ethnicity) within the dataset and comparing their distributions, as well as disease prevalence and incidence, against existing literature and/or census records (a simple check of this kind is shown in the first sketch after this list). This analysis helps identify underrepresented or underdiagnosed subgroups that may be at risk of biased predictions.

  2. Identifying bias in the AI algorithm’s output

    IQVIA employs a combination of statistical tools and clinical expertise to identify potential bias. For instance, when a disease detection algorithm is applied to identify patients needing diagnostic assessment, we compare false positive rates across demographic subgroups (see the second sketch after this list). A higher proportion of false positives in a specific group, such as a certain ethnicity, may indicate bias in favor of that population and should prompt additional investigation.

    Moreover, AI algorithms use a set of predictors, extracted from the patients’ clinical history, to produce their outputs, and several methods exist to identify which predictors have the highest impact on those outputs. We evaluate whether the importance of these predictors varies across different subgroups to unveil any underlying data patterns that may contribute to unfair predictions.

  3. Mitigating algorithmic bias

    When bias is detected in disease detection algorithms, IQVIA investigates possible approaches to mitigate it, depending on the use case, the end user, and the specific nature of the bias. One possible approach involves assigning different weights to patients during the algorithm’s training process: patients from underrepresented or underdiagnosed groups may be given higher weights to encourage the model to make more accurate predictions for those populations (the third sketch after this list illustrates one such weighting scheme).

    IQVIA also promotes transparency by collaborating with CDST users, sharing insights into potential biases, and encouraging clinical interventions to address them. For example, clinicians may proactively review patients from groups the algorithm may be biased against in order to improve their care.
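
To make the three steps above more concrete, the sketches below illustrate in Python the kind of checks each step describes. They are minimal illustrations under assumed data layouts and column names, not IQVIA’s production code. The first sketch, for step 1, compares the subgroup make-up of a training dataset against external reference proportions (for example, from census records); the reference proportions and column names are hypothetical.

```python
# Minimal sketch of step 1: checking whether subgroup representation in the
# training data matches an external reference (e.g. census proportions).
# Column names and reference proportions below are hypothetical.
import pandas as pd
from scipy.stats import chisquare

def check_representation(df: pd.DataFrame, group_col: str,
                         reference_props: dict) -> pd.DataFrame:
    """Compare observed subgroup shares against shares from a reference source."""
    groups = list(reference_props)
    # Restrict to subgroups covered by the reference so observed and expected totals match
    in_scope = df[df[group_col].isin(groups)]
    observed = in_scope[group_col].value_counts().reindex(groups, fill_value=0)
    expected = [reference_props[g] * len(in_scope) for g in groups]
    stat, p_value = chisquare(f_obs=observed.to_numpy(), f_exp=expected)
    print(f"Chi-square goodness-of-fit: stat={stat:.1f}, p={p_value:.3g}")
    return pd.DataFrame({
        "observed_share": observed / len(in_scope),
        "reference_share": pd.Series(reference_props),
    })

# Example usage with hypothetical reference proportions (e.g. from census records):
# reference = {"group_a": 0.80, "group_b": 0.12, "group_c": 0.08}
# print(check_representation(training_data, "ethnicity", reference))
```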
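
The second sketch, for step 2, compares false positive rates across demographic subgroups and checks whether predictor importance differs between subgroups. The fitted model, feature list, and column names are assumptions for illustration.

```python
# Minimal sketch of step 2: subgroup-level false positive rates and
# subgroup-level predictor importance. Column names are hypothetical.
import pandas as pd
from sklearn.inspection import permutation_importance

def false_positive_rates(df: pd.DataFrame, group_col: str,
                         y_true: str, y_pred: str) -> pd.Series:
    """FPR per subgroup: share of truly negative patients flagged by the algorithm."""
    negatives = df[df[y_true] == 0]
    return negatives.groupby(group_col)[y_pred].mean().rename("false_positive_rate")

def subgroup_importances(model, df: pd.DataFrame, group_col: str,
                         feature_cols: list, y_true: str) -> pd.DataFrame:
    """Permutation importance of each predictor, computed separately within each subgroup."""
    importances = {}
    for group, sub in df.groupby(group_col):
        result = permutation_importance(model, sub[feature_cols], sub[y_true],
                                        n_repeats=10, random_state=0)
        importances[group] = result.importances_mean
    return pd.DataFrame(importances, index=feature_cols)

# Example usage (hypothetical columns: "ethnicity", "diagnosed", "flagged"):
# print(false_positive_rates(scored_patients, "ethnicity", "diagnosed", "flagged"))
# print(subgroup_importances(fitted_model, scored_patients, "ethnicity", predictors, "diagnosed"))
```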
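
The third sketch, for step 3, up-weights patients from underrepresented subgroups during training. The inverse-frequency weighting scheme and the logistic regression model are illustrative assumptions, not a description of IQVIA’s exact approach.

```python
# Minimal sketch of step 3: weighting patients by inverse subgroup frequency so
# smaller subgroups contribute more per patient during training. Illustrative only.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def inverse_frequency_weights(groups: pd.Series) -> np.ndarray:
    """Weight each patient inversely to the relative size of their subgroup."""
    freq = groups.value_counts(normalize=True)
    weights = groups.map(lambda g: 1.0 / freq[g])
    return (weights / weights.mean()).to_numpy()  # normalise so the average weight is 1

# X: predictor matrix, y: future-diagnosis labels, demographics: subgroup per patient
# weights = inverse_frequency_weights(demographics)
# model = LogisticRegression(max_iter=1000).fit(X, y, sample_weight=weights)
```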

Final thoughts

Effectively identifying and mitigating algorithmic bias, especially for minority groups with multiple overlapping protected attributes, is challenging. The scientific community continues to research better methods to understand and correct bias in AI. Beyond technical solutions, the ultimate goal is to foster healthcare equity and fairness.

Bias in healthcare data can stem from various sources, including sample selection issues, biological variability, and measurement errors during data collection. Like algorithmic bias, these issues must be addressed to ensure fair outcomes. IQVIA remains committed to transparency, thoroughly investigating whether its AI tools can be safely deployed without exacerbating inequalities [2], and applies mitigation strategies guided by best practices and standards such as the TRIPOD-AI and PROBAST-AI frameworks [3].

References

  1. Obermeyer, Z., et al., Dissecting racial bias in an algorithm used to manage the health of populations. Science, 2019. 366(6464): p. 447-453.
  2. Rigg, J., et al., Finding undiagnosed patients with hepatitis C virus: an application of machine learning to US ambulatory electronic medical records. BMJ Health Care Inform, 2023. 30(1).
  3. Collins, G.S., et al., Protocol for development of a reporting guideline (TRIPOD-AI) and risk of bias tool (PROBAST-AI) for diagnostic and prognostic prediction model studies based on artificial intelligence. BMJ Open, 2021. 11(7): p. e048008.
