Generative Models for Augmenting Limited Sample or Biased Healthcare Data

This article may be relevant to your organisation if you work in the area of rare diseases and sample augmentation strategies are relevant to that work.

There is great potential in using generative AI models to boost clinical datasets that are limited in sample size or that exhibit sampling bias. Recent advances in synthetic data generation methodologies have allowed for the generation of high-fidelity synthetic data that are both statistically and clinically indistinguishable from real patient data. Other experimental work has demonstrated that synthetic data generation methods can be used to selectively boost the samples of underrepresented groups. This project will assess how boosting methods can augment data from small trials by exploiting linked population datasets and/or clinical expertise. The project will provide a perspective on the opportunities, limitations and risks of boosting data when assessing AI models and decision-making.
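As a rough illustration of what selective boosting of an underrepresented group can look like, the sketch below fits a simple multivariate Gaussian to the minority group's records and samples synthetic rows from it. This is a minimal sketch under stated assumptions, not the project's method: the Gaussian stands in for a real generative model, and the function name `boost_group`, the group labels and the toy data are all hypothetical.

```python
import numpy as np

def boost_group(X, groups, target_group, n_new, seed=0):
    """Sample n_new synthetic rows for one group.

    Assumption: a multivariate Gaussian fitted to the group's rows is used
    as a stand-in for a real generative model.
    """
    rng = np.random.default_rng(seed)
    rows = X[groups == target_group]
    mean = rows.mean(axis=0)
    cov = np.cov(rows, rowvar=False)  # features are columns
    return rng.multivariate_normal(mean, cov, size=n_new)

# Toy data (hypothetical): 200 records in group "A", 12 in group "B".
rng = np.random.default_rng(42)
X_major = rng.normal(0.0, 1.0, size=(200, 3))
X_minor = rng.normal(1.5, 0.5, size=(12, 3))
X = np.vstack([X_major, X_minor])
groups = np.array(["A"] * 200 + ["B"] * 12)

# Boost group "B" until both groups have 200 records.
synthetic_B = boost_group(X, groups, "B", n_new=188)
X_boosted = np.vstack([X, synthetic_B])
groups_boosted = np.concatenate([groups, np.array(["B"] * 188)])
```

A model fitted to only 12 records can only reproduce what those records show, which is one reason the limitations and risks of boosting need assessing alongside the opportunities.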


Aim - to explore how data boosting approaches can be used to augment existing health datasets in different innovator settings


Objectives:

  • Clinical trial self-boosting versus data linkage / population boosting

  • Assessment of fairness and performance tradeoffs
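One simple way to look at a fairness and performance trade-off is to compare overall accuracy with the gap in accuracy between groups. The sketch below is illustrative only: the metric choice, the helper names `group_accuracy` and `accuracy_gap`, and the toy labels are assumptions, not measures specified by the project.

```python
import numpy as np

def group_accuracy(y_true, y_pred, groups):
    """Return accuracy per group as a dict {group: accuracy}."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    return {
        str(g): float((y_pred[groups == g] == y_true[groups == g]).mean())
        for g in np.unique(groups)
    }

def accuracy_gap(per_group):
    """Difference between the best and worst per-group accuracy."""
    vals = list(per_group.values())
    return max(vals) - min(vals)

# Toy predictions (hypothetical) for two groups of four patients each.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

per_group = group_accuracy(y_true, y_pred, groups)
overall = float((np.asarray(y_true) == np.asarray(y_pred)).mean())
gap = accuracy_gap(per_group)
```

Computing the same quantities for a model trained before and after boosting shows whether a smaller gap comes at the cost of overall accuracy.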


If you are interested in this topic, RADIANT-CERSI would like to hear from you.
Please answer the questions below and leave your contact details to continue the conversation.

 