Generative Models for Augmenting Limited Sample or Biassed Healthcare Data

15 Apr

This article may be relevant to your organisation if you work in rare disease areas where sample augmentation strategies could be applied.

Generative AI models have great potential for boosting clinical datasets that are limited in sample size or that exhibit sampling bias. Recent advances in synthetic data generation have enabled high-fidelity synthetic data that are both statistically and clinically indistinguishable from real patient data. Other experimental work has demonstrated that synthetic data generation methods can be used to selectively boost samples from underrepresented groups. This project will assess how boosting methods can augment data from small trials by exploiting linked population datasets and/or clinical expertise. It will also provide a perspective on the opportunities, limitations and risks of boosting data when assessing AI models and decision-making.

Aim - to explore how data boosting approaches can be used to augment existing health datasets in different innovator settings

Objectives:

Clinical trials self boosting versus data linkage / population boosting
Assessment of fairness and performance tradeoffs

If you are interested in this topic, RADIANT-CERSI would like to hear from you.
Please answer the questions below and leave your contact details to continue the conversation.

Courtney Lea Evans Murdock https://clemva.com

A living systematic review of bias mitigation methods in Natural Language Processing for equitable healthcare AI

Bias in Geolocation and Sub-Population Models versus Foundational Global Models