Bias in Geolocation and Sub-Population Models versus Foundational Global Models


This article may be relevant to your organisation if you are working with localised and global models and assessing their performance and fit with your solution.

Synthetic Individual-Level Geospatial Data (SILGSD) offers several advantages in spatial epidemiology over census data or surveys conducted at regional or global level. It can bring a new dimension to the study of the patterns and causes of disease in a particular location while minimising the risk of disclosing patient identity, especially for rare conditions.

This project will explore the use of SILGSD by allocating synthetic patients to general practices (healthcare providers) in the UK, using the demographics and the prevalence of health conditions in each practice. The assigned general practice locations can then be used as proxies for patient locations. The project will examine how well this fine-grained simulated location data can mimic the distribution of health conditions across UK regions, and how models learnt at different levels of locality compare to models trained on a global scale: for example, whether models specialised on a local population demonstrate improved performance or risk overfitting.
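As a rough illustration of the allocation step, the Python sketch below assigns one synthetic patient to a practice with probability proportional to how many registered patients at that practice would match the patient's age band and condition status. It is a minimal sketch, not the project's method: the practice names, the numbers, the function names and the assumption that prevalence is independent of age band within a practice are all hypothetical.

```python
import random

# Hypothetical practice records (placeholder numbers, not real data):
# registered patients per age band and the recorded prevalence of one condition.
PRACTICES = {
    "practice_A": {"registered": {"0-39": 600, "40-64": 300, "65+": 100}, "prevalence": 0.02},
    "practice_B": {"registered": {"0-39": 200, "40-64": 300, "65+": 500}, "prevalence": 0.08},
}


def allocation_weight(practice, age_band, has_condition):
    """Expected number of registered patients matching the synthetic patient.

    Assumes (for illustration only) that prevalence does not vary by age band.
    """
    registered = practice["registered"][age_band]
    prevalence = practice["prevalence"]
    return registered * (prevalence if has_condition else 1.0 - prevalence)


def allocate(patient, practices, rng):
    """Draw one practice with probability proportional to its allocation weight."""
    names = sorted(practices)
    weights = [
        allocation_weight(practices[name], patient["age_band"], patient["has_condition"])
        for name in names
    ]
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
patient = {"age_band": "65+", "has_condition": True}
print(allocate(patient, PRACTICES, rng))
```

The location of the practice returned by allocate would then stand in for the synthetic patient's location.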


Aim: to investigate the performance of models based on localised data compared with global models, with a focus on COVID-19 modelling.


Objectives:

  • Using synthetic geolocation data to build and assess localised models of COVID-19 around the UK

  • Comparing the performance of these localised models to global England-based foundational models

  • Carrying out a full fairness assessment of the models to explore the impact of differences in demographics
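One way the comparison and fairness objectives could be operationalised is sketched below in Python: accuracy is computed per region for a localised and a global model, and the largest accuracy gap between demographic groups is reported as a simple fairness indicator. This is only an assumed setup. The record fields, the four toy records and the choice of accuracy gap as the fairness measure are hypothetical placeholders, not project data or results.

```python
from collections import defaultdict


def accuracy_by(records, key, pred_field):
    """Accuracy of `pred_field` against `label`, grouped by the value of `key`."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for record in records:
        group = record[key]
        totals[group] += 1
        hits[group] += int(record[pred_field] == record["label"])
    return {group: hits[group] / totals[group] for group in totals}


def max_gap(scores):
    """Largest difference in score between any two groups."""
    return max(scores.values()) - min(scores.values())


# Toy placeholder records: true label plus predictions from a hypothetical
# localised model and a hypothetical global model.
RECORDS = [
    {"region": "North", "age_band": "65+", "label": 1, "local_pred": 1, "global_pred": 0},
    {"region": "North", "age_band": "0-39", "label": 0, "local_pred": 0, "global_pred": 0},
    {"region": "South", "age_band": "65+", "label": 1, "local_pred": 1, "global_pred": 1},
    {"region": "South", "age_band": "0-39", "label": 0, "local_pred": 1, "global_pred": 0},
]

for model in ("local_pred", "global_pred"):
    per_region = accuracy_by(RECORDS, "region", model)
    per_age = accuracy_by(RECORDS, "age_band", model)
    print(model, per_region, "age-band accuracy gap:", max_gap(per_age))
```

A fuller assessment would likely use held-out data and several metrics per group; the sketch only shows the shape of the per-region and per-group breakdown.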


If you are interested in this topic, RADIANT-CERSI would like to hear from you.
Please answer the questions below and leave your contact details to continue the conversation.

 