Optimising testing for disease surveillance with machine learning

A new machine-learning-informed strategy could help public health leaders design better disease surveillance during an outbreak, a new study has found.

Research Associate Mengyan Zhang and Associate Professor Seth Flaxman are co-authors of a new paper in Proceedings of the National Academy of Sciences that may help optimise testing strategies for infectious disease surveillance.

Alongside Professor Moritz Kraemer and Professor Oliver Pybus of the University of Oxford Pandemic Sciences Institute (PSI), the study's authors included researchers from the University of Oxford Department of Biology as well as colleagues from the Oxford Martin Programme on Pandemic Genomics, Imperial College London, Royal Veterinary College and University of California, Los Angeles.

When epidemics and pandemics occur, screening the population for infection is essential to understand how disease is spreading. However, testing resources are finite and tests should be allocated to maximise the information gained about disease distributions. 

The study proposes a novel machine learning strategy (‘policy’), Selection by Local-Entropy (LE), to guide the selection of testing sites. When tested in a range of simulated outbreak scenarios, LE mostly outperformed other testing policies considered by the authors. 

The framework created by this study will allow researchers and policymakers to design surveillance systems for infectious diseases more adaptively.

Methods  

Active Learning (AL) is an iterative form of machine learning that aims to maximise a model’s performance by strategically selecting the most informative data points that need labelling. 

The new study tested eight AL policies, including LE, exploring the performance of different test allocation strategies in simulated outbreak scenarios. 

In real-life outbreaks, the choice of testing policy should depend on resources and budget as well as on the structure and stage of the outbreak. When resources are constrained, the study argues, frequent exploratory testing yields better results because it targets locations at the periphery of the outbreak, where predictions are most uncertain.

“Our active learning strategy is designed to effectively explore local uncertainties within the mobility network. By leveraging Selection by Local-Entropy, we address a balance between exploitation and exploration, which enables more efficient and targeted testing given limited testing resources.”
Research Associate Mengyan Zhang

The infection status of an initial node in an outbreak model was revealed; then AL policies were iteratively deployed to determine which nodes needed to be labelled (in this case tested) as infected or not infected. The goal of the exercise was to maximise the model’s predictive performance while using the least amount of labelled data. 

In real outbreak scenarios, this would mean testing locations in a way that would minimise the resources used while still providing an accurate picture of how disease is spreading. 
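
The iterative procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's implementation: the function names, the four-location network and the `predict` function are all hypothetical. In particular, `predict` is a deliberately crude stand-in for an outbreak model, and the selection rule shown is a plain uncertainty policy rather than LE.

```python
def predict(neighbours, labels):
    """Toy stand-in for an outbreak model (an assumption, not the study's model).

    Tested nodes keep their known status. An untested node is assigned the mean
    status of its tested neighbours, or 0.5 if none of them has been tested.
    """
    probs = {}
    for node, nbs in neighbours.items():
        if node in labels:
            probs[node] = float(labels[node])
        else:
            known = [labels[nb] for nb in nbs if nb in labels]
            probs[node] = sum(known) / len(known) if known else 0.5
    return probs


def most_uncertain(probs, labels):
    """Plain uncertainty policy: pick the untested node predicted closest to 0.5."""
    untested = [n for n in probs if n not in labels]
    return min(untested, key=lambda n: abs(probs[n] - 0.5))


def run_active_learning(neighbours, truth, seed, budget, policy=most_uncertain):
    """Reveal the seed node, then spend `budget` tests one at a time."""
    labels = {seed: truth[seed]}
    for _ in range(budget):
        if len(labels) == len(neighbours):
            break  # every location has already been tested
        probs = predict(neighbours, labels)
        node = policy(probs, labels)
        labels[node] = truth[node]  # "run a test" at the chosen location
    return labels, predict(neighbours, labels)


# Hypothetical chain of four locations: A - B - C - D (1 = infected, 0 = not).
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
truth = {"A": 1, "B": 1, "C": 0, "D": 0}
labels, probs = run_active_learning(neighbours, truth, seed="A", budget=2)
```

Swapping a different function in for `policy` is how alternative test allocation strategies would be compared in a sketch like this one.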

Adapting testing approaches  

The newly developed LE is an uncertainty-based policy, meaning it selects nodes for testing based on the uncertainty of the outbreak model’s predictions. The more uncertain the prediction for a given node, the more informative testing that node is likely to be.

Unlike other policies of this type, LE considers not only the uncertainty of each node it might select for testing but also the uncertainty of the nodes connected to it.
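
One way to picture the idea is the following Python sketch. It rests on assumptions that are not taken from the paper: uncertainty is measured as the Shannon entropy of a predicted infection probability, and a node's local score is the unweighted sum of its own entropy and that of its direct neighbours. The function names and the example network are hypothetical, and the study's exact definition of local entropy may differ.

```python
import math


def binary_entropy(p):
    """Shannon entropy, in bits, of a predicted infection probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain prediction carries no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def local_entropy_scores(probs, neighbours):
    """Score each node by its own entropy plus that of its direct neighbours.

    The unweighted sum is an illustrative assumption, not the paper's formula.
    """
    scores = {}
    for node, p in probs.items():
        total = binary_entropy(p)
        for nb in neighbours.get(node, ()):
            total += binary_entropy(probs[nb])
        scores[node] = total
    return scores


def select_node(probs, neighbours, tested):
    """Pick the untested node with the highest local score."""
    scores = local_entropy_scores(probs, neighbours)
    candidates = [n for n in probs if n not in tested]
    return max(candidates, key=lambda n: scores[n])


# Hypothetical chain of four locations: A - B - C - D, with A already tested.
probs = {"A": 1.0, "B": 0.5, "C": 0.5, "D": 0.1}
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
chosen = select_node(probs, neighbours, tested={"A"})
```

In this example B and C are equally uncertain on their own, so a policy that looks only at the candidate node cannot separate them. The local score favours C, because C's neighbours are themselves less certain than B's.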

“Our work opens up exciting new avenues for future research. By building on the framework we've developed, we hope to explore how surveillance policies can be tailored for specific pathogens with unique transmission characteristics, such as varying incubation periods or different modes of transmission. Ultimately, our goal is to develop a framework that will provide actionable insights and recommendations in real-time, enabling policymakers to respond more effectively during emerging outbreaks.”
Joseph Tsui, DPhil student in the Department of Biology and Oxford Martin School Programme on Pandemic Genomics

“Data and robust understanding of the transmission process early in epidemics is essential for effective public health policies. Our study provides a step towards more rational implementation of public health policies.”
Professor Kraemer

Read the study in Proceedings of the National Academy of Sciences.