I have recently graduated with a PhD from the University of Cambridge, where I worked on machine learning problems for healthcare as part of the AI group in the computer science department. I am also a final-year medical student, and I will become an (academic) foundation doctor at West Suffolk and Addenbrooke’s Hospitals from August 2023.
I previously intercalated in Part IIA Engineering.
Download my CV.
MB BChir (in progress), 2023
University of Cambridge
PhD in Machine Learning for Healthcare, 2022
University of Cambridge
BA in Engineering/Preclinical Medicine, 2016
University of Cambridge
My work focuses on predicting patient outcomes in the Intensive Care Unit (ICU). When designing my deep learning models, I am often inspired by my knowledge of clinical decision making.
For example, for time series processing in Electronic Health Records (EHRs), I use temporal and pointwise convolution to efficiently extract patient trajectories over time – a method inspired by clinicians.
I am also working on using graph neural networks to link the experiences of similar patients. The rationale is that when clinicians make decisions they will typically lean on their past experience, especially if they are dealing with a rare disease.
If you’re interested in the kind of stuff I do, follow me on Twitter!
My thesis focuses on representation learning for patients in intensive care, aiming to improve patient outcomes and healthcare system efficiency. It addresses predicting patient deaths and estimated discharge dates, essential for managing hospital beds effectively. The research incorporates clinical knowledge, periodic signals, systematic biases, and graph neural networks to enhance length of stay prediction, mortality prediction, and patient outcome models for mechanically ventilated patients, with the goal of discovering hidden patient phenotypes and creating real-world deployable representations.
We trained different time series models to embed medical time series data from mechanical ventilation episodes, and then we clustered these to uncover hidden patient subtypes in the data.
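As a rough illustration of the clustering step only: the sketch below groups fixed-length patient embeddings with a plain k-means loop. The choice of k-means, the deterministic initialisation, and the names `kmeans` and `sq_dist` are assumptions made for this example; the text above does not say which clustering method was used, and the embeddings here are toy values rather than outputs of a trained time series model.

```python
from typing import List, Tuple

Vector = List[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Vector], k: int, iterations: int = 10) -> Tuple[List[int], List[Vector]]:
    """Plain k-means (an assumed choice of clustering method for this sketch)."""
    # Deterministic initialisation: use the first k points as starting centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each embedding joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(vals) / len(members) for vals in zip(*members)]
    return labels, centroids


# Toy "patient embeddings" forming two obvious groups.
embeddings = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, centroids = kmeans(embeddings, k=2)
```

In practice the embeddings would come from the trained time series model, and the number of clusters would need to be chosen and validated clinically.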
Our model, Temporal Pointwise Convolution (TPC), is specifically designed to mitigate common challenges with Electronic Health Records, such as skewness, irregular sampling and missing data. We have achieved significant performance benefits of 18-68% over both the Long Short-Term Memory (LSTM) network and the Transformer.
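To make the two building blocks concrete, here is a minimal sketch of a temporal convolution and a pointwise convolution applied to a small multivariate time series. It is a simplified illustration under stated assumptions, not the TPC implementation: the function names (`temporal_conv`, `pointwise_conv`, `tpc_layer`), the causal padding, the per-feature kernels, and the concatenation of the two outputs are choices made for this example.

```python
from typing import List

Series = List[List[float]]  # indexed as [time step][feature]


def temporal_conv(x: Series, kernels: List[List[float]]) -> Series:
    """Causal 1D convolution over time, with a separate kernel per feature.

    kernels[f][j] weights the value of feature f at time t - j, so each
    feature's trajectory is processed without mixing in other features.
    """
    n_steps, n_features = len(x), len(x[0])
    out = [[0.0] * n_features for _ in range(n_steps)]
    for f in range(n_features):
        kernel = kernels[f]
        for t in range(n_steps):
            # Steps before the start of the series are treated as missing (skipped).
            out[t][f] = sum(kernel[j] * x[t - j][f] for j in range(len(kernel)) if t - j >= 0)
    return out


def pointwise_conv(x: Series, weights: List[List[float]]) -> Series:
    """1x1 convolution: mixes features within each time step independently.

    weights is indexed as [output feature][input feature].
    """
    return [[sum(w * v for w, v in zip(row, step)) for row in weights] for step in x]


def tpc_layer(x: Series, kernels: List[List[float]], weights: List[List[float]]) -> Series:
    """Run both convolutions on the same input and concatenate per time step."""
    temporal = temporal_conv(x, kernels)
    pointwise = pointwise_conv(x, weights)
    return [t + p for t, p in zip(temporal, pointwise)]


# Three time steps, two features (toy values).
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
features = tpc_layer(x, kernels=[[1.0, 1.0], [0.5, 0.0]], weights=[[1.0, -1.0]])
```

The temporal branch looks along each feature's history, while the pointwise branch combines features at a single moment in time; a full model would stack several such layers with learned weights and non-linearities.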
Our model, LSTM-GNN, is designed to take advantage of similarity between patients in the EHR (established using their diagnoses). First, it processes the time series data for each patient with the LSTM component, before sharing information within the neighbourhood of patients via the GNN. This is an alternative way of presenting diagnosis information (the common approach is to use an encoder in the late stages of a model). We found that using both methods together gives the best performance.
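The two stages described above can be sketched roughly as follows: build a patient graph from shared diagnoses, then let each patient's embedding absorb information from its neighbours. This is an illustrative sketch under assumptions, not the LSTM-GNN implementation: the Jaccard similarity, the threshold, the mean aggregation, and the names `jaccard`, `build_graph` and `aggregate` are all hypothetical, and the per-patient embeddings are toy vectors standing in for LSTM outputs.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two diagnosis sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(diagnoses: Dict[str, Set[str]], threshold: float) -> Dict[str, List[str]]:
    """Connect two patients when their diagnosis overlap reaches the threshold."""
    ids = sorted(diagnoses)
    neighbours: Dict[str, List[str]] = {p: [] for p in ids}
    for i, p in enumerate(ids):
        for q in ids[i + 1:]:
            if jaccard(diagnoses[p], diagnoses[q]) >= threshold:
                neighbours[p].append(q)
                neighbours[q].append(p)
    return neighbours


def aggregate(embeddings: Dict[str, List[float]],
              neighbours: Dict[str, List[str]]) -> Dict[str, List[float]]:
    """One round of message passing: average a patient with its neighbours."""
    out = {}
    for p, nbrs in neighbours.items():
        group = [embeddings[p]] + [embeddings[q] for q in nbrs]
        out[p] = [sum(vals) / len(group) for vals in zip(*group)]
    return out


# Toy data: patients "a" and "b" share most diagnoses, "c" shares none.
diagnoses = {"a": {"sepsis", "aki"}, "b": {"sepsis", "aki", "ards"}, "c": {"stroke"}}
embeddings = {"a": [1.0, 0.0], "b": [3.0, 2.0], "c": [5.0, 5.0]}  # stand-ins for LSTM outputs
graph = build_graph(diagnoses, threshold=0.5)
shared = aggregate(embeddings, graph)
```

Patients with no sufficiently similar neighbours keep their own embedding unchanged, while similar patients are pulled towards each other; a trained GNN would replace the simple mean with learned transformations.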
*My ML4H reviews were explicitly recognised as excellent by metareviewers