Like any enquiring scientist – and especially one recently diagnosed with cancer – Barzilay had a host of questions for her oncologist, for which there were no answers. She discovered that only 3% of the 1.7 million people diagnosed with cancer in the USA each year were enrolled in clinical trials, and yet current treatment protocols relied exclusively on data drawn from this small percentage of patients. Barzilay said, “We need treatment insights from the other 97% receiving cancer care.”
One of her first projects, recently published in Breast Cancer Research and Treatment, used machine learning to parse breast pathology reports. She collaborated with clinicians at Massachusetts General Hospital (MGH) to train a machine-learning model that uses natural language processing to extract tumour characteristics from pathology reports. This enabled them to build a large database of more than 90,000 pathology reports that could be searched for various cancer-related attributes. The model was tested on 500 patient reports that did not overlap with the training set and achieved an accuracy of 97%, demonstrating its potential to help physicians identify appropriate treatment protocols for individual patients.
Recognising that mammograms contain information that may be difficult for a radiologist to decipher – while computers are adept at detecting subtle changes – Barzilay has begun collaborating with Constance Lehman, head of breast imaging at MGH. Using deep learning to automate the analysis of mammograms, they aim to compute breast density and other scores that are currently assessed manually. Lehman and Barzilay have two ambitious goals in mind: to identify patients who are likely to develop a tumour before it can be seen on the mammogram, and to predict which patients are likely to experience a recurrence after initial treatment.
Regina Barzilay is an inspiration to her students and believes in their innovative spirit: “There is so much to do, and we are just getting started.”