A multidisciplinary group of researchers in Texas has just published an original article in Cancer entitled “Correlating mammographic and pathologic findings in clinical decision support using natural language processing and data mining methods.” Senior author Stephen Wong told AuntMinnie.com: “Obamacare calls for implementation of electronic health records, and once that raw data is in the digital archive, we can start mining it more cost-effectively to improve decision-making and patient care. Mammography lends itself to overdiagnosis and unnecessary biopsy, so artificial intelligence (AI) techniques such as machine learning and natural language processing (NLP) could help reduce biopsies by providing a more accurate clinical picture.”
The authors developed an NLP software algorithm to automatically extract mammographic and pathologic findings from free-text reports, then analyzed the correlation between imaging features and breast cancer subtypes. Using clinical data from the Houston Methodist Cancer Center, the NLP algorithm identified 543 patients with BI-RADS 5 mammograms who had invasive carcinoma. These cases were also classified into three breast cancer subtypes: estrogen receptor-positive (ER+); human epidermal growth factor receptor 2-positive (HER2+); and triple-negative breast cancer (TNBC).
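To make the idea of extracting findings from free-text reports more concrete, here is a minimal Python sketch. It is not the authors' algorithm, whose internals are not described here. It assumes a simple keyword and regular-expression approach, and the term lists, function names (`extract_findings`, `tabulate`) and the two sample reports are all hypothetical and invented purely for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical vocabularies (assumption): the published system's lexicon
# is not given in the article, so these short lists are illustrative only.
MARGIN_TERMS = ("spiculated", "circumscribed", "obscured", "indistinct")
CALC_TERMS = ("heterogeneous", "amorphous", "pleomorphic", "punctate")

# Matches forms such as "BI-RADS 5", "BIRADS category: 4", "BI-RADS: 5".
BIRADS_RE = re.compile(r"BI-?RADS\s*(?:category\s*)?:?\s*([0-6])", re.IGNORECASE)


def extract_findings(report: str) -> dict:
    """Pull a BI-RADS category and descriptor terms out of one report."""
    text = report.lower()
    match = BIRADS_RE.search(report)
    # A margin term counts if it appears as a whole word anywhere in the text.
    margins = [t for t in MARGIN_TERMS if re.search(rf"\b{t}\b", text)]
    # A calcification term counts only if "calcification" follows it
    # within the same sentence (no full stop in between).
    calcs = [
        t for t in CALC_TERMS
        if re.search(rf"\b{t}\b[^.]*calcification", text)
    ]
    return {
        "birads": int(match.group(1)) if match else None,
        "margins": margins,
        "calcifications": calcs,
    }


def tabulate(records):
    """Count extracted findings per subtype.

    records: iterable of (subtype, report_text) pairs.
    Returns a dict mapping subtype -> Counter of finding labels.
    """
    table = defaultdict(Counter)
    for subtype, report in records:
        findings = extract_findings(report)
        for term in findings["margins"]:
            table[subtype][f"margin:{term}"] += 1
        for term in findings["calcifications"]:
            table[subtype][f"calcification:{term}"] += 1
    return table


# Synthetic example reports (not real patient data).
SAMPLE = [
    ("ER+", "Irregular mass with spiculated margins. BI-RADS 5."),
    ("HER2+", "Grouped heterogeneous calcifications, left breast. BI-RADS category: 5"),
]

if __name__ == "__main__":
    for subtype, counts in tabulate(SAMPLE).items():
        print(subtype, dict(counts))
```

A sketch like this ignores negation ("no spiculated margins"), synonyms and misspellings, which is why a production system would need far more than keyword matching. It only shows the general shape of the pipeline: turn free text into structured findings, then count those findings by subtype.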
As seen above right (© American Cancer Society), the algorithm showed that patients with ER+ tumours had a higher percentage of spiculated margins, while women with HER2+ tumours were more likely to have heterogeneous calcifications. The NLP algorithm’s predictions were confirmed by manual review, leading the authors to conclude that their tool could be used to drive evidence-based biopsy decision-making. According to Wong (seen below left), “This software intelligently reviews millions of records in a short amount of time, enabling us to determine breast cancer risk more efficiently using a patient’s mammogram. This has the potential to reduce unnecessary biopsies.”
It is worth noting that the AI approach used here differs from computer-aided diagnosis (CAD), which emerged fifteen years ago when full-field digital mammography (FFDM) began to replace film-based analogue mammograms. The NLP algorithm incorporates many more factors than CAD and takes advantage of features embedded in both structured medical data and free-form clinical notes.
Wong and his team are optimistic that AI will assist radiologists in the future, enabling them to deliver an improved service to both referring clinicians and their patients. Their next focus will be to classify BI-RADS 4 patients based on more than 10,000 text reports and image features. The widespread application of AI to map breast cancer risk would appear to be within reach.