The introduction of full-field digital mammography (FFDM) as a replacement for plain-film analogue mammography started over two decades ago. Shortly thereafter, computer-aided detection (CAD) algorithms emerged, and it was predicted that these new tools would soon lead to a significant improvement in the performance of breast-screening programmes. So confident was Hologic, a leading manufacturer of FFDM systems, that it spent $220 million in July 2006 to acquire R2 Technology, maker of the first CAD system approved by the FDA. However, CAD never delivered on its early promise, leading radiologists to question the utility of these algorithms.
In the past few years, CAD systems have been replaced by algorithms based on artificial intelligence (AI) and, as seen at the recent congresses in Chicago and Vienna, AI is now one of the “hottest” topics in radiology. So, how are these new algorithms performing in breast-screening programmes? Two papers published this month suggest that AI may be delivering on its nascent potential.
A group based at Radboud University in the Netherlands published their findings in the Journal of the National Cancer Institute, comparing the performance of Transpara, a commercial AI system that uses deep learning based on convolutional neural networks, with that of experienced radiologists. Datasets were gathered from seven different countries, with images acquired by FFDM systems from four different vendors, yielding 2,652 exams of which 653 were malignant.
The performance of the AI system was statistically equivalent to that of the average of 101 radiologists. In fact, as seen at left (© JNCI), the AI system, with an area under the curve (AUC) equal to 0.840, performed slightly better than the radiologists (AUC = 0.814). Senior author Ioannis Sechopoulos commented: “It was exciting to see these systems have reached the level of matching performance of not just radiologists, but of radiologists who spend at least a substantial portion of their time reading screening mammograms.”
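For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen malignant exam receives a higher suspicion score than a randomly chosen non-malignant one. The short Python sketch below illustrates that definition on made-up scores; the function name `auc` and the toy data are assumptions for illustration only and are not taken from either study.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed by pairwise comparison.

    scores: suspicion scores, higher meaning more suspicious.
    labels: 1 for a malignant exam, 0 for a non-malignant exam.
    Returns the fraction of (malignant, non-malignant) pairs in which the
    malignant exam has the higher score; ties count as half a win.
    """
    malignant = [s for s, y in zip(scores, labels) if y == 1]
    normal = [s for s, y in zip(scores, labels) if y == 0]
    if not malignant or not normal:
        raise ValueError("need at least one exam of each class")

    wins = 0.0
    for m in malignant:
        for n in normal:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(malignant) * len(normal))


# Hypothetical toy data: three malignant and four non-malignant exams.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]

print(round(auc(toy_scores, toy_labels), 3))  # 10 of 12 pairs ranked correctly
```

On this scale, 0.5 corresponds to guessing and 1.0 to perfect separation of malignant from non-malignant exams.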
A multi-national group led by scientists from the USA, UK and Poland published their findings on arXiv.org. Their deep learning AI algorithm (AUC = 0.876) performed better than the human readers (AUC = 0.778), while a hybrid approach, combining radiologist and AI, led to the best performance (AUC = 0.891). The authors wrote: “These results suggest our model can be used as a tool to assist radiologists in reading breast cancer screening exams and that it captured different aspects of the task compared to experienced radiologists.” The evidence is growing: it would appear that AI is here to stay.