The medical field is often considered to be at the forefront of the AI revolution. Many well-known enterprises in artificial intelligence, such as Google’s DeepMind, claim to be working intensively on medical applications, asserting that “artificial intelligence is expected to change the existing medical landscape.” But how much influence has AI actually had so far? Can we identify the specific medical areas that benefit from these new technologies?
At the ACM CHI conference on Human Factors in Computing Systems held in May this year, Carrie J. Cai of Google presented her award-winning work on human-centered tools for coping with imperfect AI algorithms in the medical decision-making process, and argued that machine learning will be used more and more in medical decisions. She developed a new system that lets doctors refine and modify pathological-image searches on the fly, continuously improving retrieval accuracy.
Using deep visual models to refer to the medical images of previously diagnosed patients (such as biopsy tissue) when diagnosing a new patient is a promising approach. However, it is a great challenge for existing systems to retrieve precisely the similar images a doctor needs during a specific diagnosis, because of the “intention gap”: the difficulty of capturing the doctor’s exact intent. We will discuss this issue in detail later.
Cai’s research shows how refinement tools built on top of a medical image retrieval system can improve diagnostic accuracy. More importantly, they increase doctors’ trust in machine-learning-assisted medical decision-making. In addition, the survey results show that doctors could understand the strengths and weaknesses of the underlying algorithm and find and correct the system’s errors on their own. Overall, medical experts are optimistic about the future of AI systems that assist medical decision-making.
Over the past two decades or so, as visual data on the web has become increasingly accessible, content-based image retrieval (CBIR) has become a hot area of computer vision research. Text-based image search has been criticized for its mismatch with visual content, so ranking results by visual similarity is considered important in many scenarios. Wengang Zhou et al. pointed out two key challenges for CBIR systems, which they called the “intention gap” and the “semantic gap.”
The so-called “intention gap” refers to the difficulty of understanding the user’s exact intent from the query alone, for example from keywords; Carrie J. Cai et al. raise this problem as well. Looking back at previous studies, querying by example image appears to be the most widely explored approach, evidently because an image conveys rich query information conveniently. But this requires extracting accurate features from the image, which brings us to the second challenge: the semantic gap.
The semantic gap refers to the difficulty of describing high-level semantic concepts with low-level visual features. After many years of research, there have been significant breakthroughs on this problem, such as the introduction of invariant local visual features (SIFT) and the bag-of-visual-words (BoW) model. More recently, learned feature extractors such as deep convolutional neural networks (CNNs) have opened up many new research directions and can be applied directly to the semantic gap in CBIR systems. These techniques are a marked improvement over hand-crafted feature extractors and have shown strong potential in semantics-aware retrieval applications.
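To make the pre-deep-learning approach concrete, a bag-of-visual-words image signature can be sketched as follows. This is a toy illustration with made-up descriptors and a two-word vocabulary; real systems cluster thousands of SIFT descriptors with k-means to build the vocabulary:

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Bag-of-visual-words: assign each local descriptor (e.g. SIFT) to its
    nearest 'visual word' and count word frequencies for the image."""
    # Pairwise L2 distances between descriptors (m, d) and words (k, d).
    dists = np.linalg.norm(descriptors[:, None, :] - vocabulary[None, :, :], axis=2)
    words = np.argmin(dists, axis=1)          # nearest word per descriptor
    hist = np.bincount(words, minlength=len(vocabulary))
    return hist / hist.sum()                  # normalized histogram = image signature

vocab = np.array([[0.0, 0.0], [10.0, 10.0]])  # toy 2-word vocabulary
desc = np.array([[0.1, 0.2], [9.8, 10.1], [0.0, 0.1], [10.2, 9.9]])
print(bow_histogram(desc, vocab))             # [0.5, 0.5]
```

Two images can then be compared by the distance between their histograms rather than raw pixels, which is what made BoW a practical retrieval signature.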
In the system’s embedding computation module, shown in Fig. 2, a convolutional neural network (CNN) serves as the feature extractor. The system compresses each image’s information into a numerical feature vector (also known as an embedding): every image in the database is passed through the pre-trained CNN and its vector is computed and stored. At query time, the same CNN computes a vector for the query image, which is compared against the stored vectors to retrieve the most similar images.
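The retrieval step just described can be sketched in a few lines. This is a minimal illustration that assumes the embeddings are already computed; the CNN itself is stood in for by random toy vectors:

```python
import numpy as np

def retrieve_similar(query_emb, db_embs, k=3):
    """Return indices of the k database images closest to the query.

    query_emb: (d,) embedding of the query image.
    db_embs:   (n, d) precomputed embeddings of the database images.
    L2 distance is used, matching the comparison function in the system.
    """
    dists = np.linalg.norm(db_embs - query_emb, axis=1)
    return np.argsort(dists)[:k]

# Toy example: 5 "images" embedded in 4 dimensions.
rng = np.random.default_rng(0)
db = rng.normal(size=(5, 4))
query = db[2] + 0.01 * rng.normal(size=4)  # near-duplicate of image 2
top = retrieve_similar(query, db, k=2)
print(top)  # image 2 ranks first
```

In practice the database holds millions of vectors, so production systems replace the brute-force scan with an approximate nearest-neighbor index, but the idea is the same.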
In addition, Narayan Hegde et al. explain that the CNN architecture is based on the deep ranking network proposed by Jiang Wang et al., composed of convolutional layers, pooling layers, and concatenation operations. During training, images are fed in triplets: a reference image of a particular class, a second image of the same class, and a third image of an entirely different class. The loss function is designed so that the network embeds same-class images closer together than images of different classes. In this way, images from different classes help sharpen the similarity between embeddings of images from the same class.
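This training objective is commonly formalized as a triplet loss. A minimal numpy sketch follows; the exact loss in the deep ranking paper differs in its details, so treat this as illustrative only:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: push the anchor-positive distance to be
    at least `margin` smaller than the anchor-negative distance."""
    d_pos = np.sum((anchor - positive) ** 2)  # same-class distance
    d_neg = np.sum((anchor - negative) ** 2)  # different-class distance
    return max(0.0, d_pos - d_neg + margin)

anchor = np.array([0.0, 0.0])
positive = np.array([0.1, 0.0])  # same class: close to the anchor
negative = np.array([3.0, 0.0])  # different class: far away
print(triplet_loss(anchor, positive, negative))  # 0.0 — constraint already satisfied
```

When the constraint is violated (the negative sits closer than the positive), the loss is positive, and its gradient pulls same-class embeddings together while pushing different-class embeddings apart.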
They trained the network on large natural-image datasets (dogs, cats, trees, and so on) rather than on pathology images alone. After the network learned to distinguish similar natural images from dissimilar ones, the same framework was applied directly to feature extraction for pathology images. This approach, which strengthens a neural network when domain data are limited, is commonly known as transfer learning.
Narayan Hegde et al. state that the CNN feature extractor produces a 128-dimensional embedding vector for each image, with L2 distance chosen as the comparison function between vectors. The embeddings generated from all pathology slides were visualized with t-SNE, as shown in Fig. 3: (a) embeddings colored by organ site; (b) embeddings colored by histological feature.
In fact, similar deep ranking architectures and training techniques appear widely in the deep learning literature, for example in Siamese neural networks, and have even been applied to face recognition. Returning to CBIR, we have seen that deep learning can narrow the semantic gap: these learned methods can recognize important features even in complex natural images. So far we have examined CBIR systems and the potential of deep learning to overcome the semantic gap. But how applicable is CBIR in medicine, and can we clearly quantify its impact?
In 2002 alone, the radiology department of the University Hospital of Geneva produced more than 12,000 images a day, with the cardiology department the second-largest producer of digital images. The goal of a medical information system should be “to deliver the right information to the right person at the right time and place, so as to improve the quality and efficiency of the care process.” In clinical decision-making, therefore, case-based reasoning and evidence-based medicine stand to benefit from CBIR systems.
However sound the technology, these systems need further refinement for practical clinical use, especially in building trust between the system and doctors; Carrie J. Cai et al. make this point as well. Doctors can improve the system flexibly through relevance feedback, that is, by rating the results the system returns. Henning Müller et al. likewise affirm the importance of relevance feedback in an interactive setting for improving results and increasing the adaptability of CBIR systems.
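One classic way to act on relevance feedback in an embedding space is the Rocchio update from text retrieval, which moves the query vector toward results rated relevant and away from those rated irrelevant. This is a hypothetical sketch of the general technique, not the refinement mechanism Cai et al. actually built:

```python
import numpy as np

def refine_query(query, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update: shift the query embedding toward images the
    doctor marked relevant and away from those marked irrelevant.
    alpha/beta/gamma weights are conventional defaults, not tuned values."""
    new_q = alpha * query
    if len(relevant):
        new_q = new_q + beta * np.mean(relevant, axis=0)
    if len(irrelevant):
        new_q = new_q - gamma * np.mean(irrelevant, axis=0)
    return new_q

query = np.array([1.0, 0.0])
relevant = np.array([[1.0, 1.0]])     # result the doctor rated useful
irrelevant = np.array([[0.0, -1.0]])  # result the doctor rejected
print(refine_query(query, relevant, irrelevant))  # [1.75, 1.0]
```

Re-running the nearest-neighbor search with the refined vector then surfaces results closer to what the doctor meant, which is the essence of closing the intention gap interactively.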
Another focus is quantifying the impact of these systems, which is essential for their adoption and for progress in this research area. After a user study with 12 pathologists, Carrie J. Cai et al. report that their CBIR system made it easier for doctors to increase the system’s diagnostic utility. The results also show improved trust on the doctors’ part, raising the likelihood of future clinical adoption. Diagnostic accuracy, however, was not evaluated in the study (although experience suggests it remains unchanged), as it was beyond the study’s scope.
Looking ahead, it is clear that medical experts and AI system developers need to collaborate continuously to identify use cases and assess the impact of AI applications in health care. The research community should also focus on developing open test datasets and standard query benchmarks for CBIR systems, which would do much to advance the field.