General image classification tasks typically draw on hundreds of thousands, millions, or even tens of millions of images. By comparison, the amount of medical image data is very small. At the same time, the appearance of the esophagus varies considerably from image to image, owing to differences in equipment parameters, the doctor's photographing technique and angle, and lighting conditions. So how can we obtain a reliable and stable model under such conditions?
The answer starts with feature maps. A feature map is produced by a convolution kernel: convolving the original image with different kernels yields different feature maps, which you can think of as analyzing the picture from multiple perspectives. Different kernels extract different features, and training the model amounts to solving an optimization problem: finding the set of convolution kernels that best explains the data.
Within a single layer, we want to describe the image from multiple angles. Concretely, we convolve the image with several different kernels and take the responses to the different kernels (each kernel can be understood as one description) as features of the image. Together, they describe the image on different bases at the same level. Kernels in the lower layers are mainly simple edge detectors (which can also be understood as analogues of physiological simple cells).
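The idea of different kernels producing different feature maps can be sketched in a few lines. This is a minimal illustration, not the paper's actual network: a hand-written 2D cross-correlation applied with two classic edge-detector kernels, showing that the same image yields different responses (feature maps) under different kernels.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation: slide the kernel over the image."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A toy 5x5 "image" with a vertical edge down the middle.
image = np.array([
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)

# Two different kernels -> two different feature maps ("views" of the image).
vertical_edge = np.array([[1, 0, -1],
                          [1, 0, -1],
                          [1, 0, -1]], dtype=float)
horizontal_edge = vertical_edge.T

fmap_v = conv2d(image, vertical_edge)    # strong response at the vertical edge
fmap_h = conv2d(image, horizontal_edge)  # no horizontal edges -> all zeros
```

A real convolutional layer stacks many such kernels and learns their weights by optimization, but the mechanics per kernel are exactly this sliding dot product.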
After obtaining the esophageal data, how do we determine whether an esophagus is healthy or diseased? To judge that an esophagus is abnormal, we only need to find one lesion area. The converse does not hold: in a normal image, finding one normal-looking feature does not prove the esophagus is normal. We can only say that we found no abnormal features in this image, so it may be normal.
Therefore, between normal and abnormal features, we prefer to extract lesion features and suppress normal ones. Both pathological and normal images pass through the neural network to produce feature vectors; in these vectors, we want to highlight abnormal features as much as possible while driving normal features toward zero. How do we build this prior into the model? After remodeling along these lines, the final accuracy reached about 97%. The earlier models were relatively simple; the third model mainly distinguishes inflammation from cancer, which is a different problem from the first two.
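One simple way to encode "normal features should tend to zero" is an asymmetric regularizer. The following is a hypothetical sketch, not the authors' actual formulation: on top of an ordinary classification loss, an L1 penalty is applied only to the feature vectors of normal images, pushing their activations toward zero while leaving lesion activations free to grow.

```python
import numpy as np

def normal_suppression_penalty(features, labels, lam=0.1):
    """Hypothetical auxiliary loss term (illustrative, not from the source).

    features: (batch, dim) non-negative activations (e.g. after ReLU)
    labels:   (batch,) 1 = lesion image, 0 = normal image
    Returns an L1 penalty over the *normal* images' features only,
    so that training drives normal-image activations toward zero.
    """
    normal = features[labels == 0]
    if normal.size == 0:
        return 0.0
    return lam * np.abs(normal).mean()

feats = np.array([[1.0, 2.0],   # lesion image: activations left unpenalized
                  [0.5, 0.0]])  # normal image: activations penalized
labels = np.array([1, 0])
penalty = normal_suppression_penalty(feats, labels)  # 0.1 * mean(|0.5|, |0|)
```

The total training objective would then be the classification loss plus this term, making low activation on normal images part of what the network optimizes for.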
In general, images of esophageal lesions are accompanied by some inflammatory features, and the judgment of cancer often rests on a small textured region, so we need to extract more refined features. A better approach would be to have experts annotate the lesion area very carefully, so that we only need to recognize that area. But such annotation is very costly, so the data are extremely scarce. We have no annotated data for cancer regions, yet we need very fine-grained features. How can we resolve this contradiction?
Fortunately, although we cannot obtain accurately annotated images of lesion areas, we can relatively easily determine whether an image contains cancer, because we only need to link it to the patient's case record. In this way, we can obtain image-level (global) labels far more easily.
If an image contains cancer, there must be one or several regions that carry the characteristics of cancer. In other words, if we cut the image into several patches, one or more of those patches will contain cancer features. Based on this idea, we adopt a multiple-instance learning method. The idea is very simple: divide the image into patches, model each patch to estimate its probability of containing cancer, and finally use the patch with the highest cancer probability to decide whether the image as a whole contains cancer.
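The aggregation step described above can be sketched in a few lines. This is an illustrative max-pooling aggregation over patch scores, with made-up probabilities standing in for the output of a patch-level classifier.

```python
import numpy as np

def mil_image_score(patch_probs):
    """Multiple-instance learning aggregation: the image-level cancer
    probability is the maximum over its patch-level probabilities --
    a single positive patch is enough to flag the whole image."""
    return float(np.max(patch_probs))

# Hypothetical per-patch cancer probabilities from a patch-level model.
patch_probs = np.array([0.02, 0.10, 0.91, 0.05])

image_prob = mil_image_score(patch_probs)   # driven by the worst patch
image_label = int(image_prob > 0.5)         # 1 -> image contains cancer
```

Note the asymmetry this encodes, matching the reasoning earlier in the text: one abnormal patch makes the image abnormal, while an image is called normal only when every patch scores low.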
In the course of this work, we gradually accumulate accurately annotated data. The amount is very small, far too little to train a model on its own, but its features are the most accurate, since they have been manually verified and labeled. How can we apply this small amount of accurate data to cancer identification? This is a very interesting problem: if we can solve it, then even with only a small amount of annotated data we can keep improving. Our approach mainly adopts multi-task learning, which requires completing two tasks.
A supervised learning task is built on the annotated lesion-area data; for the data without lesion-area annotation, we set up the multiple-instance learning task described above. The two tasks share a feature extraction network, which must satisfy both tasks at once, so the accurately labeled features strengthen cancer recognition.
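The two-task structure with a shared backbone can be sketched as follows. This is a minimal stand-in, not the paper's architecture: a single linear-plus-ReLU layer plays the role of the shared feature extractor, and two task-specific heads (a supervised lesion classifier and a patch-scoring head for the multiple-instance task) both consume its output. All layer sizes and names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared feature extractor (stand-in for a CNN backbone).
W_shared = rng.normal(size=(16, 8))

# Task-specific heads attached to the *same* features.
W_supervised = rng.normal(size=(8, 2))  # head for the labeled lesion-area task
W_mil = rng.normal(size=(8, 1))         # head for the patch-level MIL task

def shared_features(x):
    return np.maximum(x @ W_shared, 0.0)  # ReLU features, reused by both heads

def supervised_logits(x):
    return shared_features(x) @ W_supervised

def mil_patch_scores(x):
    return shared_features(x) @ W_mil

# In training, the total loss would combine both tasks, e.g.
#   loss = loss_supervised + alpha * loss_mil
# so gradients from both objectives flow into W_shared, letting the scarce
# accurate annotations shape the features used for cancer recognition.

x = rng.normal(size=(4, 16))       # a batch of 4 illustrative inputs
logits = supervised_logits(x)      # (4, 2)
scores = mil_patch_scores(x)       # (4, 1)
```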
What is the purpose of auxiliary diagnosis? We hope that machines will eventually be able to diagnose diseases the way clinicians do. Before introducing the auxiliary diagnosis project, consider how a student grows into an expert: by taking many professional courses and reading a large amount of medical literature, a student accumulates a certain level of medical knowledge. Once that knowledge is sufficient, the student can practice in a hospital, where clinicians guide them in learning diagnostic skills on real cases.
Constructing a medical knowledge graph is the machine's counterpart of learning that knowledge. With the knowledge in place, the machine can learn the ability to diagnose, that is, to build models that discriminate between diseases; then, by playing against experts, it can continually improve its diagnostic level and gradually approach or even surpass them. Building the medical knowledge graph starts with processing text data, which falls into two categories: semi-structured data and unstructured data.
We divide a medical history into several parts, such as the patient's condition, the course of treatment, and the grounds for admission. Once the history is divided this way, each kind of information is refined and extracted, turning the unstructured text into structured text a computer can understand. We then transform this information into a medical knowledge graph and store it in the computer, so the computer acquires this knowledge.
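The pipeline above, free text to structured record to knowledge graph, can be sketched as follows. The field names, relation names, and the example sentence are all illustrative assumptions, not from the source; real extraction would use NLP models rather than a hand-filled dictionary.

```python
# Hypothetical fragment of unstructured medical history.
record_text = "Patient reports dysphagia for 2 months; endoscopy shows esophagitis."

# Step 1: information extraction yields a structured record
# (here filled by hand for illustration).
structured = {
    "symptoms": ["dysphagia"],
    "duration": "2 months",
    "findings": ["esophagitis"],
}

# Step 2: the structured record becomes (subject, relation, object) triples,
# the basic storage unit of a knowledge graph.
triples = []
for symptom in structured["symptoms"]:
    triples.append(("patient", "has_symptom", symptom))
for finding in structured["findings"]:
    triples.append(("patient", "has_finding", finding))
```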
The diagnosis process is as follows: first, the condition described in human language is transformed into structured knowledge the computer can understand. With this structured knowledge, the machine can understand the patient's situation and feed it to the disease diagnosis model, which outputs a ranked list of candidate diseases. That is roughly how the diagnosis model works.
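The final step, from structured symptoms to a ranked disease list, can be sketched with a toy matcher. The disease names and symptom sets below are made up for illustration, and a real diagnosis model would be learned rather than a simple overlap score.

```python
# Hypothetical disease -> symptom knowledge base (illustrative only).
knowledge_base = {
    "esophagitis": {"dysphagia", "heartburn", "chest pain"},
    "esophageal cancer": {"dysphagia", "weight loss", "chest pain"},
    "gastritis": {"nausea", "epigastric pain"},
}

def rank_diseases(patient_symptoms):
    """Score each disease by the fraction of its known symptoms that the
    patient's structured record matches, then return a ranked list."""
    scores = {
        disease: len(patient_symptoms & symptoms) / len(symptoms)
        for disease, symptoms in knowledge_base.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# A structured patient record, as produced by the extraction step.
candidates = rank_diseases({"dysphagia", "heartburn"})
```

The output is the "disease list" mentioned above: candidate diagnoses ordered from most to least compatible with the patient's structured symptoms.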