The guest featured in this issue is Li Xiaoxiao, a doctoral student in the Image Processing and Analysis Group (IPAG) at Yale University, advised by Professor James Duncan. Her current research focuses on deep learning algorithms for medical image analysis. Five of her papers (all first-author work) have been accepted at top conferences in medical imaging and neuroscience, winning a Best Abstract award, a student travel award, an IPMI scholarship, and other honors. She has also worked on deep learning research and development at Sony in Japan, the National Institute of Informatics in Japan, Siemens' medical division in the US, and J.P. Morgan AI Research.

As far as medical imaging is concerned, GNNs are used in many ways, so today we will introduce them to you through concrete examples.

Why use GNNs to study medical imaging? Because many medical images can be naturally modeled as graph structures. GNNs have been applied to vessel segmentation, surgical video analysis, multimodal fusion, disease prediction, brain parcellation, and brain connectivity analysis. Today's talk will walk through each of these applications in turn.

Image segmentation

First, let’s take a look at the work of image segmentation:

Interactive 3D Segmentation Editing and Refinement via Gated Graph Neural Networks

The graph convolution model proposed in this paper does not perform image segmentation from scratch; instead, it refines an existing coarse segmentation. The input is a rough segmentation of the image, whose boundary is not smooth and contains polygonal structures. The goal is to learn, via graph learning, how to predict the motion of the polygon's vertices so that the final segmentation becomes smoother and more accurate. The paper reports that this method improves the IoU metric by as much as 10%.

Their way of modeling the segmented image is as follows: first, there is a rough segmentation result, whose contour forms a polygon; one polygon is extracted from each slice of the 3D image. The rough segmentation is produced by existing algorithms; the GNN proposed in this paper focuses on the subsequent refinement. The modeling process is shown in the following figure:

In the green box, each vertex of the polygon is treated as a node of the graph. There are then three kinds of connections: the green arrows represent connections between immediately adjacent nodes; the blue arrows indicate connections between distant nodes; and the orange arrows indicate connections between the two nearest nodes on adjacent slices.

In addition, the graph in this study is directed. From the figure above, we can see that the adjacency matrix is split into two parts, incoming and outgoing: incoming edges point toward a node, and outgoing edges point from that node to others. The features of each node are extracted with ResNet-50. Finally, this graph is fed into a Gated GNN built on GRU units. The model has two output heads: one predicts whether a point has reached the boundary, and the other predicts where the point should move next. The movement prediction is an M×M matrix representing the range of motion centered on the point.
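To make the gated update concrete, here is a minimal NumPy sketch of one Gated-GNN step: messages are aggregated separately along incoming and outgoing edges, then node states are updated with a GRU-style cell. All weight names, shapes, and the random graph are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_graph_step(H, A_in, A_out, params):
    """One Gated-GNN update: aggregate along incoming and outgoing
    edges separately, then update node states with a GRU-style cell."""
    # messages from the two edge directions (hypothetical weight names)
    M = A_in @ H @ params["W_in"] + A_out @ H @ params["W_out"]
    z = sigmoid(M @ params["Wz"] + H @ params["Uz"])       # update gate
    r = sigmoid(M @ params["Wr"] + H @ params["Ur"])       # reset gate
    h_tilde = np.tanh(M @ params["Wh"] + (r * H) @ params["Uh"])
    return (1 - z) * H + z * h_tilde

n, d = 6, 8                                   # toy polygon: 6 vertices
H = rng.normal(size=(n, d))                   # per-vertex features
A_in = (rng.random((n, n)) < 0.3).astype(float)
A_out = A_in.T                                # outgoing = transposed incoming
params = {k: rng.normal(scale=0.1, size=(d, d))
          for k in ["W_in", "W_out", "Wz", "Uz", "Wr", "Ur", "Wh", "Uh"]}
H1 = gated_graph_step(H, A_in, A_out, params)
print(H1.shape)  # (6, 8)
```

The two prediction heads of the paper (boundary reached, M×M movement map) would sit on top of `H1`.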

Vessel segmentation

Linking Convolutional Neural Networks with Graph Convolutional Networks: Application in Pulmonary Artery-Vein Separation

The task of this work is to separate arteries and veins in CT images of pulmonary vessels. The input graph is built with traditional methods for vessel segmentation and branch extraction, which yield the nodes; edges only connect first-order neighbors, so the graph has very many nodes but is very sparse. How do they define node features? They extract a small 3D patch around each vertex and compute features from that patch.
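A toy sketch of this kind of graph construction: node features come from a 3D patch around each vertex, and edges are kept sparse by connecting only nearest neighbors. The volume, coordinates, and the simple statistics standing in for learned CNN features are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
volume = rng.random((64, 64, 64))          # toy CT volume
nodes = rng.integers(8, 56, size=(20, 3))  # toy branch-point coordinates

def patch_feature(vol, center, r=4):
    """Crop the 3D patch around a vertex and summarize it; in the paper
    a CNN extracts this feature end-to-end, here simple statistics stand in."""
    z, y, x = center
    p = vol[z - r:z + r, y - r:y + r, x - r:x + r]
    return np.array([p.mean(), p.std(), p.max()])

X = np.stack([patch_feature(volume, c) for c in nodes])

# sparse first-order adjacency: connect each node to its nearest neighbor
D = np.linalg.norm(nodes[:, None] - nodes[None, :], axis=-1)
np.fill_diagonal(D, np.inf)
nn = D.argmin(axis=1)
A = np.zeros((len(nodes), len(nodes)))
A[np.arange(len(nodes)), nn] = 1
A = np.maximum(A, A.T)                     # symmetrize
print(X.shape)  # (20, 3)
```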

A distinctive feature of their work is that the CNN and GNN are connected and trained end-to-end, which requires a large amount of memory when the input graph is big.

As shown in the following figure, the network on the right extracts node features. The GCN operations involved are fairly standard.
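For reference, the standard GCN propagation rule (Kipf & Welling) that such "traditional GCN operations" follow can be sketched as:

```python
import numpy as np

def gcn_layer(A, H, W):
    """Standard GCN propagation (Kipf & Welling):
    H' = relu( D^-1/2 (A + I) D^-1/2 H W )"""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric normalization
    return np.maximum(0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)

rng = np.random.default_rng(2)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)  # toy 3-node path
H = rng.normal(size=(3, 4))
W = rng.normal(size=(4, 2))
print(gcn_layer(A, H, W).shape)  # (3, 2)
```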

The following figure shows two results: the left one is better, the right one worse. What I want to emphasize is that this paper compares three models: a 3D CNN, a CNN-GCN model, and a CNN-GCN_T model. The suffix T indicates that the CNN is transferred from a pre-trained model; CNN-GCN_T performs best.

Surgical intervention

Graph Convolutional Nets for Tool Presence Detection in Surgical Videos

This work studies the detection of various surgical instruments. The authors note that in these surgical videos only a few frames are labeled, and the videos themselves are short. Traditional object detection considers only single-frame information, so they use a GCN to exploit spatial and temporal information simultaneously. Because the labeled clips are very short, it is difficult to model the temporal dependencies with an RNN, hence the choice of GCN. The paper is evaluated on two large public datasets.

Let’s take a look at its general framework:

As shown in the figure above, several consecutive video frames are fed into an Inflated 3D DenseNet-121. The paper modifies DenseNet; for details, please refer to the original text.

The specific calculation process is as follows:

The temporal pooling mentioned here is actually no different from the pooling we usually use; it is called temporal pooling only because the input graph is built from frames across time.
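A minimal sketch of such pooling across the frame axis (the shapes are invented for illustration):

```python
import numpy as np

def temporal_pool(frame_feats, mode="max"):
    """'Temporal pooling' here is ordinary pooling applied across the
    frame axis of a (frames, nodes, features) tensor."""
    if mode == "max":
        return frame_feats.max(axis=0)
    return frame_feats.mean(axis=0)

rng = np.random.default_rng(3)
feats = rng.normal(size=(8, 5, 16))  # 8 frames, 5 nodes, 16-dim features
pooled = temporal_pool(feats)
print(pooled.shape)  # (5, 16)
```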

image registration

Learning Deformable Point Set Registration with Regularized Dynamic Graph CNNs for Large Lung Motion in COPD Patients

Traditional image registration operates in the image domain. This paper argues that doing so is computationally expensive and slow. Instead, feature points on the image surface can be treated as two point sets to be registered, and a GNN can be used for this.

Generally speaking, registration computes a spatial transformation that aligns two images or two sets of feature points. In much of my own work on brain images, registration is usually the first step. Given two point sets, we can compute the transformation matrix through point-to-point registration.
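Point-to-point registration of matched point sets has a classical closed-form solution in the rigid case (the Kabsch/Procrustes method). A sketch of that baseline, separate from the paper's learned deformable approach:

```python
import numpy as np

def kabsch(P, Q):
    """Least-squares rigid alignment of matched 3D point sets (Kabsch):
    returns R, t such that R @ p + t ≈ q for corresponding rows."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)                 # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # avoid reflections
    R = Vt.T @ np.diag([1, 1, d]) @ U.T
    t = cQ - R @ cP
    return R, t

rng = np.random.default_rng(4)
P = rng.normal(size=(50, 3))                  # toy "moving" point set
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
Q = P @ R_true.T + np.array([1.0, -2.0, 0.5])  # rotated + translated copy
R, t = kabsch(P, Q)
err = np.abs((P @ R.T + t) - Q).max()
print(err < 1e-8)  # True
```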

The main contribution of this paper is the DGCNN module in the figure above. Its input is two point sets: the blue box is the fixed point set, and the orange box is the moving point set.

Simply take a look at the input required for the operation:

PF and PM are the fixed and moving point sets mentioned above, and their feature representations are obtained through the DGCNN module, whose architecture is shown in the network diagram above. Each point set has 4096 points, each with 3D coordinates as input features. After passing through layers including EdgeConv and linear layers, a 16-dimensional vector representation of each point is obtained. Here are the results:
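The EdgeConv operation at the heart of DGCNN can be sketched as follows; the single linear layer stands in for the shared MLP, and the point count and dimensions are toy values rather than the paper's 4096-point, 16-dimensional setup.

```python
import numpy as np

def edge_conv(X, k, W):
    """One EdgeConv step (DGCNN): for each point, find k nearest
    neighbours in feature space, apply a shared map to [x_i, x_j - x_i],
    and max-aggregate over the neighbourhood."""
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    np.fill_diagonal(D, np.inf)
    idx = np.argsort(D, axis=1)[:, :k]           # (n, k) neighbour indices
    xi = np.repeat(X[:, None], k, axis=1)        # (n, k, d) centre points
    xj = X[idx]                                  # (n, k, d) neighbours
    e = np.concatenate([xi, xj - xi], axis=-1)   # edge features
    h = np.maximum(0, e @ W)                     # shared "MLP" (one layer)
    return h.max(axis=1)                         # max over neighbours

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 3))     # toy point set
W = rng.normal(size=(6, 16))      # maps [x_i, x_j - x_i] to 16 dims
print(edge_conv(X, k=8, W=W).shape)  # (100, 16)
```

Because the neighbourhood is recomputed in feature space at each layer, the graph is "dynamic": later layers connect points that are semantically, not spatially, close.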

Multimodal fusion

Multimodal fusion studies how to combine medical images of different modalities through graph convolution.

Interpretable Multimodality Embedding of Cerebral Cortex Using Attention Graph Network for Identifying Bipolar Disorder

This is one of our own works. It combines structural magnetic resonance imaging (sMRI) and functional magnetic resonance imaging (fMRI) information of the brain to classify bipolar disorder patients versus healthy controls. The graph is constructed from the correlations of fMRI signals between different brain regions: each brain region is a vertex, and the correlation coefficient between regions is the edge weight. sMRI and fMRI are combined by stacking them in the features of each node.
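A toy sketch of this graph construction, with random arrays standing in for real fMRI time series and sMRI measures:

```python
import numpy as np

rng = np.random.default_rng(6)
T, R = 120, 10                     # time points, brain regions (toy sizes)
fmri = rng.normal(size=(R, T))     # one fMRI time series per region
smri = rng.normal(size=(R, 4))     # e.g. structural measures per region (toy)

# edge weights: pairwise correlation of regional fMRI signals
A = np.corrcoef(fmri)
np.fill_diagonal(A, 0)

# node features: each region's fMRI connectivity profile stacked
# with its sMRI measures (one plausible way to combine modalities)
X = np.concatenate([np.corrcoef(fmri), smri], axis=1)
print(A.shape, X.shape)  # (10, 10) (10, 14)
```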

Another interesting point of this work is the use of a weighted EGAT (edge-weighted graph attention network), shown as the attention layer in the figure above, because we want to know which functional brain regions have a greater impact on bipolar disorder. DiffPool is used for pooling.

On the left of the figure above are the visualization results of the attention map and node features; on the right are the parameter settings and comparative experiments. Overall, the combination of fMRI and sMRI performs best.

Disease prediction

Disease prediction using graph convolutional networks: Application to Autism Spectrum Disorder and Alzheimer’s disease

This is one of the earliest works applying GCNs to medical imaging. The main idea is to treat people as nodes in a graph and construct edges based on similarities in phenotypic data such as genes, sex, and age. The feature vector extracted from each person's brain image serves as the node feature. This is a semi-supervised learning problem: some nodes in the graph are labeled (diseased or not) and some are not, and the task is to predict whether the unlabeled people have the disease.
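A hedged sketch of how such a population graph might be built: phenotypic agreement (here a single binary attribute) gates a similarity of imaging features. The exact similarity measure and phenotypes used in the paper may differ.

```python
import numpy as np

def population_edges(pheno, img_feats, sigma=1.0):
    """Population-graph edges: a Gaussian similarity of imaging features,
    kept only between subjects whose phenotype (e.g. sex) agrees."""
    sim = np.exp(-np.linalg.norm(
        img_feats[:, None] - img_feats[None, :], axis=-1) ** 2
        / (2 * sigma ** 2))
    same = (pheno[:, None] == pheno[None, :]).astype(float)
    A = sim * same
    np.fill_diagonal(A, 0)
    return A

rng = np.random.default_rng(7)
sex = rng.integers(0, 2, size=20)      # toy phenotype per subject
feats = rng.normal(size=(20, 8))       # toy imaging feature vectors
A = population_edges(sex, feats)
print(A.shape)  # (20, 20)
```

A semi-supervised GCN then propagates the labels of diagnosed subjects over this graph to the unlabeled ones.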

The above work will not be discussed in detail here. Let’s take a look at another related work:

InceptionGCN: Receptive Field Aware Graph Convolutional Network for Disease Prediction

This work, like the previous one, treats people as nodes in a graph and uses semi-supervised learning to predict whether unlabeled people have the disease. What is worth mentioning is its algorithmic innovation, InceptionGCN. Usually the k-hop neighborhood considered in graph convolution is fixed; for example, GraphSAGE considers only the one-hop neighborhood. This paper proposes combining convolution kernels with different receptive fields: in the first dashed box of the figure below, K1 to KS denote kernels with different receptive-field sizes, whose outputs are then combined by an aggregator. The paper tries two aggregators: concatenating the features from all the kernels, or max pooling over them.
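A rough NumPy sketch of the multi-receptive-field idea: polynomial graph filters of increasing order see one more hop each, and their outputs are combined by concatenation or max pooling. The unnormalized Laplacian and unweighted filter terms make this schematic only, not the paper's exact Chebyshev filter.

```python
import numpy as np

def multi_field_filters(L, X, Ks):
    """Polynomial graph filters of increasing order K: each extra order
    enlarges the receptive field by one hop (Chebyshev-style recursion,
    without the learned per-order weights of the real layer)."""
    outs = []
    for K in Ks:
        Tk = [X, L @ X]
        for _ in range(2, K):
            Tk.append(2 * L @ Tk[-1] - Tk[-2])
        outs.append(sum(Tk[:K]))
    return outs

def aggregate(outs, mode="concat"):
    if mode == "concat":            # first aggregator: concatenation
        return np.concatenate(outs, axis=1)
    return np.maximum.reduce(outs)  # second aggregator: max pooling

rng = np.random.default_rng(8)
n, d = 12, 4
A = (rng.random((n, n)) < 0.3).astype(float)
A = np.maximum(A, A.T)
L = np.diag(A.sum(1)) - A           # (unnormalized) graph Laplacian
X = rng.normal(size=(n, d))
outs = multi_field_filters(L, X, Ks=[1, 2, 3])
print(aggregate(outs).shape, aggregate(outs, "max").shape)  # (12, 12) (12, 4)
```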

Interestingly, they found contradictory results in their experiments. They evaluated on the TADPOLE and ABIDE datasets, and found that InceptionGCN outperformed the baseline model on TADPOLE, but performed worse than the baseline on ABIDE.

So they visualized the input features with t-SNE and found that for the TADPOLE dataset the features of different node classes were quite separable, whereas for ABIDE they were much less so.

Is InceptionGCN simply unsuitable for graphs whose node features are not separable? To check, they ran some simulations, shown in the following figure:

The figure on the left is an easy case: the node features of different groups are clearly separable. The figure in the middle is hard to distinguish.

The results show that InceptionGCN does not perform well in the second case. This is a very interesting exploration: when choosing a model, we must first consider the data and select one appropriate for it.

Large scale medical image analysis

Large medical images are mainly histological images; a single histology image is usually at least several gigabytes. Traditional CNN-based algorithms cannot take the whole image as input, so patch-based analysis is used instead, but that easily ignores spatial relationships across the image. This is the original motivation for using GCNs to analyze large images.

CGC-Net: Cell Graph Convolutional Network for Grading of Colorectal Cancer Histology Images

The first work I want to discuss is from CVPR 2019 and uses a GCN to classify histological images.

Its graph construction method uses cell detection to obtain the nodes.

This is the overall framework of this work. If you are interested in details, you can look at the original text.

Another work on the application of GNN to large-scale medical image analysis is:

Weakly-and Semi-supervised Graph CNN for Identifying Basal Cell Carcinoma on Pathological Images

The task is to detect basal cell carcinoma in pathological images. Let's take a look at what they do.

The top row is the ground truth, i.e., the pathological patterns to be detected. The main idea is to obtain similar detections through patch-based analysis and a GNN. The network framework is as follows:

First, small image patches are fed into a pre-trained CNN model to obtain a vector representation of each patch. Given these patch representations, there are two settings: a weakly-supervised setting and a semi-supervised setting.

Both are quite simple; see the article for the specifics.

Brain segmentation

Graph convolutions on spectral embeddings for cortical surface parcellation.

This work differs from the direct application of existing GNNs: the authors propose a convolution in the spectral domain, as shown in the following figure:

After the input image passes through several spectral-domain convolution layers, the segmentation of the brain is obtained. The small figure on the right compares their method with others; it can be seen that their method retains many details and produces relatively smooth boundaries.

What is special about this spectral convolution? The paper notes that traditional spectral embedding can only be realized on the orthogonal grid space shown on the left; to realize spectral embedding on the surface shown on the right, all basis vectors must be converted to the same reference coordinates. The final spectral convolution formula is given in the original paper.

Look at the figure below to understand more intuitively how spectral convolution operates.

Brain connection

This is the last application, and it is my own work on using GNNs to analyze brain connectivity. The work has two goals: one is to classify different brain networks, typically patients versus non-patients; the other is to explore which brain-connectivity subnetworks are related to the disease.

The whole framework has two parts. The first part constructs brain networks and classifies them: the brain is divided into regions, each region is a node, and the edges between nodes are built from the correlations of the fMRI time signals between brain regions; the node features are hand-crafted. After this graph classification step, the second part aims to explain which subgraphs/nodes have predictive power: we feed candidate subgraphs into the trained GNN to find the subgraphs/nodes that are important for classification.
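The second step, testing subgraphs/nodes against the trained classifier, can be sketched as an occlusion-style importance test; the linear "classifier" below is only a stand-in for the trained GNN.

```python
import numpy as np

def node_importance(score_fn, A, X):
    """Occlusion-style test: zero out one node at a time and measure
    the drop in the classifier's score; large drops mark nodes that
    are important for the classification."""
    base = score_fn(A, X)
    drops = []
    for i in range(A.shape[0]):
        A2, X2 = A.copy(), X.copy()
        A2[i, :] = A2[:, i] = 0    # remove the node's edges
        X2[i] = 0                  # remove the node's features
        drops.append(base - score_fn(A2, X2))
    return np.array(drops)

rng = np.random.default_rng(9)
A = rng.random((6, 6)); A = (A + A.T) / 2   # toy brain network
X = rng.normal(size=(6, 3))                 # hand-crafted node features
w = rng.normal(size=3)
score = lambda A, X: float((A @ X @ w).sum())  # toy "trained classifier"
imp = node_importance(score, A, X)
print(imp.shape)  # (6,)
```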

Another work on brain connectivity is:

Graph Embedding Using Infomax for ASD Classification and Brain Functional Difference Detection

This work incorporates the recently proposed Deep Graph Infomax method to strengthen the embedding learned after the convolution layers.

In addition to using the graph constructed from the data for graph classification, there is another branch for learning better node embeddings. In this branch, we construct corrupted ("fake") graphs, then feed representations of the true graph and the fake graphs into a discriminator that must tell whether each representation comes from the true graph or a fake one.
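A minimal sketch of a Deep-Graph-Infomax-style bilinear discriminator and its binary-cross-entropy objective; the readout, the corruption, and all shapes are illustrative stand-ins, not the paper's exact architecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dgi_loss(H_real, H_fake, W):
    """DGI-style objective: a bilinear discriminator scores node
    embeddings against a graph-level summary; nodes of the true graph
    should score high, nodes of the corrupted graph low."""
    s = np.tanh(H_real.mean(axis=0))   # readout: graph summary vector
    pos = sigmoid(H_real @ W @ s)      # scores for true-graph nodes
    neg = sigmoid(H_fake @ W @ s)      # scores for fake-graph nodes
    eps = 1e-9
    return -(np.log(pos + eps).mean() + np.log(1 - neg + eps).mean())

rng = np.random.default_rng(10)
H_real = rng.normal(size=(8, 16))      # toy node embeddings, true graph
H_fake = rng.normal(size=(8, 16))      # embeddings of a corrupted graph
                                       # (random here; DGI shuffles features)
W = rng.normal(size=(16, 16))          # bilinear discriminator weights
print(dgi_loss(H_real, H_fake, W) > 0)  # True
```

Minimizing this loss alongside the classification loss pushes the encoder to produce embeddings that are recognizably "of the true graph", which is what sharpens the separability shown in the next figure of the talk.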

Here are the visualized embeddings of 24 of the 148 brain regions. Patients are shown in red and healthy controls in green. After adding the graph Infomax loss, some brain regions become more linearly separable between patients and controls.

Among the 148 brain regions, we found 31 relatively linearly separable ones, marked in red in the figure above.


Medical images contain graph structures, so GNNs can be used for all of the tasks described above. After reading these papers, my takeaway is that constructing the graph structure from medical images is crucial: different construction methods have a great impact on the experimental results. Another key question is how to design an appropriate GNN for the specific task.
