Multimodal transformer graph convolution attention isomorphism network (MTGCAIN): a novel deep network for detection of insomnia disorder

Yulong Wang; Yande Ren; Yuzhen Bi; Feng Zhao; Xingzhen Bai; Liangzhou Wei; Wanting Liu; Hancheng Ma; Peirui Bai

doi:10.21037/qims-23-1594

Original Article


Yulong Wang¹#, Yande Ren²#, Yuzhen Bi², Feng Zhao³, Xingzhen Bai⁴, Liangzhou Wei⁵, Wanting Liu⁵, Hancheng Ma², Peirui Bai¹

¹College of Electronic and Information Engineering, Shandong University of Science and Technology, Qingdao, China; ²Department of Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China; ³School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China; ⁴College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao, China; ⁵Department of Gastroenterology, The Affiliated Hospital of Qingdao University, Qingdao, China

Contributions: (I) Conception and design: Y Wang, Y Ren; (II) Administrative support: Y Ren, P Bai; (III) Provision of study materials or patients: Y Bi, L Wei, W Liu, H Ma; (IV) Collection and assembly of data: Y Bi, L Wei, W Liu, H Ma; (V) Data analysis and interpretation: Y Wang, F Zhao, X Bai; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work.

Correspondence to: Peirui Bai, PhD. College of Electronic and Information Engineering, Shandong University of Science and Technology, 579 Qianwangang Road, Qingdao 266590, China. Email: bprbjd@163.com.

Background: In clinical practice, the subjectivity of diagnosing insomnia disorder (ID) often leads to misdiagnosis or missed diagnosis, as the symptoms of ID may overlap with those of other health problems.

Methods: A novel deep network, the multimodal transformer graph convolution attention isomorphism network (MTGCAIN), is proposed in this study. In this network, graph convolution attention (GCA) is first employed to extract the graph features of brain connectivity and achieve good spatial interpretability. Second, the MTGCAIN comprehensively utilizes multiple brain network atlases and a multimodal transformer (MT) to facilitate coded information exchange between the atlases. In this way, the MTGCAIN can be used to identify biomarkers more effectively and arrive at accurate diagnoses.

Results: The experimental results demonstrated that a more accurate and objective diagnosis of ID can be achieved using the MTGCAIN. According to fivefold cross-validation, the accuracy reached 81.29% and the area under the receiver operating characteristic curve (AUC) reached 0.8760. A total of nine brain regions were detected as abnormal, namely the right supplementary motor area (SMA.R), right temporal pole: superior temporal gyrus (TPOsup.R), left temporal pole: superior temporal gyrus (TPOsup.L), right superior frontal gyrus, dorsolateral (SFGdor.R), right middle temporal gyrus (MTG.R), left middle temporal gyrus (MTG.L), right inferior temporal gyrus (ITG.R), right median cingulate and paracingulate gyri (DCG.R), and left median cingulate and paracingulate gyri (DCG.L).

Conclusions: The brain regions in the default mode network (DMN) of patients with ID show significant impairment, accounting for four of the nine abnormal regions. In addition, the functional connectivity (FC) between the right middle occipital gyrus and inferior temporal gyrus (ITG) is significantly correlated with comorbid anxiety (P=0.008) and depression (P=0.005) among patients with ID.

Keywords: Insomnia disorder (ID); functional connectivity (FC); graph neural networks (GNNs); transformer

Submitted Nov 24, 2023. Accepted for publication Mar 06, 2024. Published online Apr 07, 2024.


Introduction

Insomnia disorder (ID) is becoming a serious health problem due to the accelerating pace of life and increasing work pressure (1). The main symptom of ID is a persistent disturbance in sleep quality or sleep duration that directly affects the individual’s quality of life and social function. According to the criteria of the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), a sleep problem that occurs at least three times a week for at least 3 months can be considered ID (2,3). However, arriving at an accurate and objective diagnosis of ID in clinical practice remains challenging (4), and this is compounded by the presence of comorbid anxiety and depression (5,6). Therefore, there is a critical need to develop an accurate diagnostic method for ID, identify reliable biomarkers, and more fully clarify its pathogenesis. Functional connectivity (FC) is commonly employed to characterize cerebral diseases, including brain impairments in patients with ID and other psychiatric disorders (7-9). In this framework, the brain network is conceived of as a graph composed of nodes and edges, where nodes represent brain regions and edges represent the correlation between different regions (10). Through the exploration of the interactions between different brain regions and the analysis of FC, a better understanding of brain mechanisms can be achieved (11).

Traditionally, the detection of ID is mainly based on electroencephalography (EEG) (12). For example, Almuhammadi et al. (13) used a support vector machine (SVM) to detect obstructive sleep apnea (OSA) in a publicly available EEG dataset of 70 recordings, achieving a classification accuracy of 97.14%. Shahin et al. (14) employed SVM to detect ID on a self-collected EEG dataset of 115 participants and achieved an F1-score, sensitivity, and specificity of 0.88, 84%, and 91%, respectively. Qu et al. (15) used a convolutional neural network (CNN) and a recurrent neural network (RNN) to detect ID on a self-collected EEG dataset and achieved an insomnia detection rate of 90.9%. Kusmakar et al. (16) collected nocturnal actigraphy signals and employed random forest and SVM to detect chronic insomnia, reporting a classification accuracy of 80% in distinguishing individuals with insomnia from their healthy bed partners. However, EEG acquisition is relatively time-consuming, the signals are susceptible to interference from the scalp and the skull, and it is relatively difficult to locate the internal sources of brain activity.

Resting-state functional magnetic resonance imaging (rs-fMRI), a noninvasive technique, provides an alternative means of investigating brain activity (17). Through the extraction and analysis of the blood oxygen level-dependent (BOLD) signal, fMRI can accurately localize activated brain regions and visualize brain activity patterns and FC across the entire brain (18-20). For instance, Lee et al. (21) achieved 80% accuracy in detecting ID using SVM by analyzing self-collected task fMRI data of 40 participants. The bilateral inferior frontal gyrus, right calcarine cortex, right lingual gyrus, left inferior occipital gyrus, and left inferior temporal gyrus (ITG) were identified as brain regions associated with insomnia. Shahid et al. (22) employed a CNN to segment the pharynx of patients with OSA by analyzing self-collected MRI data of 50 participants, and the Jaccard coefficient of the pharynx segmentation was approximately 86%. Ma et al. (23) employed the multivariate relevance vector regression method to detect short-term/acute and chronic subtypes by analyzing self-collected MRI data of 73 participants. They found that FC predicted sleep quality in both short-term/acute and chronic insomnia and that FC patterns changed during the transition from short-term/acute to chronic insomnia. These studies demonstrate the effectiveness of rs-fMRI in detecting ID; thus, its application in revealing the altered brain activity in patients with ID may yield valuable insights. As an extension of FC, dynamic FC (dFC) serves as a better descriptor of the relationship between BOLD signals and FC (24,25). dFC-based analysis can be used to conveniently identify potential biomarkers and gain deeper insights (26) but may be constrained by certain limitations. First, the complexity of causative factors makes it difficult to detect ID using machine learning, especially when sample sizes are small. Second, rational network architectures and feature interactions need to be considered to better cope with the complex associations between brain regions. Third, it is challenging to track and identify the brain activity in patients with ID, especially when there are comorbidities with other psychiatric disorders.

To address these problems, we introduce the graph neural network (GNN) to extract the interactions among different brain regions. The GNN is a powerful tool for handling graph-structured data and is suitable for capturing information propagation between graph nodes (27-29). Furthermore, we leverage the global parallel computation of the multimodal transformer (MT) (30) to improve the information exchange between dFC and FC features. The MT performs well in finding global dependencies within multimodal input sequences or multimodal data. Embedding multimodal attention (MA) blocks into the GNN makes it possible to extract the features of the brain’s functional network and reveal the associations between brain activity and different diseases. Our study has three main contributions. First, a novel deep network named the multimodal transformer graph convolution attention isomorphism network (MTGCAIN) is proposed to detect and diagnose ID. To the best of our knowledge, the combined advantages of GNNs and transformers have not yet been applied to the detection of ID. Second, we propose a more reliable way of distinguishing ID symptoms by fusing the related brain regions and FC with clinical indicators. This approach captures the intricate relationship between brain activity and clinical characteristics, providing a comprehensive understanding of the brain activity underlying ID. Third, we employ the proposed method to evaluate the psychological influences and neural states of ID and attempt to identify the psychological and neural comorbidities associated with ID. The experimental results indicate that our approach may provide a new means of unravelling the associations between brain network activity and ID and may open up novel perspectives and methods for investigating and treating ID.

The remainder of the paper is organized as follows. The details of the methods are presented in the methods section, including materials and preprocessing, dynamic graph construction, MTGCAIN construction and experimental setup. The performance evaluation and extraction of abnormal brain regions and FC are presented in the results section. The analysis and discussion of the mining of abnormal regions and the relevant FC of ID are presented in the discussion section. In the last section, concluding remarks are drawn to summarize the findings and implications regarding the relationship between brain functions and ID.

Methods

Data acquisition and preprocessing

In this study, we acquired rs-fMRI images using the Signa HDX 3.0T MRI device (GE HealthCare) with the permission of the Affiliated Hospital of Qingdao University and recruited 62 volunteers, comprising 32 healthy controls (HCs) and 30 patients with ID. The diagnostic criteria for ID included meeting the definition of DSM-5 and a Pittsburgh Sleep Quality Index (PSQI) score greater than 7 (2,31). Patients with ID were excluded if they had (I) a history of serious neurological or medical illness; (II) an occupation requiring shift work; (III) any contraindication to MRI; (IV) a history of medication-based treatment for ID; (V) a history of alcohol abuse, drug abuse, or smoking; (VI) abnormal signals on conventional MRI; or (VII) a condition of pregnancy, lactation, or menstruation. The demographics of the participants from both groups are shown in Table 1. During the scanning procedure, the volunteers were required to keep their eyes closed and stay awake and relaxed, without engaging in any specific thinking. We used a single-shot gradient-echo echo-planar imaging sequence. The scan parameters were as follows: repetition time (TR) =2,000 ms, echo time (TE) =30 ms, flip angle (FA) =90°, 35 slices, thickness =4.0 mm, gap =0.6 mm, matrix size =64×64, field of view (FOV) =22 cm × 22 cm, and 240 acquired time points (about 8 minutes).

Table 1

Demographic data of the two study groups

Variable	HCs	Patients with ID	P value
Age (years)	32.28±8.90	35.93±12.68	0.19
Sex (male/female)	12/20	13/17	0.62
Education (years)	12.28±2.71	13.05±3.22	0.32
PSQI	3.44±1.72	11.50±3.13	<0.001

Data are expressed as the mean ± standard deviation. HCs, healthy controls; ID, insomnia disorder; PSQI, Pittsburgh Sleep Quality Index.

The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). This study was approved by the Institutional Review Board at the Affiliated Hospital of Qingdao University (No. QYFY WZLL 28505), and informed consent was obtained from all the patients. The MRI volumetric data were preprocessed using the RESTplus v. 1.2 MATLAB toolbox (32). The preprocessing included the following set of operations: (I) converting the image format from Digital Imaging and Communications in Medicine (DICOM) to Neuroimaging Informatics Technology Initiative (NIfTI); (II) removing the first 10 time points of the original data; (III) performing slice-timing correction, head-motion correction, and origin reorientation, with participants with a head movement greater than 2.5 mm or 2.5° being excluded; (IV) spatial normalization, with the voxel size being rescaled to 3×3×3 mm³; (V) regression of several interfering covariates such as the cerebrospinal fluid signal, white matter signal, and six head motion parameters (spatial smoothing was not adopted here to avoid increasing local spatial correlation) (33); (VI) eliminating linear trends of the data and conducting band-pass filtering (0.01–0.08 Hz); and (VII) extracting the BOLD signals using three brain atlases, including Automated Anatomical Labeling (AAL; 116 parcels), Craddock-200 (CC200; 200 parcels), and Schaefer (400 parcels), to sequentially refine the delineation of brain regions and guide the experiments. To augment the dataset, the BOLD signals of each participant were divided into five equal, nonoverlapping segments (34). Thus, each participant yielded five samples. It is worth noting that the five samples of a participant were always assigned together to either the training set or the test set and were never split between the two. As a result, a total of 310 input data samples were obtained, comprising 160 samples from HCs and 150 samples from patients with ID.
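The augmentation step can be sketched in a few lines. This is a minimal illustration, not the released implementation: it assumes the split runs along the time axis of the 230 time points that remain after the first 10 are discarded, and the function names and toy signals are hypothetical.

```python
def split_into_segments(signal, n_segments=5):
    """Split a sequence into equal, nonoverlapping, contiguous segments."""
    seg_len = len(signal) // n_segments  # a remainder at the end, if any, is dropped
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]


def augment(participants, n_segments=5):
    """Return (participant_id, label, segment) samples.

    The participant id is kept with every sample so that a cross-validation
    split can be made per participant, never per segment.
    """
    samples = []
    for pid, (label, signal) in participants.items():
        for segment in split_into_segments(signal, n_segments):
            samples.append((pid, label, segment))
    return samples


# 240 acquired time points minus the 10 discarded ones (toy signals, one region only).
n_points = 240 - 10
toy = {f"HC{i}": ("HC", list(range(n_points))) for i in range(32)}
toy.update({f"ID{i}": ("ID", list(range(n_points))) for i in range(30)})
samples = augment(toy)
```

Under these assumptions each segment holds 46 time points, and the 62 participants yield the 310 samples reported above.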

Construction of the dynamic graph

The construction process of the dynamic graph is illustrated in Figure 1. Assume a BOLD signal has M points. A set of FC matrices is formed by traversing the BOLD signals with a sliding window. The total number of windows is determined as follows:

$K = \frac{M - \tau}{s} + 1$ [1]

where τ is the window length and s is the sliding stride.

Figure 1 Workflow for construction of the dynamic graph. The BOLD signal is extracted using multiple atlases. For each atlas, the node features H of the dynamic graph are constructed from the BOLD signals by timestamp encoding and position encoding. The adjacency matrix C is obtained from the Pearson correlation between BOLD signals. The top G% of the strongest FC in C is selected to obtain the binarized adjacency matrix A. BOLD, blood oxygen level-dependent; rs-fMRI, resting-state functional magnetic resonance imaging.

The Pearson correlation between different BOLD signal pairs in each window can be calculated as follows:

$C_{ij}(k) = \mathrm{corr}(B_{i}(k), B_{j}(k)) = \frac{\mathrm{cov}(B_{i}(k), B_{j}(k))}{\sqrt{\mathrm{var}(B_{i}(k)) \cdot \mathrm{var}(B_{j}(k))}}$ [2]

where C denotes the FC matrix; i and j denote the indices of the BOLD signals, 1≤i, j≤N; N represents the total number of brain regions; and B(k) denotes the BOLD signal within the k-th sliding window, 1≤k≤K. Subsequently, the entries of $C(k) \in \mathbb{R}^{N \times N}$ that rank within the top G% of FC strengths are set to 1, while the remaining entries are set to 0. In this way, a binary adjacency matrix $A(k) \in \{0, 1\}^{N \times N}$ is obtained to represent the dynamic graph.
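Equations [1] and [2] and the binarization step can be traced with a short, dependency-free sketch. Several details are assumptions made only for illustration: integer division in Eq. [1], exclusion of the diagonal (self-correlation) from the ranking, ranking by the signed correlation value, and all function names.

```python
import math


def n_windows(M, tau, s):
    """Eq. [1]: number of sliding windows (integer division assumed)."""
    return (M - tau) // s + 1


def pearson(x, y):
    """Eq. [2]: Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def binarize_top(C, G):
    """Set the strongest fraction G of off-diagonal entries of C to 1, the rest to 0."""
    N = len(C)
    values = sorted((C[i][j] for i in range(N) for j in range(N) if i != j), reverse=True)
    keep = max(1, round(G * len(values)))
    threshold = values[keep - 1]  # ties at the threshold are all kept
    return [[1 if i != j and C[i][j] >= threshold else 0 for j in range(N)] for i in range(N)]


def dynamic_graphs(bold, tau, s, G):
    """bold: N region signals of length M. Returns K binary adjacency matrices A(k)."""
    N, M = len(bold), len(bold[0])
    graphs = []
    for k in range(n_windows(M, tau, s)):
        window = [signal[k * s:k * s + tau] for signal in bold]
        C = [[pearson(window[i], window[j]) for j in range(N)] for i in range(N)]
        graphs.append(binarize_top(C, G))
    return graphs
```

For a signal of M=46 points with τ=20 and s=1, `n_windows` gives K=27 dynamic graphs per sample.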

For the i-th node in the k-th dynamic graph, the feature is encoded with timestamps using long short-term memory (LSTM) over all the time points in the first k windows. The encoded feature is then concatenated with a one-hot positional encoding and fed into a linear layer to produce the feature vector $h_{i}(k) \in \mathbb{R}^{d}$ of the i-th node (d represents the feature dimension). The operations of forming $h_{i}(k)$ can be expressed as follows:

$h_{i}(k) = \mathrm{concat}(\mathrm{LSTM}(B_{i}), \mathrm{onehot}(B_{i})) W$ [3]

where $W$ denotes the weights of the linear layer. Thus, the node feature matrix $H(k)$ of the k-th dynamic graph is obtained by concatenating the feature vector of each node; that is, $H(k) = [h_{1}(k), \dots, h_{i}(k), \dots, h_{N}(k)] \in \mathbb{R}^{d \times N}$.

Construction of the MTGCAIN

The pipeline of MTGCAIN is illustrated in Figure 2. The MTGCAIN has L parallel channels in total. Each channel contains four blocks: the graph convolution block, the readout block, the encoder block, and the output block. In each channel, the dynamic graph is first input into the graph convolution block. The features are then extracted by the readout block. In the encoder block, the information interaction among multiple atlases is carried out using the dynamic graph. Finally, the output classification results expressed by predicted probabilities are generated through a linear layer.

Figure 2 The pipeline of the MTGCAIN with L parallel channels. Each channel consists of a graph convolution block, readout block, encoder block, and output block. The output of the graph convolution block is used as the input of the next channel. MTGCAIN, multimodal transformer graph convolution attention isomorphism network; MLP, multilayer perceptron; MA, multimodal attention; GCA, graph convolution attention; GC, graph convolution.

The graph convolution block

In GNN, the convolution operation can be expressed iteratively as follows:

$h_{i}^{(l)}(k) = \mathrm{MLP}^{(l)}\left((1 + \epsilon^{(l)}) \cdot h_{i}^{(l-1)}(k) + \sum_{j=1, j \in A_{i}}^{N} A_{i,j}(k) \cdot h_{j}^{(l-1)}(k)\right)$ [4]

where MLP represents a multilayer perceptron containing two linear layers, and ϵ denotes a learnable weight parameter. In the l-th channel, the node feature $h_{i}^{(l)}(k) \in \mathbb{R}^{d}$ is obtained by aggregating the features of the node and its neighboring nodes in the (l-1)-th channel. Hence, the node feature matrix of the k-th dynamic graph in the l-th channel is represented as follows: $H^{(l)}(k) = [h_{1}^{(l)}(k), \dots, h_{i}^{(l)}(k), \dots, h_{N}^{(l)}(k)] \in \mathbb{R}^{d \times N}$. Consequently, Eq. [4] can be rewritten in the form of matrix multiplication as follows:

$H^{(l)}(k) = \mathrm{ReLU}(\mathrm{BN}(\mathrm{ReLU}(\mathrm{BN}(((1 + \epsilon^{(l)}) \cdot I + A(k)) H^{(l-1)}(k) W_{h,1}^{(l)}(k))) W_{h,2}^{(l)}(k)))$ [5]

where $I$ denotes an identity matrix, $W$ represents the network weights of the MLP, and BN denotes a batch normalization operation.
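The step from the node-wise form to the matrix form can be verified numerically. The sketch below covers only the aggregation that feeds the MLP in Eq. [4], omitting the MLP and batch normalization, and it stores one node per row (an N × d layout chosen here so that the adjacency matrix multiplies from the left). Function names are hypothetical.

```python
def gin_aggregate_nodewise(H, A, eps):
    """Argument of the MLP in Eq. [4]: (1 + eps) * h_i + sum_j A[i][j] * h_j."""
    N, d = len(H), len(H[0])
    out = []
    for i in range(N):
        agg = [(1 + eps) * H[i][c] for c in range(d)]
        for j in range(N):
            if A[i][j]:
                for c in range(d):
                    agg[c] += A[i][j] * H[j][c]
        out.append(agg)
    return out


def gin_aggregate_matrix(H, A, eps):
    """The same quantity as a single product ((1 + eps) * I + A) @ H."""
    N, d = len(H), len(H[0])
    M = [[A[i][j] + ((1 + eps) if i == j else 0) for j in range(N)] for i in range(N)]
    return [[sum(M[i][j] * H[j][c] for j in range(N)) for c in range(d)] for i in range(N)]


# Toy graph: a path 0 - 1 - 2 with two features per node.
H = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
nodewise = gin_aggregate_nodewise(H, A, eps=0.5)
matrix = gin_aggregate_matrix(H, A, eps=0.5)
```

Both routes give the same aggregated features, which is why the per-node update can be run as one batched matrix product per dynamic graph.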

The readout block

The readout block includes graph convolution attention (GCA) and sigmoid nonlinear mapping. The schematic diagram of GCA is illustrated in the middle right portion of Figure 2. The role of GCA is to capture the global characteristics of node features and to emphasize local key features and edges. The global information and attention scores are obtained from the neighborhood node features and graph edges (27). Specifically, the graph convolution is applied to $H^{(l)}(k)$ to compress the d × N feature matrix into a 1 × N feature vector. The vector is used as the attention score vector $v^{(l)}(k) \in \mathbb{R}^{N}$ in the dynamic graph. The procedure can be expressed as follows:

$v^{(l)}(k) = \mathrm{ReLU}(w^{(l)}(k) (D^{\frac{1}{2}}(k) (I + A(k)) D^{-\frac{1}{2}}(k) H^{(l)}(k)))$ [6]

where $D$ represents the degree matrix of $I + A$, and $w \in \mathbb{R}^{1 \times d}$ represents the weights of the linear layer. Subsequently, the nonlinear mapping of $v^{(l)}(k)$ can be implemented to obtain the readout vector $x^{(l)}(k) \in \mathbb{R}^{d}$ as follows:

$x^{(l)}(k) = H^{(l)}(k) \times \mathrm{sigmoid}(v^{(l)}(k))$ [7]

The feature matrix with a size of d×K is then obtained by concatenating the readout vectors of all K dynamic graphs as follows:

$X^{(l)} = [x^{(l)}(1), \dots, x^{(l)}(k), \dots, x^{(l)}(K)]$ [8]
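The gating in Eq. [7] and the stacking in Eq. [8] amount to a sigmoid-weighted sum over nodes followed by a column-wise concatenation. The following sketch assumes that the attention scores v are already available and treats H as a d × N nested list; the helper names are illustrative only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def readout(H, v):
    """Eq. [7]: x = H @ sigmoid(v), with H of shape d x N and v of length N."""
    gates = [sigmoid(score) for score in v]
    return [sum(h * g for h, g in zip(row, gates)) for row in H]


def stack_windows(readouts):
    """Eq. [8]: place the K readout vectors side by side to form a d x K matrix."""
    d = len(readouts[0])
    return [[x[c] for x in readouts] for c in range(d)]


# Two features, three nodes, all attention scores at zero (every gate equals 0.5).
x = readout([[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]], [0.0, 0.0, 0.0])
X = stack_windows([x, [1.0, 2.0]])
```

Because the sigmoid keeps every gate between 0 and 1, nodes with high attention scores dominate the readout vector while low-scoring nodes are suppressed rather than removed.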

The encoder block

The traditional transformer only facilitates information propagation within a single modality using the self-attention mechanism. However, in this study, we aimed to realize the information interaction between different atlases. Therefore, we introduced MA in the encoder block to share and aggregate information across different modalities.

Assuming there are m modalities, the query, key, and value matrices of the i-th modality can be expressed as follows:

$Q_{i}^{(l)} = X_{i}^{(l)} W_{i,query}^{(l)}$ [9]

$K_{i}^{(l)} = X_{i}^{(l)} W_{i,key}^{(l)}$ [10]

$V_{i}^{(l)} = X_{i}^{(l)} W_{i,value}^{(l)}$ [11]

Subsequently, $X_{i,attention}^{(l)}$ and $X_{i,hidden}^{(l)}$ for this modality can be derived as follows:

$X_{i,attention}^{(l)} = \mathrm{LayerNorm}\left(X_{i}^{(l)} + \frac{1}{m} \sum_{j=1}^{m} \mathrm{softmax}\left(\frac{Q_{j}^{(l)} K_{j}^{(l)T}}{\sqrt{d}}\right) V_{j}^{(l)}\right)$ [12]

$X_{i,hidden}^{(l)} = \mathrm{LayerNorm}(X_{i,attention}^{(l)} + \mathrm{ReLU}(X_{i,attention}^{(l)} W_{i,1}^{(l)} W_{i,2}^{(l)}))$ [13]

Therefore, the feature matrix $X_{hidden}^{(l)} \in \mathbb{R}^{d' \times K}$ (where d'=d×m) for all modalities can be obtained from the following formula:

$X_{hidden}^{(l)} = \mathrm{concat}(\{X_{i,hidden}^{(l)} \mid i \in \{1, \dots, m\}\})$ [14]

The workflow of MA is illustrated in the top right corner of Figure 2. The attention mechanism of MA enables information exchange between the graph features of different atlases. Therefore, this approach can deal with the features of multiple atlases more directly, thus realizing multimodal feature fusion in the encoder block. More accurate and comprehensive multimodal analysis and modeling can be achieved by introducing the MA.
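The averaged attention term inside Eq. [12] can be illustrated without a deep learning framework. The sketch below follows the equation as printed and computes only the term (1/m) Σ softmax(QKᵀ/√d)V; the residual connection, LayerNorm, and the learned projections of Eqs. [9]-[11] are omitted. Each matrix is stored with one row per sliding window, which is a layout assumption, and all function names are hypothetical.

```python
import math


def matmul(A, B):
    """Plain matrix product of two nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def softmax_rows(S):
    """Row-wise softmax with the usual max subtraction for numerical stability."""
    out = []
    for row in S:
        peak = max(row)
        exps = [math.exp(x - peak) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V for one modality."""
    d = len(Q[0])
    K_t = [list(col) for col in zip(*K)]
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(Q, K_t)]
    return matmul(softmax_rows(scores), V)


def multimodal_term(Qs, Ks, Vs):
    """(1/m) * sum_j attention(Q_j, K_j, V_j): the term shared across modalities in Eq. [12]."""
    m = len(Qs)
    outs = [attention(Q, K, V) for Q, K, V in zip(Qs, Ks, Vs)]
    rows, cols = len(outs[0]), len(outs[0][0])
    return [[sum(o[r][c] for o in outs) / m for c in range(cols)] for r in range(rows)]


# Two modalities, two windows, two features; zero queries give uniform attention.
zeros = [[0.0, 0.0], [0.0, 0.0]]
fused = multimodal_term([zeros, zeros], [zeros, zeros], [[[1.0, 3.0], [3.0, 5.0]], zeros])
```

Averaging over the m atlases is what lets the features of one atlas enter the update of every other atlas before the residual connection and normalization are applied.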

The output block

As mentioned previously, each channel in the graph convolution yields a feature matrix $X_{hidden}^{(l)}$ for each subject. The feature vector of each subject is obtained by averaging all the dynamic graph vectors in the respective feature matrix. Then, the feature vectors pass through a linear layer to obtain n-class probabilities as follows:

$\hat{y}^{(l)} = \left(\frac{1}{K} \sum_{k=1}^{K} X_{hidden}^{(l)}(k)\right) W_{y}^{(l)}$ [15]

where $\hat{y}^{(l)} = \{\hat{y}_{1}^{(l)}, \dots, \hat{y}_{n}^{(l)}\} \in [0, 1]$. To alleviate oversmoothing and optimally leverage the multiscale features, the predicted probabilities of all channels are averaged to obtain the final prediction probability as follows:

$\hat{y} = \frac{1}{L} \sum_{l = 1}^{L} {\hat{y}}^{(l)}$ [16]

Loss function design

In this study, we designed a hybrid loss function for training the MTGCAIN. The hybrid loss function is expressed as follows:

$L_{total} = L_{ce} + \lambda_{1} L_{ortho} + \lambda_{2} L_{unit}$ [17]

where λ₁ and λ₂ are weight coefficients.

$L_{ce}$ is the cross-entropy loss function, which can be expressed as follows:

$L_{ce} = -\frac{1}{P} \sum_{p=1}^{P} \sum_{q=1}^{Q} y_{p,q} \log(\hat{y}_{p,q})$ [18]

where P is the number of instances, Q is the number of classes, y is the ground truth, and $\hat{y}$ is the prediction output of the model.
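As a worked illustration of Eqs. [17] and [18], the sketch below evaluates the cross-entropy term for one-hot labels and combines it with the two regularizers. The regularizer values are passed in as plain numbers, the default weights are the λ₁=λ₂=0.00001 reported in the implementation details, and the function names are hypothetical.

```python
import math


def cross_entropy(y_true, y_prob):
    """Eq. [18]: mean over P instances of -sum_q y[p][q] * log(y_hat[p][q])."""
    P = len(y_true)
    total = 0.0
    for labels, probs in zip(y_true, y_prob):
        # Terms with y = 0 contribute nothing, so they are skipped to avoid log(0).
        total += sum(y * math.log(p) for y, p in zip(labels, probs) if y > 0)
    return -total / P


def total_loss(l_ce, l_ortho, l_unit, lambda_1=1e-5, lambda_2=1e-5):
    """Eq. [17]: weighted sum of the three loss terms."""
    return l_ce + lambda_1 * l_ortho + lambda_2 * l_unit


# Two instances, two classes (HC vs. ID), maximally uncertain predictions.
l_ce = cross_entropy([[1, 0], [0, 1]], [[0.5, 0.5], [0.5, 0.5]])
```

With predictions of 0.5 for both classes, the cross-entropy equals log 2 (about 0.693), which is the value a two-class model starts from before it learns anything.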

$L_{ortho}$ is the orthogonality constraint loss function, which can be expressed as follows:

$L_{ortho} = \sum_{l=1}^{L} \sum_{k=1}^{K} \left\| \frac{1}{m} \cdot H^{(l)T}(k) H^{(l)}(k) - I \right\|_{2}$ [19]

where $m = \max(H^{(l)T}(k) H^{(l)}(k))$. The term 1/m ensures orthogonality between feature vectors in different channels (35). The purpose of the orthogonality constraint is to make the feature matrix $H$ a full rank matrix so that the features represented by the feature vector $h$ are richer. This can alleviate collinearity issues among features to enhance feature independence.

$L_{unit}$ is the unit constraint loss function, which can be expressed as follows:

$L_{unit} = \sum_{l=1}^{L} (\| w^{(l)} \|_{2} - 1)^{2}$ [20]

where w is a feature projection vector. The purpose of the unit constraint is to spread the attention scores as much as possible and improve the model’s focus on different features.

Implementation details

The proposed MTGCAIN was validated on the self-produced dataset. The preprocessing was carried out using the RESTplus v. 1.2 MATLAB toolbox. The training and testing were implemented using PyTorch in the Python environment, supported by an Nvidia RTX A6000 with 48 GB of GPU memory.

In this study, we set the window length and sliding stride based on a priori knowledge (24); that is, τ=20 and s=1. The model was optimally parameterized on the ID dataset as follows: channel number L=4, feature dimension d=128, G=30%, and weight coefficients in the hybrid loss function λ₁=λ₂=0.00001. The model was trained for 30 epochs with an initial learning rate of 0.0005 and a variable learning rate schedule: during the first 20% of the training epochs, the learning rate increased gradually to 0.001, after which it decreased gradually to 5.0×10⁻⁷. Fivefold cross-validation was employed to ensure the stability of the results. Four popular quantitative metrics, namely accuracy, precision, recall, and area under the receiver operating characteristic (ROC) curve (AUC), were used to evaluate the performance of the MTGCAIN. Comparative experiments with SVM (21), brain graph neural network (BrainGNN) (28), graph convolutional network (GCN) (36), spatio-temporal attention graph isomorphism network (STAGIN) (35), and multi-granular, multi-atlas spatio-temporal attention graph isomorphism network (IMAGIN) (37) were conducted to demonstrate the superiority of the MTGCAIN over existing techniques. The experimental results are shown in the model comparison section. The code is available at https://github.com/YuloongWang/MTGCAIN.
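The learning rate schedule described above can be sketched as a warm-up followed by a decay. The text does not state the shape of the two phases, so the linear ramps below are an assumption, as are the per-epoch granularity and the function name; the three rate values and the 20% warm-up fraction are those reported above.

```python
def lr_at(step, total_steps, lr_init=5e-4, lr_max=1e-3, lr_final=5e-7, warmup_frac=0.2):
    """Learning rate at a given step: linear warm-up, then linear decay (assumed shapes)."""
    warmup_steps = int(warmup_frac * total_steps)
    if step < warmup_steps:
        frac = step / warmup_steps
        return lr_init + frac * (lr_max - lr_init)
    # Decay phase: reaches lr_final at the last step.
    frac = (step - warmup_steps) / max(1, total_steps - 1 - warmup_steps)
    return lr_max + frac * (lr_final - lr_max)


# One value per epoch for a 30-epoch run.
schedule = [lr_at(epoch, 30) for epoch in range(30)]
```

With 30 epochs the warm-up covers the first 6 epochs, so the peak rate of 0.001 is reached at epoch index 6 under these assumptions.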

Results

Model comparison

As listed in Table 2, five existing methods of brain network analysis were used for comparison. The MTGCAIN achieved the best performance in all four metrics. The mean accuracy, precision, recall, and AUC were 81.29%, 79.44%, 84.00%, and 0.8760, respectively. Among these, the largest improvement was observed for recall, which reached 84.00%. However, all the models had relatively low precision, and that of the MTGCAIN was only 79.44%.

Table 2

The four quantitative metrics outcomes of the comparative experiments

Model	Accuracy (%)	Precision (%)	Recall (%)	AUC
SVM (21)	60.52±7.13	64.35±6.15	71.26±8.84	0.6108±0.1026
BrainGNN (28)	69.35±3.68	70.18±3.23	70.63±5.80	0.6931±0.0364
GCN (36)	71.94±1.94	70.82±4.97	73.33±8.17	0.7586±0.0343
STAGIN (35)	72.90±4.72	70.01±6.77	78.67±4.00	0.8029±0.0313
IMAGIN (37)	76.45±4.63	74.58±5.80	78.67±8.59	0.8514±0.0305
MTGCAIN	81.29±3.32	79.44±5.98	84.00±5.73	0.8760±0.0175

Data are expressed as the mean ± standard deviation. AUC, area under the receiver operating characteristic curve; SVM, support vector machine; BrainGNN, brain graph neural network; GCN, graph convolutional network; STAGIN, spatio-temporal attention graph isomorphism network; IMAGIN, multi-granular, multi-atlas spatio-temporal attention graph isomorphism network; MTGCAIN, multimodal transformer graph convolution attention isomorphism network.

Loss function design is a critical factor in maintaining stability and convergence during the training of a deep network. In this study, we used a hybrid loss function, defined in Eq. [17]. The results of the ablation study are shown here to demonstrate the necessity of using the hybrid loss function. As listed in Table 3, compared with the use of $L_{ce}$ alone, the use of $L_{ce}$ and $L_{ortho}$ resulted in clear improvements in recall and AUC, which improved by 5.33% and 0.0271, respectively. Similarly, the use of $L_{ce}$ and $L_{unit}$ improved precision by 3.76%. The use of all three losses resulted in a 3.87% improvement in accuracy for the MTGCAIN. These results demonstrate that incorporating the orthogonality and unit constraints can enhance the overall performance.

Table 3

Effectiveness analysis of the hybrid loss functions on MTGCAIN

Loss	Accuracy (%)	Precision (%)	Recall (%)	AUC
$L_{ce}$	77.42±2.04	76.91±6.49	78.00±5.42	0.8429±0.0139
$L_{ce}$, $L_{ortho}$	79.03±3.68	76.86±6.79	83.33±7.30	0.8700±0.015
$L_{ce}$, $L_{unit}$	78.71±3.13	80.67±10.48	77.33±11.04	0.8457±0.0185
$L_{ce}$, $L_{ortho}$, $L_{unit}$	81.29±3.32	79.44±5.98	84.00±5.73	0.8760±0.0175

Data are expressed as the mean ± standard deviation. MTGCAIN, multimodal transformer graph convolution attention isomorphism network; AUC, area under the receiver operating characteristic curve.

Abnormal regions explored with the MTGCAIN

A major motivation for developing the MTGCAIN is to explore the most relevant biomarkers associated with ID. As the AAL atlas is widely used to provide accurate brain parcellation and rich structural information, we employed it here as an anatomical reference to maintain the generalizability and reproducibility of the results. The brain regions were divided into six subnetworks: the sensorimotor network (SMN), visual network (VN), execution and attention network (EAN), default mode network (DMN), subcortical nuclei (SBN) region, and cerebellum network (CEN) (38).

After the GCA scores were obtained from the MTGCAIN, two operations were conducted to accurately localize the brain lesions in patients with ID. First, the nodes within the top 10 ranks of attention scores in each channel of the MTGCAIN were selected. Second, only the nodes that appeared consistently across all channels were retained. From this, we identified nine nodes corresponding to nine brain regions for diagnosing ID. The nine explored brain regions, corresponding subnetworks, and specific functions are presented in Table 4. The distribution of the explored abnormal regions related to ID is visualized in Figure 3. The right temporal pole: superior temporal gyrus (TPOsup.R), left TPOsup (TPOsup.L), left middle temporal gyrus (MTG.L), right MTG (MTG.R), and right inferior temporal gyrus (ITG.R) were found in all three atlases. The detailed data of the results, such as attention scores, can be retrieved from GitHub (https://github.com/YuloongWang/MTGCAIN) for researchers to analyze. In the abnormal FC as characterized by the MTGCAIN section, a comprehensive investigation of the neuroimaging biomarkers of ID through the integration of clinical indicators is detailed.
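The two selection operations reduce to a per-channel ranking followed by a set intersection. The sketch below assumes that each channel has already been summarized as one attention score per node (how scores are pooled over windows and subjects is not specified here), and the function names and toy scores are hypothetical.

```python
def top_nodes(scores, top_n=10):
    """Indices of the top_n highest attention scores in one channel."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:top_n])


def consistent_nodes(channel_scores, top_n=10):
    """Nodes that rank within the top_n of every channel (set intersection)."""
    selected = [top_nodes(scores, top_n) for scores in channel_scores]
    return sorted(set.intersection(*selected))


# Toy example: two channels, four nodes, top 2 per channel.
toy_scores = [[0.9, 0.1, 0.8, 0.2], [0.7, 0.6, 0.9, 0.1]]
kept = consistent_nodes(toy_scores, top_n=2)
```

Requiring agreement across all channels is a conservative filter: a node that scores highly in only one channel is discarded, so the surviving set can be smaller than the per-channel top 10.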

Table 4

The nine most-relevant brain regions to insomnia disorder mined by MTGCAIN

Brain region	Subnetwork	Specific function	References
SMA.R	SMN	Receiving somatosensory information	(4,39,40)
TPOsup.R	SMN	Receiving somatosensory information	(41,42)
TPOsup.L	EAN	Controlling goal-directed and intellectual activities	(41,42)
SFGdor.R	DMN	Meditation and introspection	(4,5,18,39,40,42)
MTG.L	DMN	Meditation and introspection	(3,40,42-44)
MTG.R	DMN	Meditation and introspection	(3,40,42-44)
ITG.R	DMN	Meditation and introspection	(5,31,40)
DCG.L	SBN	Regulating and controlling the exchange of information	(42)
DCG.R	SBN	Regulating and controlling the exchange of information	(42)

MTGCAIN, multimodal transformer graph convolution attention isomorphism network; SMA, supplementary motor area; SMN, sensorimotor network; TPOsup, temporal pole: superior temporal gyrus; EAN, execution and attention network; SFGdor, superior frontal gyrus, dorsolateral; DMN, default mode network; MTG, middle temporal gyrus; ITG, inferior temporal gyrus; DCG, median cingulate and paracingulate gyri; R, right hemisphere; L, left hemisphere; SBN, subcortical nuclei.

Figure 3 Illustration of abnormal brain regions related to insomnia disorder under different views explored by the MTGCAIN. (A) Brain region labeling in the AAL atlas. (B) Brain region labeling in the CC200 atlas. (C) Brain region labeling in the Schaefer atlas. L, left cerebral hemisphere; R, right cerebral hemisphere; MTGCAIN, multimodal transformer graph convolution attention isomorphism network; AAL, automated anatomical labeling; CC200, Craddock-200.

Abnormal FC as characterized by the MTGCAIN

Given the explored ID-relevant brain regions, we surmised that examining the FC between these brain regions could clarify the relationship between FC and ID and help to distinguish between healthy individuals and patients with ID. We selected the effective FC according to the following criteria: (I) FC involving at least one of the explored abnormal brain regions, (II) FC whose strength ranked within the top G values, (III) FC exhibiting significant differences (P<0.05) between the HC group and the ID patient group, and (IV) FC showing significant correlations (P<0.05) with PSQI. As listed in Table 5, four FCs were selected: right supplementary motor area (SMA.R)-left parahippocampal gyrus (PHG.L), TPOsup.R-right middle frontal gyrus (MFG.R), TPOsup.R-ITG.L, and ITG.R-right middle occipital gyrus (MOG.R). The bold brain regions in the table represent the selected abnormal brain regions. The distribution of the FC is depicted in Figure 4A. The comparisons of the FC strengths between the HC group and the ID patient group are shown in Figure 4B. In addition, a significance analysis of the selected FCs with the self-rating anxiety scale (SAS) and self-rating depression scale (SDS) was conducted, the results of which are listed in the last two columns of Table 5. Meanwhile, Figure 5 shows the relationship between the FC strengths and the scale scores.
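Criteria (I) to (IV) amount to a chain of filters over candidate connections. The sketch below is one possible reading, not the authors' code: it assumes that the P values for the group difference and for the PSQI correlation have already been computed by whatever statistical tests were used, that the top-G ranking is applied after criterion (I), and that strength is ranked by its raw value. The region names, strengths, and P values in the example are placeholders, not study data.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def select_effective_fc(
    strength: Dict[Edge, float],
    p_group: Dict[Edge, float],
    p_psqi: Dict[Edge, float],
    abnormal_regions: Set[str],
    top_g: int,
    alpha: float = 0.05,
) -> List[Edge]:
    """Apply criteria (I)-(IV) in order and return the surviving connections."""
    # (I) the connection involves at least one explored abnormal region
    candidates = [
        e for e in strength
        if e[0] in abnormal_regions or e[1] in abnormal_regions
    ]
    # (II) keep the G strongest remaining connections (assumed order of steps)
    candidates.sort(key=lambda e: strength[e], reverse=True)
    candidates = candidates[:top_g]
    # (III) significant group difference and (IV) significant PSQI correlation
    return [e for e in candidates if p_group[e] < alpha and p_psqi[e] < alpha]


# Placeholder inputs with generic region names.
strength = {("A", "B"): 0.90, ("A", "C"): 0.80, ("C", "D"): 0.95,
            ("B", "D"): 0.40, ("A", "D"): 0.70}
p_group = {("A", "B"): 0.01, ("A", "C"): 0.20, ("C", "D"): 0.001,
           ("B", "D"): 0.50, ("A", "D"): 0.01}
p_psqi = {("A", "B"): 0.03, ("A", "C"): 0.01, ("C", "D"): 0.02,
          ("B", "D"): 0.60, ("A", "D"): 0.04}

effective = select_effective_fc(strength, p_group, p_psqi, {"A"}, top_g=2)
print(effective)
```

In this toy run the connection C-D is the strongest overall but is excluded by criterion (I), and A-D passes both significance tests but falls outside the top-G cutoff, which shows how the order of the filters matters.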

Table 5

Significance test of FC in the insomnia disorder group in terms of PSQI, SAS, and SDS

FC	Group (P)	PSQI (P)	SAS (P)	SDS (P)
SMA.R-PHG.L	0.009	0.02	0.31	0.11
TPOsup.R-MFG.R	0.01	0.01	0.57	0.15
TPOsup.R-ITG.L	0.01	0.01	0.09	0.05
ITG.R-MOG.R	0.01	0.04	0.008	0.005

FC, functional connectivity; PSQI, Pittsburgh Sleep Quality Index; SAS, self-rating anxiety scale; SDS, self-rating depression scale; SMA, supplementary motor area; PHG, parahippocampal gyrus; TPOsup, temporal pole: superior temporal gyrus; MFG, middle frontal gyrus; ITG, inferior temporal gyrus; MOG, middle occipital gyrus; R, right hemisphere; L, left hemisphere.

Figure 4 Spatial distribution and boxplots of the four functional connectivities. (A) The spatial distribution of the selected functional connectivities related to insomnia disorder. (B) Comparisons of functional connectivity strengths between the healthy control group and the insomnia disorder group. L, left cerebral hemisphere; R, right cerebral hemisphere; PHG, parahippocampal gyrus; ITG, inferior temporal gyrus; MFG, middle frontal gyrus; TPOsup, temporal pole: superior temporal gyrus; MOG, middle occipital gyrus; SMA, supplementary motor area; ID, insomnia disorder.

Figure 5 Scatter plots for the relationship between clinical scales and the FC of the insomnia disorder group. (A) Positive correlation (P=0.02) of the FC strength between SMA.R and PHG.L with PSQI. (B) Positive correlation (P=0.01) of the FC strength between MFG.R and TPOsup.R with PSQI. (C) Positive correlation (P=0.01) of the FC strength between TPOsup.R and ITG.L with PSQI. (D) Positive correlation (P=0.04) of the FC strength between MOG.R and ITG.R with PSQI. (E) Positive correlation (P=0.008) of the FC strength between MOG.R and ITG.R with SAS. (F) Positive correlation (P=0.005) of the FC strength between MOG.R and ITG.R with SDS. PSQI, Pittsburgh Sleep Quality Index; SMA, supplementary motor area; PHG, parahippocampal gyrus; MFG, middle frontal gyrus; TPOsup, temporal pole: superior temporal gyrus; ITG, inferior temporal gyrus; MOG, middle occipital gyrus; L, left cerebral hemisphere; R, right cerebral hemisphere; SAS, self-rating anxiety scale; SDS, self-rating depression scale; FC, functional connectivity.

Discussion

Performance evaluation of MTGCAIN

The model comparison in Table 2 demonstrates the superiority of the MTGCAIN. The SVM fell considerably short in extracting features, scoring worst among the compared methods on all four metrics, which indicates that the conventional machine learning approach is limited in capturing complicated multiple-channel dynamic features. The BrainGNN and GCN performed similarly in terms of the four metrics, but both were limited in capturing the global graph structural information; in contrast, the MTGCAIN’s readout block can capture graph features from a global perspective to improve feature representation. The STAGIN is suitable for dealing with a single-modal input pattern and showed a relatively superior performance, but it is still insufficient for learning multimodal patterns, a limitation that the MTGCAIN addresses by using multimodal inputs. The IMAGIN incorporates multimodal inputs and demonstrated an obvious performance enhancement. The significant improvement in its AUC indicated that the use of multimodal inputs enables more accurate discrimination between the positive and negative samples. However, its capability in terms of information interaction and deep feature processing still needs to be explored. The attention mechanism of the MT in the MTGCAIN can fully integrate the deep features of multiple modalities for better feature representation. Owing to its powerful information interaction capability between different modalities, the MTGCAIN exhibited favorable performance in predicting ID classification. Furthermore, the false-positive and false-negative rates of the MTGCAIN were 21.25% and 16%, respectively. The higher false-positive rate indicates that HCs are more likely to be misdiagnosed as having ID, whereas the lower false-negative rate indicates that patients with ID are less likely to be misdiagnosed as HCs. In clinical application, attention may therefore need to be paid to the risk of HCs being misdiagnosed as having ID by the model.
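For readers less familiar with these two error rates, the following sketch shows how they are derived from a confusion matrix, with ID as the positive class and HC as the negative class. The counts are illustrative only: they were chosen so that the arithmetic reproduces the reported 21.25% and 16%, and they are not the study's confusion matrix, which is not given in this section.

```python
from typing import Tuple


def error_rates(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (false-positive rate, false-negative rate).

    False-positive rate: share of healthy controls labeled as ID.
    False-negative rate: share of patients with ID labeled as healthy.
    """
    fpr = fp / (fp + tn)
    fnr = fn / (fn + tp)
    return fpr, fnr


# Illustrative counts only (NOT the study's data): 80 controls, 25 patients.
fpr, fnr = error_rates(tp=21, fp=17, tn=63, fn=4)
print(f"false-positive rate: {fpr:.2%}")
print(f"false-negative rate: {fnr:.2%}")
```

The complements of these rates are the specificity (1 minus the false-positive rate) and the sensitivity (1 minus the false-negative rate).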

To examine how λ₁ and λ₂ affect performance, we conducted an ablation study. Due to the high cost of training deep learning models, we used an empirical approach in which we first fixed λ₁ to 0 and adjusted λ₂, and then adjusted λ₁ with λ₂ held at the value determined in the first step. The results are shown in Figure 6. It can be seen that the highest accuracy was achieved with λ₁ = λ₂ = 0.00001. The orthogonality and unit constraints can act as a priori knowledge during the training process, which can help guide the model to converge to a more reasonable solution. In addition, these constraints can reduce the parameter space of the model and help it reach the convergence point faster. Furthermore, these constraints can also reduce the magnitude of weight updates and improve the stability of the model.
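The two-stage tuning procedure can be sketched as a sequential, one-weight-at-a-time search. This is an illustration under stated assumptions, not the authors' code: `evaluate` stands for a full train-and-validate run that returns accuracy, and `toy_accuracy` is a made-up surrogate used only so the example runs; the candidate grid is likewise hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple


def sequential_search(
    evaluate: Callable[[float, float], float],
    grid: Sequence[float],
) -> Tuple[float, float, float]:
    """Tune lambda2 with lambda1 fixed at 0, then tune lambda1 with lambda2 fixed."""
    # Stage 1: lambda1 = 0, sweep lambda2
    best_l2 = max(grid, key=lambda l2: evaluate(0.0, l2))
    # Stage 2: lambda2 held at its stage-1 value, sweep lambda1
    best_l1 = max(grid, key=lambda l1: evaluate(l1, best_l2))
    return best_l1, best_l2, evaluate(best_l1, best_l2)


def toy_accuracy(l1: float, l2: float) -> float:
    """Made-up surrogate (NOT model output): peaks when both weights are 1e-5."""
    def penalty(x: float) -> float:
        return abs(math.log10(x) + 5) if x > 0 else 6.0
    return 1.0 - 0.01 * (penalty(l1) + penalty(l2))


grid = [1e-6, 1e-5, 1e-4, 1e-3]  # hypothetical candidate values
best_l1, best_l2, best_acc = sequential_search(toy_accuracy, grid)
print(best_l1, best_l2)
```

This search needs 2n training runs for n candidate values instead of the n² runs of a full grid, which is the cost saving the text refers to; the trade-off is that it can miss combinations in which the two weights interact.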

Figure 6 Effect of loss weights λ₁ and λ₂ on accuracy.

Regional distribution of the abnormal regions in patients with ID

As can be seen from Table 4 and Figure 3, the brain regions involved in sensory processing, cognition, meditation, and information exchange are closely related to ID (19). Interestingly, four of the ID-relevant brain regions belong to the DMN subnetwork. This indicates that the DMN plays a critical role in regulating healthy sleep. Indeed, Mak et al. found that the balanced regulation of the aforementioned regions is crucial for maintaining healthy sleep, emotion, and cognition, whereas disrupted balance may be closely associated with ID (45). According to Wang et al., excessive arousal of the DMN and heightened sensitivity to visual and auditory stimuli may be the primary mechanisms underlying insomnia, leading simultaneously to cognitive impairment and emotional dysregulation (46). More detailed evidence concerning the nine brain regions most relevant to ID can be found in the references in the last column of Table 4.

Considering the diversity of abnormal brain regions, a comprehensive treatment approach may be more effective in clinical practice. Combining medication, cognitive behavioral therapy, and lifestyle interventions to address different brain networks in a holistic manner may increase the likelihood of treatment success.

Role analysis of ID relevant FC

It was observed that the four FCs had strong synergistic effects (Figure 4), particularly in the ID patient group. The mean FC strength in the ID patient group was higher than that in the HC group. This indicates that the exchange of information between brain regions is more powerful and frequent in individuals with ID. These differences may reflect specific alterations in information processing and the organization of brain functions in individuals with ID and further support the crucial role of FC in the development of ID. Abnormally high FC strength may therefore be regarded as a candidate biomarker of ID, which can help to improve our understanding of the neurological foundation of ID.

In the subnetwork of the SMN and EAN, there was a significant correlation (P=0.02) between the strength of the SMA.R-PHG.L FC and the PSQI, as demonstrated in Table 5 and Figure 5A. This observation is consistent with previous studies (4,39,40) reporting a correlation between SMA damage and ID. Furthermore, in patients with ID, we observed a significant correlation of PSQI with the strength of the TPOsup.R-MFG.R FC (P=0.01) and the TPOsup.R-ITG.L FC (P=0.01). Similarly, Huang et al. have reported increased FC strength between TPOsup and the left pallidum in patients with ID (41). These findings suggest a potential association between abnormalities in the SMA and TPOsup regions and ID, with the literature (31) also suggesting an additional link between the ITG.R and comorbid depression in patients with ID. In this study, we found a significant correlation between the ITG.R-MOG.R FC strength and the PSQI (P=0.04), and the strength of the ITG.R-MOG.R FC was correlated with both anxiety (P=0.008) and depression (P=0.005) in patients with ID. This supports the speculation of Baglioni et al. regarding the existence of a partial overlap in the pathological mechanisms of ID, depression, and anxiety (47). Therefore, our most significant finding is that the FC between the ITG.R and MOG.R has the potential to serve as a neuroimaging biomarker for comorbid insomnia with depression and anxiety.

Insomnia is a multifaceted pathological condition that can potentially involve disruptions in FC among various brain regions. The maintenance of healthy sleep patterns relies upon a complex interplay of neuroregulatory mechanisms. It is evident from Figure 5 that the scale scores were positively correlated with FC strength. In other words, the greater the strength of abnormal FC observed in patients with insomnia, the poorer their sleep quality. This is consistent with the higher FC strength observed in the ID patient group than in the HC group. Specifically, in the results presented in Figure 5, we can observe a positive correlation of the strength of the SMA.R-PHG.L FC, TPOsup.R-MFG.R FC, TPOsup.R-ITG.L FC, and ITG.R-MOG.R FC with the PSQI scores. Furthermore, the FC strength of ITG.R with MOG.R was significantly correlated with anxiety and depression in patients with ID. These findings indicate that an increase in FC between brain regions associated with insomnia may be accompanied by deteriorated sleep quality. Such enhanced FC may lead to excessive activation between brain regions, thus affecting normal sleep processes. Further research concerning the mechanisms and regulation of these aberrant functional connections would aid in a deeper understanding of the neurobiological underpinnings of insomnia and provide valuable guidance for developing individualized treatment strategies. For example, in response to a detected abnormal FC strength, a targeted modulation approach could be considered to adjust brain activity through neurofeedback, neuromodulation, or other interventions. Changes in functional brain connectivity can be tracked during the course of treatment to assess its effectiveness and to make timely adjustments to the treatment program.

Conclusions

In this study, we proposed a novel deep network, MTGCAIN, to characterize brain activity based on the analysis of rs-fMRI images. By integrating the GCA and MT, the proposed model demonstrated a superior capability in identifying the ID-relevant abnormal brain regions and FC. The GCA is helpful for capturing brain node and edge features, while the MT can realize efficient exchange and aggregation of modality-specific information. Through comparative and ablation experiments, we showed that the MTGCAIN has promising prospects in detecting biomarkers and identifying the pathogenic factors in patients with ID.

Based on the analysis of the experimental results, it was found that the DMN exhibits a stronger connection with ID than do the other subnetworks. In addition, the FC strength between MOG.R and ITG.R was associated with anxiety and depression in patients with ID. This means that in the clinical diagnosis of ID, these brain regions and FCs should be given more attention. The MTGCAIN holds promise for multimodal fusion with other imaging modalities (e.g., diffusion tensor imaging). Combining multiple imaging data can help to mitigate the effect of noise from a single modality on the model, giving the model a more accurate and stable diagnostic capability. Therefore, in our next study, we will increase the dataset size and add imaging modalities to improve the generalizability of the MTGCAIN and further our understanding of ID mechanisms.

Acknowledgments

Funding: This work was supported by the Medical and Health Research Projects of Qingdao (Grant No. 2021-WJZD192).

Footnote

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-23-1594/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the Institutional Review Board at the Affiliated Hospital of Qingdao University (No. QYFY WZLL 28505) and informed consent was taken from all the patients.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

References

1. Kyle SD, Siriwardena AN, Espie CA, Yang Y, Petrou S, Ogburn E, Begum N, Maurer LF, Robinson B, Gardner C, Lee V, Armstrong S, Pattinson J, Mort S, Temple E, Harris V, Yu LM, Bower P, Aveyard P. Clinical and cost-effectiveness of nurse-delivered sleep restriction therapy for insomnia in primary care (HABIT): a pragmatic, superiority, open-label, randomised controlled trial. Lancet 2023;402:975-87.
2. Reynolds CF 3rd, O'Hara R. DSM-5 sleep-wake disorders classification: overview for use in clinical practice. Am J Psychiatry 2013;170:1099-101.
3. Zhou F, Zhao Y, Huang M, Zeng X, Wang B, Gong H. Disrupted interhemispheric functional connectivity in chronic insomnia disorder: a resting-state fMRI study. Neuropsychiatr Dis Treat 2018;14:1229-40.
4. Zhou F, Huang S, Gao L, Zhuang Y, Ding S, Gong H. Temporal regularity of intrinsic cerebral activity in patients with chronic primary insomnia: a brain entropy study using resting-state fMRI. Brain Behav 2016;6:e00529.
5. Dai XJ, Liu BX, Ai S, Nie X, Xu Q, Hu J, Zhang Q, Xu Y, Zhang Z, Lu G. Altered inter-hemispheric communication of default-mode and visual networks underlie etiology of primary insomnia: Altered inter-hemispheric communication underlie etiology of insomnia. Brain Imaging Behav 2020;14:1430-44.
6. Li G, Chen Y, Chaudhary S, Li CS, Hao D, Yang L, Li CR. Sleep dysfunction mediates the relationship between hypothalamic-insula connectivity and anxiety-depression symptom severity bidirectionally in young adults. Neuroimage 2023;279:120340.
7. Kim E, Kim S, Kim Y, Cha H, Lee HJ, Lee T, Chang Y. Connectome-based predictive models using resting-state fMRI for studying brain aging. Exp Brain Res 2022;240:2389-400.
8. Ma C, Tian F, Ma MG, Su HW, Fan JC, Li ZH, Ren YD. Preferentially Disrupted Core Hubs Within the Default-Mode Network in Patients With End-Stage Renal Disease: A Resting-State Functional Magnetic Resonance Imaging Study. Front Neurol 2020;11:1032.
9. Zhang Y, Huang Y, Liu N, Wang Z, Wu J, Li W, Xia J, Liu Z, Li Y, Hao Y, Huo J. Abnormal interhemispheric functional connectivity in patients with primary dysmenorrhea: a resting-state functional MRI study. Quant Imaging Med Surg 2022;12:1958-67.
10. Wang M, Shao W, Hao X, Huang S, Zhang D. Identify connectome between genotypes and brain network phenotypes via deep self-reconstruction sparse canonical correlation analysis. Bioinformatics 2022;38:2323-32.
11. Dehghani A, Soltanian-Zadeh H, Hossein-Zadeh GA. Neural modulation enhancement using connectivity-based EEG neurofeedback with simultaneous fMRI for emotion regulation. Neuroimage 2023;279:120320.
12. Xu S, Faust O, Seoni S, Chakraborty S, Barua PD, Loh HW, Elphick H, Molinari F, Acharya UR. A review of automated sleep disorder detection. Comput Biol Med 2022;150:106100.
13. Almuhammadi WS, Aboalayon KAI, Faezipour M. Efficient obstructive sleep apnea classification based on EEG signals. 2015 Long Island Systems, Applications and Technology, Farmingdale, NY, USA, 2015:1-6.
14. Shahin M, Mulaffer L, Penzel T, Ahmed B. A Two Stage Approach for the Automatic Detection of Insomnia. Annu Int Conf IEEE Eng Med Biol Soc 2018;2018:466-9.
15. Qu W, Kao CH, Hong H, Chi Z, Grunstein R, Gordon C, Wang Z. Single-channel EEG based insomnia detection with domain adaptation. Comput Biol Med 2021;139:104989.
16. Kusmakar S, Karmakar C, Zhu Y, Shelyag S, Drummond SPA, Ellis JG, Angelova M. A machine learning model for multi-night actigraphic detection of chronic insomnia: development and validation of a pre-screening tool. R Soc Open Sci 2021;8:202264.
17. Wang Y, Yang A, Song Z, Liu B, Chen Y, Lv K, Ma G, Tang X. Investigation of functional connectivity in Bell's palsy using functional magnetic resonance imaging: prospective cross-sectional study. Quant Imaging Med Surg 2023;13:4676-86.
18. Li Y, Wang E, Zhang H, Dou S, Liu L, Tong L, Lei Y, Wang M, Xu J, Shi D, Zhang Q. Functional connectivity changes between parietal and prefrontal cortices in primary insomnia patients: evidence from resting-state fMRI. Eur J Med Res 2014;19:32.
19. Fasiello E, Gorgoni M, Scarpelli S, Alfonsi V, Ferini Strambi L, De Gennaro L. Functional connectivity changes in insomnia disorder: A systematic review. Sleep Med Rev 2022;61:101569.
20. Su H, Zuo C, Zhang H, Jiao F, Zhang B, Tang W, Geng D, Guan Y, Shi S. Regional cerebral metabolism alterations affect resting-state functional connectivity in major depressive disorder. Quant Imaging Med Surg 2018;8:910-24.
21. Lee MH, Kim N, Yoo J, Kim HK, Son YD, Kim YB, Oh SM, Kim S, Lee H, Jeon JE, Lee YJ. Multitask fMRI and machine learning approach improve prediction of differential brain activity pattern in patients with insomnia disorder. Sci Rep 2021;11:9402.
22. Shahid MLUR, Mir J, Shaukat F, Saleem MK, Tariq MAUR, Nouman A. Classification of Pharynx from MRI Using a Visual Analysis Tool to Study Obstructive Sleep Apnea. Curr Med Imaging 2021;17:613-22.
23. Ma X, Wu D, Mai Y, Xu G, Tian J, Jiang G. Functional connectome fingerprint of sleep quality in insomnia patients: Individualized out-of-sample prediction using machine learning. Neuroimage Clin 2020;28:102439.
24. Bai P, Wang Y, Zhao F, Liu Q, Wang C, Liu J, Qiao Y, Ma C, Ren Y. Investigation of the correlation between brain functional connectivity and ESRD based on low-order and high-order feature analysis of rs-fMRI. Med Phys 2023;50:3873-84.
25. Zou Y, Tang W, Qiao X, Li J. Aberrant modulations of static functional connectivity and dynamic functional network connectivity in chronic migraine. Quant Imaging Med Surg 2021;11:2253-64.
26. Reimann GM, Küppers V, Camilleri JA, Hoffstaedter F, Langner R, Laird AR, Fox PT, Spiegelhalder K, Eickhoff SB, Tahmasian M. Convergent abnormality in the subgenual anterior cingulate cortex in insomnia disorder: A revisited neuroimaging meta-analysis of 39 studies. Sleep Med Rev 2023;71:101821.
27. Zhang H, Song R, Wang L, Zhang L, Wang D, Wang C, Zhang W. Classification of Brain Disorders in rs-fMRI via Local-to-Global Graph Neural Networks. IEEE Trans Med Imaging 2023;42:444-55.
28. Li X, Zhou Y, Dvornek N, Zhang M, Gao S, Zhuang J, Scheinost D, Staib LH, Ventola P, Duncan JS. BrainGNN: Interpretable Brain Graph Neural Network for fMRI Analysis. Med Image Anal 2021;74:102233.
29. Huang Y, Chung ACS. Disease prediction with edge-variational graph convolutional networks. Med Image Anal 2022;77:102375.
30. Zhou HY, Yu Y, Wang C, Zhang S, Gao Y, Pan J, Shao J, Lu G, Zhang K, Li W. A transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics. Nat Biomed Eng 2023;7:743-55.
31. Li G, Zhang X, Zhang J, Wang E, Zhang H, Li Y. Magnetic resonance study on the brain structure and resting-state brain functional connectivity in primary insomnia patients. Medicine (Baltimore) 2018;97:e11944.
32. Jia XZ, Wang J, Sun HY, Zhang H, Liao W, Wang Z, Yan CG, Song XW, Zang YF. RESTplus: an improved toolkit for resting-state functional magnetic resonance imaging data processing. Sci Bull (Beijing) 2019;64:953-4.
33. Li C, Dong M, Yin Y, Hua K, Fu S, Jiang G. Abnormal whole-brain functional connectivity in patients with primary insomnia. Neuropsychiatr Dis Treat 2017;13:427-35.
34. Sun A, Chen N, He L, Zhang J. Research on migraine time-series features classification based on small-sample functional magnetic resonance imaging data. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi 2023;40:110-7.
35. Kim BH, Ye JC, Kim JJ. Learning dynamic graph representation of brain connectome with spatio-temporal attention. Advances in Neural Information Processing Systems 2021;34:4314-27.
36. Kipf TN, Welling M. Semi-Supervised Classification with Graph Convolutional Networks. arXiv:1609.02907.
37. Orme-Rogers J, Srivastava A. Spatio-Temporal Attention in Multi-Granular Brain Chronnectomes For Detection of Autism Spectrum Disorder. ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023:1-5.
38. Tzourio-Mazoyer N, Landeau B, Papathanassiou D, Crivello F, Etard O, Delcroix N, Mazoyer B, Joliot M. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage 2002;15:273-89.
39. Huang S, Zhou F, Jiang J, Huang M, Zeng X, Ding S, Gong H. Regional impairment of intrinsic functional connectivity strength in patients with chronic primary insomnia. Neuropsychiatr Dis Treat 2017;13:1449-62.
40. Pang R, Zhan Y, Zhang Y, Guo R, Wang J, Guo X, Liu Y, Wang Z, Li K. Aberrant Functional Connectivity Architecture in Participants with Chronic Insomnia Disorder Accompanying Cognitive Dysfunction: A Whole-Brain, Data-Driven Analysis. Front Neurosci 2017;11:259.
41. Huang Z, Liang P, Jia X, Zhan S, Li N, Ding Y, Lu J, Wang Y, Li K. Abnormal amygdala connectivity in patients with primary insomnia: evidence from resting state fMRI. Eur J Radiol 2012;81:1288-95.
42. Ma X, Jiang G, Fu S, Fang J, Wu Y, Liu M, Xu G, Wang T. Enhanced Network Efficiency of Functional Brain Networks in Primary Insomnia Patients. Front Psychiatry 2018;9:46.
43. Yan CQ, Wang X, Huo JW, Zhou P, Li JL, Wang ZY, Zhang J, Fu QN, Wang XR, Liu CZ, Liu QQ. Abnormal Global Brain Functional Connectivity in Primary Insomnia Patients: A Resting-State Functional MRI Study. Front Neurol 2018;9:856.
44. Liu X, Zheng J, Liu BX, Dai XJ. Altered connection properties of important network hubs may be neural risk factors for individuals with primary insomnia. Sci Rep 2018;8:5891.
45. Mak LE, Minuzzi L, MacQueen G, Hall G, Kennedy SH, Milev R. The Default Mode Network in Healthy Individuals: A Systematic Review and Meta-Analysis. Brain Connect 2017;7:25-33.
46. Wang H, Huang Y, Li M, Yang H, An J, Leng X, Xu D, Qiu S. Regional brain dysfunction in insomnia after ischemic stroke: A resting-state fMRI study. Front Neurol 2022;13:1025174.
47. Baglioni C, Spiegelhalder K, Lombardo C, Riemann D. Sleep and emotions: a focus on insomnia. Sleep Med Rev 2010;14:227-38.

Cite this article as: Wang Y, Ren Y, Bi Y, Zhao F, Bai X, Wei L, Liu W, Ma H, Bai P. Multimodal transformer graph convolution attention isomorphism network (MTCGAIN): a novel deep network for detection of insomnia disorder. Quant Imaging Med Surg 2024;14(5):3350-3365. doi: 10.21037/qims-23-1594
