Convolutional neural networks (CNNs) are known for their powerful feature extraction ability and have achieved great success in a variety of image processing tasks. However, convolution filters extract only local features and neglect long-range self-similarity, a vital prior commonly present in image data. To this end, we put forward a new backbone neural network, the vision graph U-Net (VGU-Net), which is the first model to construct multi-scale graph structures through the hierarchical down-sampling layers of the U-Net architecture. The graph structure is constructed by the self-attention mechanism. The CNNs in the bottleneck layer and skip connection layers are replaced with graph convolution networks (GCNs), and visualizing the resulting multi-scale graph structure allows an interpretation of long-range interactions. We extend the VGU-Net backbone model to the widely considered compressed sensing MR image reconstruction task and propose a knowledge-driven deep unrolling scheme based on the half-quadratic splitting algorithm, which combines the interpretability of knowledge-driven models with the versatility of data-driven deep learning methods to achieve remarkable reconstruction results. Moreover, we verify the segmentation ability of the VGU-Net backbone model on a multi-modality brain tumor segmentation dataset and a white blood cell image segmentation dataset, achieving state-of-the-art performance on both. The code is publicly available at https://github.com/jyh6681/VGU-Net.
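As a rough illustration of the idea described above (a graph built by self-attention, followed by a graph convolution), the following minimal sketch may help. It is not the authors' implementation, which is in the linked repository; the function names, the toy features, and the identity projection matrices are all assumptions chosen for illustration, and the propagation rule ReLU(A X W) is a generic simplified form of a GCN layer.

```python
import math

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_adjacency(feats, w_q, w_k):
    """Hypothetical helper: row-stochastic adjacency from scaled dot-product attention."""
    q = matmul(feats, w_q)
    k = matmul(feats, w_k)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    return [softmax(r) for r in scores]

def graph_conv(adj, feats, w):
    """Hypothetical helper: one propagation step, ReLU(A X W)."""
    out = matmul(matmul(adj, feats), w)
    return [[max(0.0, v) for v in row] for row in out]

# Toy input (assumed): 4 nodes (e.g. pixels or patches), 2 feature channels each.
feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned projection weights

adj = attention_adjacency(feats, identity, identity)
out = graph_conv(adj, feats, identity)
```

In this toy case, node 0 receives a larger attention weight toward the similar node 1 than toward the dissimilar node 2, so the graph convolution aggregates features from similar nodes regardless of where they sit in the image.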

Vision graph U-Net: Geometric learning enhanced encoder for medical image segmentation and restoration / Jiang, Y.; Ding, Q.; Wang, Y. G.; Lio, P.; Zhang, X. - In: INVERSE PROBLEMS AND IMAGING. - ISSN 1930-8337. - 18:3(2024), pp. 672-689. [10.3934/ipi.2023049]

Vision graph U-Net: Geometric learning enhanced encoder for medical image segmentation and restoration

Lio P.;
2024

geometric deep learning; knowledge-driven deep unrolling scheme; multi-modality medical image segmentation; self-attention; Vision graph U-Net
01 Journal publication::01a Journal article
Files attached to this item

File: Jiang_Vision_2024.pdf
Access: archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 3.68 MB
Format: Adobe PDF

File: Jiang_postpirnt_Vision_graph_2024.pdf
Access: open access
Note: DOI: 10.3934/ipi.2023049
Type: Post-print (version after peer review, accepted for publication)
License: Creative Commons
Size: 3.66 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1719875
Citations
  • PMC: ND
  • Scopus: 1
  • ISI: 0