FN Clarivate Analytics Web of Science
VR 1.0
AU Zhou, Yanfeng
Li, Lingrui
Wang, Chenlong
Song, Le
Yang, Ge
GobletNet: Wavelet-Based High-Frequency Fusion Network for Semantic
Segmentation of Electron Microscopy Images.
Semantic segmentation of electron microscopy (EM) images is crucial for
nanoscale analysis. With the development of deep neural networks (DNNs),
semantic segmentation of EM images has achieved remarkable success.
However, current EM image segmentation models are usually extensions or
adaptations of natural or biomedical image models. They do not fully
explore or exploit the intrinsic characteristics of EM images.
Furthermore, they are often designed for only a few specific
segmentation objects and lack versatility. In this study, we
quantitatively analyze the characteristics of EM images compared with
those of natural and other biomedical images via the wavelet transform.
To better utilize these characteristics, we design a high-frequency (HF)
fusion network, GobletNet, which outperforms state-of-the-art models by
a large margin in the semantic segmentation of EM images. We use the
wavelet transform to generate HF images as extra inputs and use an extra
encoding branch to extract HF information. Furthermore, we introduce a
fusion-attention module (FAM) into GobletNet to facilitate better
absorption and fusion of information from raw images and HF images.
Extensive benchmarking on seven public EM datasets (EPFL, CREMI,
SNEMI3D, UroCell, MitoEM, Nanowire and BetaSeg) demonstrates the
effectiveness of our model. The code is available at
https://github.com/Yanfeng-Zhou/GobletNet.
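As a rough illustration of the wavelet step described in the abstract above, the sketch below derives a high-frequency (HF) image from a grayscale image with a single-level 2D Haar transform. The function names `haar_dwt2` and `high_frequency_image`, the choice of the Haar basis, and the rule of summing absolute detail coefficients are all assumptions made for this example; the abstract does not state which wavelet or combination rule GobletNet uses.

```python
def haar_dwt2(img):
    """Single-level 2D Haar transform of a grayscale image (list of rows).

    Assumes an even number of rows and columns. Returns the approximation
    band LL and the three detail bands LH, HL, HH at half resolution.
    """
    h, w = len(img), len(img[0])
    ll, lh, hl, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            r, s = i // 2, j // 2
            ll[r][s] = (a + b + c + d) / 2  # local average (low frequency)
            lh[r][s] = (a + b - c - d) / 2  # change between the two rows
            hl[r][s] = (a - b + c - d) / 2  # change between the two columns
            hh[r][s] = (a - b - c + d) / 2  # diagonal change
    return ll, lh, hl, hh


def high_frequency_image(img):
    """Hypothetical HF image: sum of absolute detail coefficients."""
    _, lh, hl, hh = haar_dwt2(img)
    return [
        [abs(x) + abs(y) + abs(z) for x, y, z in zip(r1, r2, r3)]
        for r1, r2, r3 in zip(lh, hl, hh)
    ]


flat = [[5.0] * 4 for _ in range(4)]                # no texture
stripes = [[0.0, 1.0, 0.0, 1.0] for _ in range(4)]  # fine vertical stripes
print(high_frequency_image(flat))
print(high_frequency_image(stripes))
```

A flat region produces zeros, while fine stripes produce a uniform nonzero response. This is the kind of signal an extra HF encoding branch could take as input alongside the raw image.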
EI 1558-254X
DA 2024-10-06
UT MEDLINE:39365717
PM 39365717
ER
AU Zhang, Runshi
Mo, Hao
Wang, Junchen
Jie, Bimeng
He, Yang
Jin, Nenghao
Zhu, Liang
UTSRMorph: A Unified Transformer and Superresolution Network for
Unsupervised Medical Image Registration.
Complicated image registration is a key issue in medical image analysis,
and deep learning-based methods have achieved better results than
traditional methods. The methods include ConvNet-based and
Transformer-based methods. Although ConvNets can effectively utilize
local information to reduce redundancy via small neighborhood
convolution, the limited receptive field results in the inability to
capture global dependencies. Transformers can establish long-distance
dependencies via a self-attention mechanism; however, the intense
calculation of the relationships among all tokens leads to high
redundancy. We propose a novel unsupervised image registration method
named the unified Transformer and superresolution (UTSRMorph) network,
which can enhance feature representation learning in the encoder and
generate detailed displacement fields in the decoder to overcome these
problems. We first propose a fusion attention block to integrate the
advantages of ConvNets and Transformers, which inserts a ConvNet-based
channel attention module into a multihead self-attention module. The
overlapping attention block, a novel cross-attention method, uses
overlapping windows to obtain abundant correlations with the matching
information of a pair of images. The blocks are then flexibly stacked
into a new, powerful encoder. In the decoder, generating a
high-resolution displacement field from low-resolution features is
treated as a superresolution process. Specifically, a superresolution
module replaces interpolation upsampling, which can overcome feature
degradation. UTSRMorph was compared to
state-of-the-art registration methods in the 3D brain MR (OASIS, IXI)
and MR-CT datasets (abdomen, craniomaxillofacial). The qualitative and
quantitative results indicate that UTSRMorph achieves relatively better
performance. The code and datasets used are publicly available at
https://github.com/Runshi-Zhang/UTSRMorph.
AU Siebert, Hanna
Grossbrohmer, Christoph
Hansen, Lasse
Heinrich, Mattias P
ConvexAdam: Self-Configuring Dual-Optimisation-Based 3D Multitask
Medical Image Registration.
Registration of medical image data requires methods that can align
anatomical structures precisely while applying smooth and plausible
transformations. Ideally, these methods should furthermore operate
quickly and apply to a wide variety of tasks. Deep learning-based image
registration methods usually entail an elaborate learning procedure with
the need for extensive training data. However, they often struggle with
versatility when aiming to apply the same approach across various
anatomical regions and different imaging modalities. In this work, we
present a method that extracts semantic or hand-crafted image features
and uses a coupled convex optimisation followed by Adam-based instance
optimisation for multitask medical image registration. We make use of
pre-trained semantic feature extraction models for the individual
datasets and combine them with our fast dual optimisation procedure for
deformation field computation. Furthermore, we propose a very fast
automatic hyperparameter selection procedure that explores many settings
and ranks them on validation data to provide a self-configuring image
registration framework. With our approach, we can align image data for
various tasks with little learning. We conduct experiments on all
available Learn2Reg challenge datasets and obtain results that place in
the upper ranks of the challenge leaderboards. Code:
github.com/multimodallearning/convexAdam.
AU Hu, Yan
Wang, Jun
Zhu, Hao
Li, Juncheng
Shi, Jun
Cost-Sensitive Weighted Contrastive Learning Based on Graph
Convolutional Networks for Imbalanced Alzheimer's Disease Staging
Identifying the progression stages of Alzheimer's disease (AD) can be
considered as an imbalanced multi-class classification problem in
machine learning. It is challenging due to the class imbalance issue and
the heterogeneity of the disease. Recently, graph convolutional networks
(GCNs) have been successfully applied to AD classification. However,
these works neither handle the class imbalance issue nor account for
the heterogeneity of the disease. To this end, we
propose a novel cost-sensitive weighted contrastive learning method
based on graph convolutional networks (CSWCL-GCNs) for imbalanced AD
staging using resting-state functional magnetic resonance imaging
(rs-fMRI). The proposed method is developed on a multi-view graph
constructed by the functional connectivity (FC) and high-order
functional connectivity (HOFC) features of the subjects. A novel
cost-sensitive weighted contrastive learning procedure is proposed to
capture discriminative information from the minority classes,
encouraging the samples in the minority class to provide adequate
supervision. Considering the heterogeneity of the disease, the weights
of the negative pairs are introduced into contrastive learning and they
are computed based on the distance to class prototypes, which are
automatically learned from the training data. Meanwhile, the
cost-sensitive mechanism is further introduced into contrastive learning
to handle the class imbalance issue. The proposed CSWCL-GCN is evaluated
on 720 subjects (including 184 NCs, 40 SMC patients, 208 EMCI patients,
172 LMCI patients and 116 AD patients) from the ADNI (Alzheimer's
Disease Neuroimaging Initiative). Experimental results show that the
proposed CSWCL-GCN outperforms state-of-the-art methods on the ADNI
database.
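To make the class imbalance in this cohort concrete, the sketch below computes inverse-frequency class weights from the subject counts reported in the abstract above. Inverse-frequency weighting is a common cost-sensitive choice and is used here only as an assumption; the abstract does not specify how CSWCL-GCN sets its costs, and the function name `inverse_frequency_weights` is hypothetical.

```python
# Class counts as reported in the abstract above.
counts = {"NC": 184, "SMC": 40, "EMCI": 208, "LMCI": 172, "AD": 116}


def inverse_frequency_weights(class_counts):
    """Weight each class by N / (K * n_c), so rarer classes cost more."""
    total = sum(class_counts.values())
    k = len(class_counts)
    return {name: total / (k * n) for name, n in class_counts.items()}


weights = inverse_frequency_weights(counts)
for name, w in weights.items():
    print(f"{name}: {w:.3f}")
```

Under this assumed scheme the smallest class (SMC, 40 subjects) receives the largest weight and the largest class (EMCI, 208 subjects) the smallest, which is the general effect a cost-sensitive mechanism aims for.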
AU Qiu, Zifeng
Yang, Peng
Xiao, Chunlun
Wang, Shuqiang
Xiao, Xiaohua
Qin, Jing
Liu, Chuan-Ming
Wang, Tianfu
Lei, Baiying
3D Multimodal Fusion Network With Disease-Induced Joint Learning for
Early Alzheimer's Disease Diagnosis
Multimodal neuroimaging provides complementary information critical for
accurate early diagnosis of Alzheimer's disease (AD). However, the
inherent variability between multimodal neuroimages hinders the
effective fusion of multimodal features. Moreover, achieving reliable
and interpretable diagnoses in the field of multimodal fusion remains
challenging. To address these issues, we propose a novel multimodal diagnosis
network based on multi-fusion and disease-induced learning (MDL-Net) to
enhance early AD diagnosis by efficiently fusing multimodal data.
Specifically, MDL-Net proposes a multi-fusion joint learning (MJL)
module, which effectively fuses multimodal features and enhances the
feature representation from global, local, and latent learning
perspectives. MJL consists of three modules: global-aware learning
(GAL), local-aware learning (LAL), and outer latent-space learning
(LSL). GAL learns the global relationships among the modalities via a
self-adaptive Transformer (SAT). LAL constructs local-aware
convolution to learn the local associations. The LSL module introduces
latent information through an outer product operation to further enhance
feature representation. MDL-Net integrates the disease-induced
region-aware learning (DRL) module via gradient weight to enhance
interpretability, which iteratively learns weight matrices to identify
AD-related brain regions. We conduct extensive experiments on public
datasets and the results confirm the superiority of our proposed method.
Our code will be available at: https://github.com/qzf0320/MDL-Net.
AU Lu, Ziru
Zhang, Yizhe
Zhou, Yi
Wu, Ye
Zhou, Tao
Domain-interactive Contrastive Learning and Prototype-guided
Self-training for Cross-domain Polyp Segmentation.
Accurate polyp segmentation from colonoscopy images plays a critical
role in the diagnosis and treatment of colorectal cancer. While deep
learning-based polyp segmentation models have made significant progress,
they often suffer from performance degradation when applied to unseen
target domain datasets collected from different imaging devices. To
address this challenge, unsupervised domain adaptation (UDA) methods
have gained attention by leveraging labeled source data and unlabeled
target data to reduce the domain gap. However, existing UDA methods
primarily focus on capturing class-wise representations, neglecting
domain-wise representations. Additionally, uncertainty in pseudo labels
could hinder the segmentation performance. To tackle these issues, we
propose a novel Domain-interactive Contrastive Learning and
Prototype-guided Self-training (DCL-PS) framework for cross-domain polyp
segmentation. Specifically, domain-interactive contrastive learning (DCL)
with a domain-mixed prototype updating strategy is proposed to
discriminate class-wise feature representations across domains. Then, to
enhance the feature extraction ability of the encoder, we present a
contrastive learning-based cross-consistency training (CL-CCT) strategy,
which is imposed on both the prototypes obtained by the outputs of the
main decoder and perturbed auxiliary outputs. Furthermore, we propose a
prototype-guided self-training (PS) strategy, which dynamically assigns
a weight for each pixel during self-training, filtering out unreliable
pixels and improving the quality of pseudo-labels. Experimental results
demonstrate the superiority of DCL-PS in improving polyp segmentation
performance in the target domain. The code will be released at
https://github.com/taozh2017/DCLPS.
AU Tian, Xiang
Ye, Jian'an
Zhang, Tao
Zhang, Liangliang
Liu, Xuechao
Fu, Feng
Shi, Xuetao
Xu, Canhua
Multi-Path Fusion in SFCF-Net for Enhanced Multi-Frequency Electrical
Impedance Tomography
Multi-frequency electrical impedance tomography (mfEIT) offers a
nondestructive imaging technology that reconstructs the distribution of
electrical characteristics within a subject based on the impedance
spectral differences among biological tissues. However, the technology
faces challenges in imaging multi-class lesion targets when the
conductivity of background tissues is frequency-dependent. To address
these issues, we propose a spatial-frequency cross-fusion network
(SFCF-Net) imaging algorithm, built on a multi-path fusion structure.
This algorithm uses multi-path structures and hyper-dense connections to
capture both spatial and frequency correlations between multi-frequency
conductivity images, which achieves differential imaging for lesion
targets of multiple categories through cross-fusion of information.
According to both simulation and physical experiment results, the
proposed SFCF-Net algorithm shows excellent performance in terms of
lesion imaging and category discrimination compared to the weighted
frequency-difference, U-Net, and MMV-Net algorithms. The proposed
algorithm enhances the ability of mfEIT to simultaneously obtain both
structural and spectral information from the tissue being examined and
improves the accuracy and reliability of mfEIT, opening new avenues for
its application in clinical diagnostics and treatment monitoring.
AU Wen, Chi
Ye, Mang
Li, He
Chen, Ting
Xiao, Xuan
Concept-based Lesion Aware Transformer for Interpretable Retinal Disease
Diagnosis.
Existing deep learning methods have achieved remarkable results in
diagnosing retinal diseases, showcasing the potential of advanced AI in
ophthalmology. However, the black-box nature of these methods obscures
the decision-making process, compromising their trustworthiness and
acceptability. Inspired by the concept-based approaches and recognizing
the intrinsic correlation between retinal lesions and diseases, we
regard retinal lesions as concepts and propose an inherently
interpretable framework designed to enhance both the performance and
explainability of diagnostic models. Leveraging the transformer
architecture, known for its proficiency in capturing long-range
dependencies, our model can effectively identify lesion features. By
integrating with image-level annotations, it achieves the alignment of
lesion concepts with human cognition under the guidance of a retinal
foundation model. Furthermore, to attain interpretability without losing
lesion-specific information, our method employs a classifier built on a
cross-attention mechanism for disease diagnosis and explanation, where
explanations are grounded in the contributions of human-understandable
lesion concepts and their visual localization. Notably, due to the
structure and inherent interpretability of our model, clinicians can
implement concept-level interventions to correct the diagnostic errors
by simply adjusting erroneous lesion predictions. Experiments conducted
on four fundus image datasets demonstrate that our method achieves
favorable performance against state-of-the-art methods while providing
faithful explanations and enabling concept-level interventions. Our code
is publicly available at https://github.com/Sorades/CLAT.
AU Zhang, Ke
Yang, Yan
Yu, Jun
Fan, Jianping
Jiang, Hanliang
Huang, Qingming
Han, Weidong
Attribute Prototype-guided Iterative Scene Graph for Explainable
Radiology Report Generation.
The potential of automated radiology report generation in alleviating
the time-consuming tasks of radiologists is increasingly being
recognized in medical practice. Existing report generation methods have
evolved from using image-level features to the latest approach of
utilizing anatomical regions, significantly enhancing interpretability.
However, directly and simplistically using region features for report
generation compromises the capability of relation reasoning and
overlooks the common attributes potentially shared across regions. To
address these limitations, we propose a novel region-based Attribute
Prototype-guided Iterative Scene Graph generation framework (AP-ISG) for
report generation, utilizing scene graph generation as an auxiliary task
to further enhance interpretability and relational reasoning capability.
The core components of AP-ISG are the Iterative Scene Graph Generation
(ISGG) module and the Attribute Prototype-guided Learning (APL) module.
Specifically, ISGG employs an autoregressive scheme for structural edge
reasoning and a contextualization mechanism for relational reasoning.
APL enhances intra-prototype matching and reduces inter-prototype
semantic overlap in the visual space to fully model the potential
attribute commonalities among regions. Extensive experiments on the
MIMIC-CXR with Chest ImaGenome datasets demonstrate the superiority of
AP-ISG across multiple metrics.
AU Huang, Zhili
Sun, Jingyi
Shao, Yifan
Wang, Zixuan
Wang, Su
Li, Qiyong
Li, Jinsong
Yu, Qian
PolarFormer: A Transformer-based Method for Multi-lesion Segmentation in
Intravascular OCT.
Several deep learning-based methods have been proposed to extract
vulnerable plaques of a single class from intravascular optical
coherence tomography (OCT) images. However, further research is limited
by the lack of publicly available large-scale intravascular OCT datasets
with multi-class vulnerable plaque annotations. Additionally,
multi-class vulnerable plaque segmentation is extremely challenging due
to the irregular distribution of plaques, their unique geometric shapes,
and fuzzy boundaries. Existing methods have not adequately addressed the
geometric features and spatial prior information of vulnerable plaques.
To address these issues, we collected a dataset containing 70 pullbacks
and developed a multi-class vulnerable plaque segmentation model,
called PolarFormer, that incorporates the prior knowledge of vulnerable
plaques in spatial distribution. The key module of our proposed model is
Polar Attention, which models the spatial relationship of vulnerable
plaques in the radial direction. Extensive experiments conducted on the
new dataset demonstrate that our proposed method outperforms other
baseline methods. Code and data can be accessed via this link:
https://github.com/sunjingyi0415/IVOCT-segementaion.
AU Yang, Yanwu
Ye, Chenfei
Su, Guinan
Zhang, Ziyao
Chang, Zhikai
Chen, Hairui
Chan, Piu
Yu, Yue
Ma, Ting
BrainMass: Advancing Brain Network Analysis for Diagnosis with
Large-scale Self-Supervised Learning.
Foundation models pretrained on large-scale datasets via self-supervised
learning demonstrate exceptional versatility across various tasks.
Because medical data are heterogeneous and hard to collect, this approach is
especially beneficial for medical image analysis and neuroscience
research, as it streamlines broad downstream tasks without the need for
numerous costly annotations. However, there has been limited
investigation into brain network foundation models, limiting their
adaptability and generalizability for broad neuroscience studies. In
this study, we aim to bridge this gap. In particular, (1) we curated a
comprehensive dataset by collating images from 30 datasets, which
comprises 70,781 samples of 46,686 participants. Moreover, we introduce
pseudo-functional connectivity (pFC) to further generate millions of
augmented brain networks by randomly dropping certain timepoints of the
BOLD signal. (2) We propose the BrainMass framework for brain network
self-supervised learning via mask modeling and feature alignment.
BrainMass employs Mask-ROI Modeling (MRM) to bolster intra-network
dependencies and regional specificity. Furthermore, a Latent
Representation Alignment (LRA) module is utilized to regularize
augmented brain networks of the same participant with similar
topological properties to yield similar latent representations by
aligning their latent embeddings. Extensive experiments on eight
internal tasks and seven external brain disorder diagnosis tasks show
BrainMass's superior performance, highlighting its significant
generalizability and adaptability. Moreover, BrainMass demonstrates
powerful few/zero-shot learning abilities and yields meaningful
interpretations for various diseases, showcasing its potential use for
clinical applications.
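The pseudo-functional connectivity (pFC) augmentation described above can be sketched as follows: randomly drop some timepoints from the regional BOLD signals, then recompute the connectivity matrix on the remaining ones. Pearson correlation as the connectivity measure, the `keep_ratio` parameter, and the function names `pearson` and `pseudo_fc` are assumptions for this example; the abstract only states that timepoints are dropped at random.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def pseudo_fc(bold, keep_ratio=0.8, rng=None):
    """Connectivity matrix from a random subset of timepoints.

    bold is a list of ROI time series (n_rois x n_timepoints).
    """
    rng = rng or random.Random()
    t = len(bold[0])
    kept = sorted(rng.sample(range(t), max(2, int(keep_ratio * t))))
    sub = [[series[k] for k in kept] for series in bold]
    n = len(sub)
    return [[pearson(sub[i], sub[j]) for j in range(n)] for i in range(n)]


rng = random.Random(0)
t = 40
roi_a = [rng.gauss(0, 1) for _ in range(t)]
roi_b = [2 * v + 1 for v in roi_a]           # perfectly correlated with roi_a
roi_c = [rng.gauss(0, 1) for _ in range(t)]  # independent noise
bold = [roi_a, roi_b, roi_c]

# Each call draws a different subset of timepoints, giving one augmented view.
views = [pseudo_fc(bold, keep_ratio=0.8, rng=rng) for _ in range(3)]
for fc in views:
    print(round(fc[0][1], 3), round(fc[0][2], 3))
```

Each view keeps the strong `roi_a`/`roi_b` coupling while the weaker entries vary with the sampled timepoints, so one recording can yield many slightly different brain networks for the same participant.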
AU Jang, Se-In
Pan, Tinsu
Li, Ye
Heidari, Pedram
Chen, Junyu
Li, Quanzheng
Gong, Kuang
Spach Transformer: Spatial and Channel-Wise Transformer Based on Local
and Global Self-Attentions for PET Image Denoising
Positron emission tomography (PET) is widely used in clinics and
research due to its quantitative merits and high sensitivity, but
suffers from low signal-to-noise ratio (SNR). Recently, convolutional
neural networks (CNNs) have been widely used to improve PET image
quality. Though successful and efficient in local feature extraction,
CNN cannot capture long-range dependencies well due to its limited
receptive field. Global multi-head self-attention (MSA) is a popular
approach to capture long-range information. However, the calculation of
global MSA for 3D images has high computational costs. In this work, we
proposed an efficient spatial and channel-wise encoder-decoder
transformer, Spach Transformer, that can leverage spatial and channel
information based on local and global MSAs. Experiments based on
datasets of different PET tracers, i.e., F-18-FDG, F-18-ACBC,
F-18-DCFPyL, and Ga-68-DOTATATE, were conducted to evaluate the proposed
framework. Quantitative results show that the proposed Spach Transformer
framework outperforms state-of-the-art deep learning architectures.
AU Penso, Coby
Frenkel, Lior
Goldberger, Jacob
Confidence Calibration of a Medical Imaging Classification System That
is Robust to Label Noise
A classification model is calibrated if its predicted probabilities of
outcomes reflect their accuracy. Calibrating neural networks is critical
in medical analysis applications where clinical decisions rely upon the
predicted probabilities. Most calibration procedures, such as
temperature scaling, operate as a post processing step by using holdout
validation data. In practice, it is difficult to collect medical image
data with correct labels due to the complexity of the medical data and
the considerable variability across experts. This study presents a
network calibration procedure that is robust to label noise. We draw on
the fact that the confusion matrix of the noisy labels can be expressed
as the matrix product between the confusion matrix of the clean labels
and the label noises. The method is based on estimating the noise level
as part of a noise-robust training method. The noise level is then used
to estimate the network accuracy required by the calibration procedure.
We show that despite the unreliable labels, we can still achieve
calibration results that are on a par with the results of a calibration
procedure using data with reliable labels.
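The matrix-product relation mentioned in the abstract above can be illustrated with a two-class toy case. The sketch assumes class-conditional label noise that is independent of the input and a noise matrix that is already known; the paper instead estimates the noise level during training. All numbers and helper names (`matmul`, `inverse_2x2`, `trace`) are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix (assumes a nonzero determinant)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


# clean_conf[p][y]: joint probability of prediction p and clean label y
# (hypothetical values that sum to 1).
clean_conf = [[0.45, 0.05], [0.10, 0.40]]
# noise[y][z]: probability that clean label y is recorded as noisy label z
# (assumed 20% symmetric flips).
noise = [[0.8, 0.2], [0.2, 0.8]]

# What can be measured on a validation set with noisy labels.
noisy_conf = matmul(clean_conf, noise)
# Undoing the noise recovers the clean confusion matrix.
recovered = matmul(noisy_conf, inverse_2x2(noise))

print(round(trace(noisy_conf), 4))  # accuracy measured against noisy labels
print(round(trace(recovered), 4))   # accuracy against clean labels
```

In this toy case the accuracy measured against noisy labels is 0.71, while the recovered clean accuracy is 0.85. A calibration procedure that targets the measured value would therefore aim at the wrong accuracy, which is why an estimate of the noise level is needed.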
AU Chen, Qianqian
Zhang, Jiadong
Meng, Runqi
Zhou, Lei
Li, Zhenhui
Feng, Qianjin
Shen, Dinggang
Modality-Specific Information Disentanglement From Multi-Parametric MRI
for Breast Tumor Segmentation and Computer-Aided Diagnosis
Breast cancer is becoming a significant global health challenge, with
millions of fatalities annually. Magnetic Resonance Imaging (MRI) can
provide various sequences for characterizing tumor morphology and
internal patterns, and has become an effective tool for detection and
diagnosis of breast tumors. However, previous deep learning-based tumor
segmentation methods for multi-parametric MRI still have limitations in
exploring inter-modality information and focusing on task-informative
modality/modalities. To address these shortcomings, we propose a
Modality-Specific Information Disentanglement (MoSID) framework to
extract both inter- and intra-modality attention maps as prior knowledge
for guiding tumor segmentation. Specifically, by disentangling
modality-specific information, the MoSID framework provides
complementary clues for the segmentation task, by generating
modality-specific attention maps to guide modality selection and
inter-modality evaluation. Our experiments on two 3D breast datasets and
one 2D prostate dataset demonstrate that the MoSID framework outperforms
other state-of-the-art multi-modality segmentation methods, even in the
cases of missing modalities. Based on the segmented lesions, we further
train a classifier to predict the patients' response to radiotherapy.
The prediction accuracy is comparable to the case of using
manually-segmented tumors for treatment outcome prediction, indicating
the robustness and effectiveness of the proposed segmentation method.
The code is available at https://github.com/Qianqian-Chen/MoSID.
AU Sengupta, Sourya
Anastasio, Mark A.
A Test Statistic Estimation-Based Approach for Establishing
Self-Interpretable CNN-Based Binary Classifiers
Interpretability is highly desired for deep neural network-based
classifiers, especially when addressing high-stake decisions in medical
imaging. Commonly used post-hoc interpretability methods have the
limitation that they can produce plausible but different interpretations
of a given model, leading to ambiguity about which one to choose. To
address this problem, a novel decision-theory-inspired approach is
investigated to establish a self-interpretable model, given a
pre-trained deep binary black-box medical image classifier. This
approach involves utilizing a self-interpretable encoder-decoder model
in conjunction with a single-layer fully connected network with unity
weights. The model is trained to estimate the test statistic of the
given trained black-box deep binary classifier to maintain a similar
accuracy. The decoder output image, referred to as an equivalency map,
is an image that represents a transformed version of the
to-be-classified image that, when processed by the fixed fully connected
layer, produces the same test statistic value as the original
classifier. The equivalency map provides a visualization of the
transformed image features that directly contribute to the test
statistic value and, moreover, permits quantification of their relative
contributions. Unlike traditional post-hoc interpretability methods,
the proposed method is self-interpretable and quantitative. Detailed
quantitative and qualitative analyses have been performed with three
different medical image binary classification tasks.
C1 Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801 USA
C1 Univ Illinois, Dept Bioengn, Urbana, IL 61801 USA
SN 0278-0062
EI 1558-254X
DA 2024-05-23
UT WOS:001214547800003
PM 38163307
ER
AU Beuret, Samuel
Heriard-Dubreuil, Baptiste
Martiartu, Naiara Korta
Jaeger, Michael
Thiran, Jean-Philippe
Windowed Radon Transform for Robust Speed-of-Sound Imaging With
Pulse-Echo Ultrasound
In recent years, methods that estimate the spatial distribution of tissue
speed of sound with pulse-echo ultrasound have gained considerable
traction. They can address limitations of B-mode imaging, for instance
in diagnosing fatty liver diseases. Current state-of-the-art methods
relate the tissue speed of sound to local echo shifts computed between
images that are beamformed using restricted transmit and receive
apertures. However, the aperture limitation affects the robustness of
phase-shift estimations and, consequently, the accuracy of reconstructed
speed-of-sound maps. Here, we propose a method based on the Radon
transform of image patches able to estimate local phase shifts from
full-aperture images. We validate our technique on simulated, phantom
and in-vivo data acquired on a liver and compare it with a
state-of-the-art method. We show that the proposed method enhances the
stability to changes of beamforming speed of sound and to a reduction of
the number of insonifications. In particular, the deployment of
pulse-echo speed-of-sound estimation methods onto portable ultrasound
devices can be eased by the reduction of the number of insonifications
allowed by the proposed method.
AU Ortiz-Gonzalez, Antonio
Kobler, Erich
Simon, Stefan
Bischoff, Leon
Nowak, Sebastian
Isaak, Alexander
Block, Wolfgang
Sprinkart, Alois M.
Attenberger, Ulrike
Luetkens, Julian A.
Bayro-Corrochano, Eduardo
Effland, Alexander
Optical Flow-Guided Cine MRI Segmentation With Learned Corrections
In cardiac cine magnetic resonance imaging (MRI), the heart is
repeatedly imaged at numerous time points during the cardiac cycle.
Frequently, the temporal evolution of a certain region of interest such
as the ventricles or the atria is highly relevant for clinical
diagnosis. In this paper, we devise a novel approach that allows for the
automated propagation of an arbitrary region of interest (ROI) along
the cardiac cycle from respective annotated ROIs provided by medical
experts at two different points in time, most frequently at the
end-systolic (ES) and the end-diastolic (ED) cardiac phases. At its
core, a 3D TV-$\boldsymbol{L^{1}}$-based optical flow algorithm
computes the apparent motion of consecutive MRI images in forward and
backward directions. Subsequently, the given terminal annotated masks
are propagated by this bidirectional optical flow in 3D, which results,
however, in improper initial estimates of the segmentation masks due to
numerical inaccuracies. These initially propagated segmentation masks
are then refined by a 3D U-Net-based convolutional neural network (CNN),
which was trained to enforce consistency with the forward and backward
warped masks using a novel loss function. Moreover, a penalization term
in the loss function controls large deviations from the initial
segmentation masks. This method is benchmarked both on a new dataset
with annotated single ventricles containing patients with severe heart
diseases and on a publicly available dataset with different annotated
ROIs. We emphasize that our novel loss function enables fine-tuning the
CNN on a single patient, thereby yielding state-of-the-art results along
the complete cardiac cycle.
AU Qiao, Mengyun
Wang, Shuo
Qiu, Huaqi
de Marvao, Antonio
O'Regan, Declan P.
Rueckert, Daniel
Bai, Wenjia
CHeart: A Conditional Spatio-Temporal Generative Model for Cardiac
Anatomy
Two key questions in cardiac image analysis are to assess the anatomy
and motion of the heart from images; and to understand how they are
associated with non-imaging clinical factors such as gender, age and
diseases. While the first question can often be addressed by image
segmentation and motion tracking algorithms, our capability to model and
answer the second question is still limited. In this work, we propose a
novel conditional generative model to describe the 4D spatio-temporal
anatomy of the heart and its interaction with non-imaging clinical
factors. The clinical factors are integrated as the conditions of the
generative modelling, which allows us to investigate how these factors
influence the cardiac anatomy. We evaluate the model performance mainly
on two tasks: anatomical sequence completion and sequence
generation. The model achieves high performance in anatomical sequence
completion, comparable to or outperforming other state-of-the-art
generative models. In terms of sequence generation, given clinical
conditions, the model can generate realistic synthetic 4D sequential
anatomies that share similar distributions with the real data.
AU Gao, Qi
Li, Zilong
Zhang, Junping
Zhang, Yi
Shan, Hongming
CoreDiff: Contextual Error-Modulated Generalized Diffusion Model for
Low-Dose CT Denoising and Generalization
Low-dose computed tomography (CT) images suffer from noise and artifacts
due to photon starvation and electronic noise. Recently, some works have
attempted to use diffusion models to address the over-smoothness and
training instability encountered by previous deep-learning-based
denoising models. However, diffusion models suffer from long inference
time due to the large number of sampling steps involved. Very recently,
the cold diffusion model generalized classical diffusion models, offering
greater flexibility. Inspired by cold diffusion, this paper presents a
novel COntextual eRror-modulated gEneralized Diffusion model for
low-dose CT (LDCT) denoising, termed CoreDiff. First, CoreDiff utilizes
LDCT images to replace the random Gaussian noise and employs a novel
mean-preserving degradation operator to mimic the physical process of CT
degradation, significantly reducing sampling steps thanks to the
informative LDCT images as the starting point of the sampling process.
Second, to alleviate the error accumulation problem caused by the
imperfect restoration operator in the sampling process, we propose a
novel ContextuaL Error-modulAted Restoration Network (CLEAR-Net), which
can leverage contextual information to constrain the sampling process
from structural distortion and modulate time step embedding features for
better alignment with the input at the next time step. Third, to rapidly
generalize the trained model to a new, unseen dose level with as few
resources as possible, we devise a one-shot learning framework to make
CoreDiff generalize faster and better using only one single LDCT image
(un)paired with normal-dose CT (NDCT). Extensive experimental results on
four datasets demonstrate that our CoreDiff outperforms competing
methods in denoising and generalization performance, with clinically
acceptable inference time. Source code is made available at
https://github.com/qgao21/CoreDiff.
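The idea of starting the sampling process from the LDCT image rather than from Gaussian noise can be sketched with a cold-diffusion-style loop. The linear blend used as the degradation operator is an assumption chosen because it keeps the mean when both images share it; it is not claimed to be the paper's operator. The `restore` callable is a hypothetical stand-in for a trained restoration network such as CLEAR-Net, and all names and values are illustrative.

```python
def degrade(x0, x_ld, t, T):
    """Linear blend (assumed form): t=0 returns x0, t=T returns the low-dose image."""
    a = t / T
    return [(1 - a) * c + a * l for c, l in zip(x0, x_ld)]


def sample(x_ld, restore, T):
    """Cold-diffusion-style sampling that starts from the low-dose image."""
    x_t = list(x_ld)
    for t in range(T, 0, -1):
        x0_hat = restore(x_t, t)                   # current estimate of the clean image
        d_t = degrade(x0_hat, x_ld, t, T)          # re-degrade the estimate to step t
        d_prev = degrade(x0_hat, x_ld, t - 1, T)   # and to step t-1
        x_t = [x - a + b for x, a, b in zip(x_t, d_t, d_prev)]
    return x_t


# Toy 1D "images" with equal means (illustrative values only).
clean = [0.0, 1.0, 2.0]
low_dose = [0.5, 0.5, 2.0]
oracle = lambda x_t, t: clean   # hypothetical perfect restoration operator
halfway = degrade(clean, low_dose, 2, 4)
recovered = sample(low_dose, oracle, T=4)
```

With a perfect restoration operator the loop recovers the clean image in any number of steps; with an imperfect one, errors in `x0_hat` carry over from step to step, which is the accumulation problem the abstract describes.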
AU Zhang, Ruipeng
Qin, Binjie
Zhao, Jun
Zhu, Yueqi
Lv, Yisong
Ding, Song
Locating X-Ray Coronary Angiogram Keyframes via Long Short-Term
Spatiotemporal Attention With Image-to-Patch Contrastive Learning
Locating the start, apex and end keyframes of moving contrast agents for
keyframe counting in X-ray coronary angiography (XCA) is very important
for the diagnosis and treatment of cardiovascular diseases. To locate
these keyframes from the class-imbalanced and boundary-agnostic
foreground vessel actions that overlap complex backgrounds, we propose
long short-term spatiotemporal attention by integrating a convolutional
long short-term memory (CLSTM) network into a multiscale Transformer to
learn the segment- and sequence-level dependencies in the
consecutive-frame-based deep features. Image-to-patch contrastive
learning is further embedded between the CLSTM-based long-term
spatiotemporal attention and Transformer-based short-term attention
modules. The imagewise contrastive module reuses the long-term attention
to contrast image-level foreground/background of XCA sequence, while
patchwise contrastive projection selects the random patches of
backgrounds as convolution kernels to project foreground/background
frames into different latent spaces. A new XCA video dataset is
collected to evaluate the proposed method. The experimental results show
that the proposed method achieves an mAP (mean average precision) of
72.45% and an F-score of 0.8296, considerably outperforming the
state-of-the-art methods. The source code is available at
https://github.com/Binjie-Qin/STA-IPCon.
AU Wu, Junde
Zhang, Yu
Fang, Huihui
Duan, Lixin
Tan, Mingkui
Yang, Weihua
Wang, Chunhui
Liu, Huiying
Jin, Yueming
Xu, Yanwu
Calibrate the Inter-Observer Segmentation Uncertainty via
Diagnosis-First Principle
Many of the tissues/lesions in the medical images may be ambiguous.
Therefore, medical segmentation is typically annotated by a group of
clinical experts to mitigate personal bias. A common solution to fuse
different annotations is the majority vote, e.g., taking the average of
multiple labels. However, such a strategy ignores differences in
grader expertise. Inspired by the observation that medical image
segmentation is usually used to assist the disease diagnosis in clinical
practice, we propose the diagnosis-first principle, which is to take
disease diagnosis as the criterion to calibrate the inter-observer
segmentation uncertainty. Following this idea, a framework named
Diagnosis-First segmentation Framework (DiFF) is proposed. Specifically,
DiFF first learns to fuse the multi-rater segmentation labels into a
single ground truth that could maximize the disease diagnosis
performance. We call the fused ground truth the Diagnosis-First
Ground-truth (DF-GT). Then, we propose the Take and Give Model (T&G
Model) to segment DF-GT from the raw image. With the T&G Model, DiFF
can learn the segmentation with a calibrated uncertainty that
facilitates the disease diagnosis. We verify the effectiveness of DiFF on
three different medical segmentation tasks: optic-disc/optic-cup (OD/OC)
segmentation on fundus images, thyroid nodule segmentation on ultrasound
images, and skin lesion segmentation on dermoscopic images. Experimental
results show that the proposed DiFF can effectively calibrate the
segmentation uncertainty and thus significantly facilitate the
corresponding disease diagnosis, outperforming previous
state-of-the-art multi-rater learning methods.
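The baseline fusion that the abstract contrasts with DiFF, averaging the labels of several raters, can be sketched as follows. The 0.5 threshold, the function name, and the toy masks are assumptions for illustration; the sketch shows only the majority-vote baseline, not the diagnosis-first fusion itself.

```python
def fuse_by_average(masks, threshold=0.5):
    """Average binary masks from several raters, then threshold the result."""
    n = len(masks)
    height, width = len(masks[0]), len(masks[0][0])
    soft = [[sum(m[r][c] for m in masks) / n for c in range(width)]
            for r in range(height)]
    hard = [[1 if value > threshold else 0 for value in row] for row in soft]
    return soft, hard


# Three hypothetical raters annotating the same 2x2 image.
raters = [
    [[1, 1], [0, 0]],
    [[1, 0], [0, 0]],
    [[1, 1], [1, 0]],
]
soft_label, majority_label = fuse_by_average(raters)
```

Every rater carries the same weight here, which is exactly the limitation the abstract points out: differences in grader expertise are ignored.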
Technol, Hefei 230037, Peoples R China
C1 Pazhou Lab, Guangzhou 510005, Peoples R China
C1 Univ Elect Sci & Technol China, Sch Comp Sci & Technol, Chengdu 611731,
Sichuan, Peoples R China
C1 South China Univ Technol, Sch Software Engn, Guangzhou 518055,
Guangdong, Peoples R China
C1 Jinan Univ, Shenzhen Eye Hosp, Big Data & Artificial Intelligence Inst,
Shenzhen 518040, Peoples R China
C1 Harbin Inst Technol, Dept Elect Sci & Technol, Harbin 150001, Peoples R
China
C1 ASTAR, Inst Infocomm Res, Singapore 138632, Singapore
SN 0278-0062
EI 1558-254X
DA 2024-09-18
UT WOS:001307429600009
PM 38669168
ER
AU Xiao, Chunlun
Zhu, Anqi
Xia, Chunmei
Qiu, Zifeng
Liu, Yuanlin
Zhao, Cheng
Ren, Weiwei
Wang, Lifan
Dong, Lei
Wang, Tianfu
Guo, Lehang
Lei, Baiying
Attention-Guided Learning with Feature Reconstruction for Skin Lesion
Diagnosis using Clinical and Ultrasound Images.
Skin lesions are among the most common diseases, and most categories are
highly similar in morphology and appearance. Deep learning models
effectively reduce the variability between classes and within classes,
and improve diagnostic accuracy. However, the existing multi-modal
methods are limited to the surface information of lesions in skin
clinical and dermatoscopic modalities, which hinders the further
improvement of skin lesion diagnostic accuracy. This requires us to
further study the depth information of lesions in skin ultrasound. In
this paper, we propose a novel skin lesion diagnosis network, which
combines clinical and ultrasound modalities to fuse the surface and
depth information of the lesion to improve diagnostic accuracy.
Specifically, we propose an attention-guided learning (AL) module that
fuses clinical and ultrasound modalities from both local and global
perspectives to enhance feature representation. The AL module consists
of two parts: attention-guided local learning (ALL), which computes the
intra-modality and inter-modality correlations to fuse multi-scale
information and makes the network focus on the local information of
each modality, and attention-guided global learning (AGL), which fuses
global information to further enhance the feature representation. In
addition,
we propose a feature reconstruction learning (FRL) strategy which
encourages the network to extract more discriminative features and
corrects the focus of the network to enhance the model's robustness and
certainty. We conduct extensive experiments and the results confirm the
superiority of our proposed method. Our code is available at:
https://github.com/XCL-hub/AGFnet.
EI 1558-254X
DA 2024-09-04
UT MEDLINE:39208042
PM 39208042
ER
AU Chaudhary, Muhammad F. A.
Gerard, Sarah E.
Christensen, Gary E.
Cooper, Christopher B.
Schroeder, Joyce D.
Hoffman, Eric A.
Reinhardt, Joseph M.
LungViT: Ensembling Cascade of Texture Sensitive Hierarchical Vision
Transformers for Cross-Volume Chest CT Image-to-Image Translation
Chest computed tomography (CT) at inspiration is often complemented by
an expiratory CT to identify peripheral airways disease. Additionally,
co-registered inspiratory-expiratory volumes can be used to derive
various markers of lung function. Expiratory CT scans, however, may not
be acquired due to dose or scan time considerations or may be inadequate
due to motion or insufficient exhalation, leading to a missed opportunity
to evaluate underlying small airways disease. Here, we propose LungViT, a
generative adversarial learning approach using hierarchical vision
transformers for translating inspiratory CT intensities to corresponding
expiratory CT intensities. LungViT addresses several limitations of the
traditional generative models including slicewise discontinuities,
limited size of generated volumes, and their inability to model texture
transfer at volumetric level. We propose a shifted-window hierarchical
vision transformer architecture with squeeze-and-excitation decoder
blocks for modeling dependencies between features. We also propose a
multiview texture similarity distance metric for texture and style
transfer in 3D. To incorporate global information into the training
process and refine the output of our model, we use ensemble cascading.
LungViT is able to generate large 3D volumes of size
$320\times 320\times 320$. We train and validate our model using a diverse cohort
of 1500 subjects with varying disease severity. To assess model
generalizability beyond the development set biases, we evaluate our
model on an out-of-distribution external validation set of 200 subjects.
Clinical validation on internal and external testing sets shows that
synthetic volumes could be reliably adopted for deriving clinical
endpoints of chronic obstructive pulmonary disease.
AU Luo, Yan
Tian, Yu
Shi, Min
Pasquale, Louis R.
Shen, Lucy Q.
Zebardast, Nazlee
Elze, Tobias
Wang, Mengyu
Harvard Glaucoma Fairness: A Retinal Nerve Disease Dataset for Fairness
Learning and Fair Identity Normalization
Fairness (used interchangeably with equity) in machine learning is
important for societal well-being, but limited public datasets hinder
its progress. Currently, no dedicated public medical datasets with
imaging data for fairness learning are available, though
underrepresented groups suffer from more health issues. To address this
gap, we introduce Harvard Glaucoma Fairness (Harvard-GF), a retinal
nerve disease dataset including 3,300 subjects with both 2D and 3D
imaging data and balanced racial groups for glaucoma detection. Glaucoma
is the leading cause of irreversible blindness globally, with Black
individuals having twice the glaucoma prevalence of other races. We also
propose a fair identity normalization (FIN) approach to equalize the
feature importance between different identity groups. Compared with
various state-of-the-art fairness learning methods, our FIN approach
achieves superior performance in the racial, gender, and ethnicity
fairness tasks with 2D and 3D imaging data, demonstrating the utility of
our dataset Harvard-GF for fairness learning. To facilitate fairness
comparisons
between different models, we propose an equity-scaled performance
measure, which can be flexibly used to compare all kinds of performance
metrics in the context of fairness. The dataset and code are publicly
accessible via https://ophai.hms.harvard.edu/datasets/harvard-gf3300/.
AU Zhao, Zihao
Wang, Sheng
Gu, Jinchen
Zhu, Yitao
Mei, Lanzhuju
Zhuang, Zixu
Cui, Zhiming
Wang, Qian
Shen, Dinggang
ChatCAD+: Towards a Universal and Reliable Interactive CAD using LLMs.
The integration of Computer-Aided Diagnosis (CAD) with Large Language
Models (LLMs) presents a promising frontier in clinical applications,
notably in automating diagnostic processes akin to those performed by
radiologists and providing consultations similar to a virtual family
doctor. Despite the promising potential of this integration, current
works face at least two limitations: (1) From the perspective of a
radiologist, existing studies typically have a restricted scope of
applicable imaging domains, failing to meet the diagnostic needs of
different patients. Also, the insufficient diagnostic capability of LLMs
further undermines the quality and reliability of the generated medical
reports. (2) Current LLMs lack the requisite depth in medical expertise,
rendering them less effective as virtual family doctors due to the
potential unreliability of the advice provided during patient
consultations. To address these limitations, we introduce ChatCAD+,
designed to be universal and reliable. Specifically, it features two main
modules: (1) Reliable Report Generation and (2) Reliable Interaction.
The Reliable Report Generation module is capable of interpreting medical
images from diverse domains and generating high-quality medical reports
via our proposed hierarchical in-context learning. Concurrently, the
interaction module leverages up-to-date information from reputable
medical websites to provide reliable medical advice. Together, these
designed modules synergize to closely align with the expertise of human
medical professionals, offering enhanced consistency and reliability for
interpretation and advice. The source code is available at GitHub.
AU He, Along
Li, Tao
Yan, Juncheng
Wang, Kai
Fu, Huazhu
Bilateral Supervision Network for Semi-Supervised Medical Image
Segmentation
Massive high-quality annotated data is required by fully-supervised
learning, which is difficult to obtain for image segmentation since the
pixel-level annotation is expensive, especially for medical image
segmentation tasks that need domain knowledge. As an alternative
solution, semi-supervised learning (SSL) can effectively alleviate the
dependence on the annotated samples by leveraging abundant unlabeled
samples. Among the SSL methods, mean-teacher (MT) is the most popular
one. However, in MT, the teacher model's weights are completely determined
by the student model's weights, which leads to a training bottleneck
at the late training stages. Besides, only pixel-wise consistency is
applied to unlabeled data, which ignores the category information and
is susceptible to noise. In this paper, we propose a bilateral
supervision network with a bilateral exponential moving average
(bilateral-EMA), named BSNet, to overcome these issues. On the one hand,
both the student and teacher models are trained on labeled data, and
then their weights are updated with the bilateral-EMA, so that the two
models can learn from each other. On the other hand, pseudo labels are
used to perform bilateral supervision for unlabeled data. Moreover, to
enhance the supervision, we adopt adversarial learning to force the
network to generate more reliable pseudo labels for unlabeled data. We
conduct extensive experiments on three datasets to evaluate the proposed
BSNet, and results show that BSNet can improve the semi-supervised
segmentation performance by a large margin and surpass other
state-of-the-art SSL methods.
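The weight coupling described above can be sketched with plain lists standing in for model parameters. `ema_update` is the standard mean-teacher rule, in which the teacher tracks an exponential moving average of the student. `bilateral_ema` is a hypothetical symmetric variant written only to illustrate the idea that both models learn from each other; the abstract does not state the exact bilateral-EMA rule, so this form and the decay value are assumptions.

```python
def ema_update(target, source, decay=0.99):
    """Move `target` weights toward `source` weights by an exponential moving average."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(target, source)]


def bilateral_ema(student, teacher, decay=0.99):
    """Hypothetical symmetric update: each model averages in the other's weights."""
    new_teacher = ema_update(teacher, student, decay)
    new_student = ema_update(student, teacher, decay)
    return new_student, new_teacher


# Toy two-parameter "models" (illustrative values only).
student_weights = [1.0, 0.0]
teacher_weights = [0.0, 1.0]
one_way_teacher = ema_update(teacher_weights, student_weights, decay=0.75)
new_student, new_teacher = bilateral_ema(student_weights, teacher_weights, decay=0.75)
```

In the one-way rule the teacher is a pure function of the student's history, which is the dependence the abstract identifies as a bottleneck; the two-way form lets information flow in both directions.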
AU Li, Yinsheng
Feng, Juan
Xiang, Jun
Li, Zixiao
Liang, Dong
AIRPORT: A Data Consistency Constrained Deep Temporal Extrapolation
Method To Improve Temporal Resolution In Contrast Enhanced CT Imaging
Typical tomographic image reconstruction methods require that the imaged
object be static and stationary during the time window needed to acquire a
minimally complete data set. The violation of this requirement leads to
temporal-averaging errors in the reconstructed images. For a fixed
gantry rotation speed, to reduce the errors, it is desirable to
reconstruct images using data acquired over a narrower angular range,
i.e., with a higher temporal resolution. However, image reconstruction
with a narrower angular range violates the data sufficiency condition,
resulting in severe data-insufficiency-induced errors. The purpose of
this work is to decouple the trade-off between these two types of errors
in contrast-enhanced computed tomography (CT) imaging. We demonstrated
that, using the developed data consistency constrained deep temporal
extrapolation method (AIRPORT), the entire time-varying imaged object
can be accurately reconstructed with a temporal resolution of 40 frames
per second, which corresponds to the time window needed to acquire a
single projection view using a typical C-arm cone-beam CT system.
AIRPORT is applicable to
general non-sparse imaging tasks using a single short-scan data
acquisition.
AU Elbatel, Marawan
Marti, Robert
Li, Xiaomeng
FoPro-KD: Fourier Prompted Effective Knowledge Distillation for
Long-Tailed Medical Image Recognition
Representational transfer from publicly available models is a promising
technique for improving medical image classification, especially in
long-tailed datasets with rare diseases. However, existing methods often
overlook the frequency-dependent behavior of these models, thereby
limiting their effectiveness in transferring representations and
generalizations to rare diseases. In this paper, we propose FoPro-KD, a
novel framework that leverages the power of frequency patterns learned
from frozen pre-trained models to enhance their transferability and
compression, presenting a few unique insights: 1) We demonstrate that
leveraging representations from publicly available pre-trained models
can substantially improve performance, specifically for rare classes,
even when utilizing representations from a smaller pre-trained model. 2)
We observe that pre-trained models exhibit frequency preferences, which
we explore using our proposed Fourier Prompt Generator (FPG), allowing
us to manipulate specific frequencies in the input image, enhancing the
discriminative representational transfer. 3) By amplifying or
diminishing these frequencies in the input image, we enable Effective
Knowledge Distillation (EKD). EKD facilitates the transfer of knowledge
from pre-trained models to smaller models. Through extensive experiments
in long-tailed gastrointestinal image recognition and skin lesion
classification, where rare diseases are prevalent, our FoPro-KD
framework outperforms existing methods, enabling more accessible medical
models for rare disease classification.
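The idea of amplifying or diminishing specific frequencies in an input image can be illustrated with a fixed radial band and a constant gain. This is only a minimal sketch under stated assumptions: the abstract's Fourier Prompt Generator is learned, whereas the function `scale_frequency_band` and its parameters `r_lo`, `r_hi`, and `gain` are hypothetical names introduced here for illustration.

```python
import numpy as np

def scale_frequency_band(image, r_lo, r_hi, gain):
    """Scale Fourier coefficients with radial frequency in [r_lo, r_hi) by `gain`.

    Hypothetical helper: a fixed band and gain stand in for a learned prompt.
    """
    h, w = image.shape
    # Move the zero-frequency (DC) term to the centre of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.mgrid[0:h, 0:w]
    # Distance of every coefficient from the DC term, in frequency bins.
    radius = np.hypot(yy - h // 2, xx - w // 2)
    band = (radius >= r_lo) & (radius < r_hi)
    spectrum = np.where(band, spectrum * gain, spectrum)
    # The band is symmetric about DC, so the inverse transform is real
    # up to floating-point error.
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))
```

A gain above 1 amplifies the chosen band and a gain below 1 diminishes it; with `r_lo > 0` the DC term is untouched, so the mean intensity of the image is preserved.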
AU Fu, Li-Wei
Liu, Chih-Hao
Jain, Manu
Chen, Chih-Shan Jason
Wu, Yu-Hung
Huang, Sheng-Lung
Chen, Homer H.
Training With Uncertain Annotations for Semantic Segmentation of Basal
Cell Carcinoma From Full-Field OCT Images
Semantic segmentation of basal cell carcinoma (BCC) from full-field
optical coherence tomography (FF-OCT) images of human skin has received
considerable attention in medical imaging. However, it is challenging
for dermatopathologists to annotate the training data due to OCT's lack
of color specificity. Very often, they are uncertain about the
correctness of the annotations they made. In practice, annotations
fraught with uncertainty profoundly impact the effectiveness of model
training and hence the performance of BCC segmentation. To address this
issue, we propose an approach to model training with uncertain
annotations. The proposed approach includes a data selection strategy to
mitigate the uncertainty of training data, a class expansion to consider
sebaceous gland and hair follicle as additional classes to enhance the
performance of BCC segmentation, and a self-supervised pre-training
procedure to improve the initial weights of the segmentation model
parameters. Furthermore, we develop three post-processing techniques to
reduce the impact of speckle noise and image discontinuities on BCC
segmentation. The mean Dice score of BCC of our model reaches 0.503 +/-
0.003, which, to the best of our knowledge, is the best performance to
date for semantic segmentation of BCC from FF-OCT images.
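For reference, the Dice score reported above is the standard overlap measure 2|A∩B| / (|A| + |B|) between a predicted mask A and a reference mask B. A minimal sketch for binary masks follows; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made here, not details taken from the paper.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * intersection / denom)
```

A mean Dice of about 0.5 therefore means that, on average, the overlap is roughly half of the combined size of prediction and annotation, which is consistent with the difficulty the abstract attributes to uncertain annotations.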
AU Zhang, Jianjia
Mao, Haiyang
Wang, Xinran
Guo, Yuan
Wu, Weiwen
Wavelet-Inspired Multi-channel Score-based Model for Limited-angle CT
Reconstruction.
Score-based generative models (SGMs) have demonstrated great potential in
the challenging limited-angle CT (LA-CT) reconstruction. An SGM essentially
models the probability density of the ground truth data and generates
reconstruction results by sampling from it. Nevertheless, direct
application of the existing SGM methods to LA-CT suffers from multiple
limitations. Firstly, the directional distribution of the artifacts
attributable to the missing angles is ignored. Secondly, the different
distribution properties of the artifacts in different frequency
components have not been fully explored. These drawbacks
inevitably degrade the estimation of the probability density and the
reconstruction results. After an in-depth analysis of these factors,
this paper proposes a Wavelet-Inspired Score-based Model (WISM) for
LA-CT reconstruction. Specifically, besides training a typical SGM with
the original images, the proposed method additionally performs the
wavelet transform and models the probability density in each wavelet
component with an extra SGM. The wavelet components preserve the spatial
correspondence with the original image while performing frequency
decomposition, thereby keeping the directional property of the artifacts
for further analysis. On the other hand, different wavelet components
possess more specific contents of the original image in different
frequency ranges, simplifying the probability density modeling by
decomposing the overall density into component-wise ones. The two
resulting SGMs, in the image domain and the wavelet domain, are
integrated into a unified sampling process under the guidance of the
observation data, jointly generating high-quality and consistent LA-CT
reconstructions. The experimental evaluation on various datasets
consistently verifies the superior performance of the proposed method
over the competing methods.
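The claim that wavelet components keep spatial correspondence and directional information can be seen in a single-level 2D Haar transform. The sketch below is an illustration only: the abstract does not state which wavelet WISM uses, so the Haar choice, the function names `haar_dwt2` and `haar_idwt2`, and the requirement of even image dimensions are all assumptions.

```python
import numpy as np

def haar_dwt2(image):
    """Single-level orthonormal 2D Haar transform (even height and width assumed)."""
    a = image[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = image[0::2, 1::2]  # top-right
    c = image[1::2, 0::2]  # bottom-left
    d = image[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0   # low-pass in both directions
    d_vert = (a + b - c - d) / 2.0   # change between the two rows of a block
    d_horiz = (a - b + c - d) / 2.0  # change between the two columns of a block
    d_diag = (a - b - c + d) / 2.0   # diagonal detail
    return approx, d_vert, d_horiz, d_diag

def haar_idwt2(approx, d_vert, d_horiz, d_diag):
    """Exact inverse of haar_dwt2."""
    h, w = approx.shape
    image = np.empty((2 * h, 2 * w), dtype=float)
    image[0::2, 0::2] = (approx + d_vert + d_horiz + d_diag) / 2.0
    image[0::2, 1::2] = (approx + d_vert - d_horiz - d_diag) / 2.0
    image[1::2, 0::2] = (approx - d_vert + d_horiz - d_diag) / 2.0
    image[1::2, 1::2] = (approx - d_vert - d_horiz + d_diag) / 2.0
    return image
```

Each coefficient at position (i, j) depends only on the 2x2 block at the same location in the image, which is the spatial correspondence the abstract refers to, and the three detail bands respond to differently oriented structure, so directional streaks end up concentrated in particular bands.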
AU Zhang, Xiao
Sun, Kaicong
Wu, Dijia
Xiong, Xiaosong
Liu, Jiameng
Yao, Linlin
Li, Shufang
Wang, Yining
Feng, Jun
Shen, Dinggang
An Anatomy- and Topology-Preserving Framework for Coronary Artery
Segmentation
Coronary artery segmentation is critical for coronary artery disease
diagnosis but challenging due to its tortuous course with numerous small
branches and inter-subject variations. Most existing studies ignore
important anatomical information and vascular topologies, leading to
less desirable segmentation performance that usually cannot satisfy
clinical demands. To deal with these challenges, in this paper we
propose an anatomy- and topology-preserving two-stage framework for
coronary artery segmentation. The proposed framework consists of an
anatomical dependency encoding (ADE) module and a hierarchical topology
learning (HTL) module for coarse and fine segmentation, respectively.
Specifically, the ADE module segments the four heart chambers and the
aorta, and thus five distance field maps are obtained to encode the
distances between the chamber surfaces and the coarsely segmented
coronary artery. Meanwhile, ADE also performs coronary artery detection
to crop the region of interest and eliminate foreground-background
imbalance. The follow-up HTL module performs fine segmentation by
exploiting three hierarchical vascular topologies, i.e., key points,
centerlines, and neighbor connectivity,
using a multi-task learning scheme. In addition, we adopt a bottom-up
attention interaction (BAI) module to integrate the feature
representations extracted across hierarchical topologies. Extensive
experiments on public and in-house datasets show that the proposed
framework achieves state-of-the-art performance for coronary artery
segmentation.
AU Tang, Yuqi
Wang, Nanchao
Dong, Zhijie
Lowerison, Matthew
Del Aguila, Angela
Johnston, Natalie
Vu, Tri
Ma, Chenshuo
Xu, Yirui
Yang, Wei
Song, Pengfei
Yao, Junjie
Non-invasive Deep-Brain Imaging with 3D Integrated Photoacoustic
Tomography and Ultrasound Localization Microscopy (3D-PAULM).
Photoacoustic computed tomography (PACT) is a proven technology for
imaging hemodynamics in the deep brain of small animal models. PACT is
inherently compatible with ultrasound (US) imaging, providing
complementary contrast mechanisms. While PACT can quantify the brain's
oxygen saturation of hemoglobin (sO2), US imaging can probe the blood
flow based on the Doppler effect. Further, by tracking gas-filled
microbubbles, ultrasound localization microscopy (ULM) can map the blood
flow velocity with sub-diffraction spatial resolution. In this work, we
present a 3D deep-brain imaging system that seamlessly integrates PACT
and ULM into a single device, 3D-PAULM. Using a low ultrasound frequency
of 4 MHz, 3D-PAULM is capable of imaging brain hemodynamic functions
through the intact scalp and skull in a totally non-invasive manner.
Using 3D-PAULM, we studied brain function in mice with ischemic stroke.
Multi-spectral PACT, US B-mode imaging, microbubble-enhanced power
Doppler (PD), and ULM were performed on the same mouse brain with
intrinsic image co-registration. From the multi-modality measurements,
we further quantified blood perfusion, sO2, vessel density, and flow
velocity of the mouse brain, showing stroke-induced ischemia, hypoxia,
and reduced blood flow. We expect that 3D-PAULM can find broad
applications in studying deep brain functions on small animal models.
EI 1558-254X
DA 2024-10-11
UT MEDLINE:39383084
PM 39383084
ER
AU Park, Jinil
Shin, Taehoon
Park, Jang-Yeon
Three-Dimensional Variable Slab-Selective Projection Acquisition
Imaging.
Three-dimensional (3D) projection acquisition (PA) imaging has recently
gained attention because of its advantages, such as achievability of
very short echo time, less sensitivity to motion, and undersampled
acquisition of projections without sacrificing spatial resolution.
However, larger subjects impose a stricter Nyquist criterion and are
more likely to be affected by outer-volume signals from outside the
field of view (FOV), which significantly degrade the image quality.
Here, we propose a variable slab-selective projection acquisition (VSS-PA)
method to mitigate the Nyquist criterion and effectively suppress
aliasing streak artifacts in 3D PA imaging. The proposed method involves
maintaining the vertical orientation of the slab-selective gradient for
frequency-selective spin excitation and the readout gradient for data
acquisition. As VSS-PA can selectively excite spins only in the width of
the desired FOV in the projection direction during data acquisition, the
effective size of the scanned object that determines the Nyquist
criterion can be reduced. Additionally, unwanted signals originating
from outside the FOV (e.g., aliasing streak artifacts) can be
effectively avoided. The mitigation of the Nyquist criterion owing to
VSS-PA was theoretically described and confirmed through numerical
simulations and phantom and human lung experiments. These experiments
further showed that the aliasing streak artifacts were nearly
suppressed.
EI 1558-254X
DA 2024-10-02
UT MEDLINE:39348262
PM 39348262
ER
AU Purma, Vishnuvardhan
Srinath, Suhas
Srirangarajan, Seshan
Kakkar, Aanchal
Prathosh, A P
GenSelfDiff-HIS: Generative Self-Supervision Using Diffusion for
Histopathological Image Segmentation.
Histopathological image segmentation is a laborious and time-intensive
task, often requiring analysis from experienced pathologists for
accurate examinations. To reduce this burden, supervised
machine-learning approaches have been adopted using large-scale
annotated datasets for histopathological image analysis. However, in
several scenarios, the availability of large-scale annotated data is a
bottleneck while training such models. Self-supervised learning (SSL) is
an alternative paradigm that provides some respite by constructing
models utilizing only unannotated data, which is often abundant. The
basic idea of SSL is to train a network to perform one or many pseudo or
pretext tasks on unannotated data and use it subsequently as the basis
for a variety of downstream tasks. The success of SSL
depends critically on the chosen pretext task. While there have been
many efforts in designing pretext tasks for classification problems,
there have not been many attempts at SSL for histopathological image
segmentation. Motivated by this, we propose an SSL approach for
segmenting histopathological images via generative diffusion models. Our
method is based on the observation that diffusion models effectively
solve an image-to-image translation task akin to a segmentation task.
Hence, we propose generative diffusion as the pretext task for
histopathological image segmentation. We also utilize a multi-loss
function-based fine-tuning for the downstream task. We validate our
method using several metrics on two publicly available datasets along
with a newly proposed head and neck (HN) cancer dataset containing
Hematoxylin and Eosin (H&E) stained images along with annotations.
EI 1558-254X
DA 2024-09-04
UT MEDLINE:39222449
PM 39222449
ER
AU Liu, Qiang
Chao, Weian
Wen, Ruyi
Gong, Yubin
Xi, Lei
Optimized Excitation in Microwave-induced Thermoacoustic Imaging for
Artifact Suppression.
Microwave-induced thermoacoustic imaging (M-TAI) allows the
visualization of macroscopic and microscopic structures of bio-tissues.
However, it suffers from severe inherent artifacts that might misguide
the subsequent diagnostics and treatments of diseases. To overcome this
limitation, we propose an optimized excitation strategy. In detail, the
strategy integrates dynamically compound specific absorption rate (SAR)
and co-planar configuration of polarization state, incident wave vector
and imaging plane. Starting from the theoretical analysis, we interpret
the underlying mechanism supporting the superiority of the optimized
excitation strategy to achieve an effect equivalent to homogenizing the
deposited electromagnetic energy in bio-tissues. The subsequent numerical
simulations demonstrate that the strategy better preserves
the conductivity weighting of samples while increasing the Pearson
correlation coefficient. Furthermore, in vitro and in vivo M-TAI
experiments validate the effectiveness and robustness of this optimized
excitation strategy in artifact suppression, allowing the simultaneous
identification of both boundaries and fine internal structures within
bio-tissues. All the results suggest that the optimized excitation
strategy can be expanded to diverse scenarios, inspiring more suitable
strategies that remarkably suppress the inherent artifacts in M-TAI.
AU Sogancioglu, Ecem
van Ginneken, Bram
Behrendt, Finn
Bengs, Marcel
Schlaefer, Alexander
Radu, Miron
Xu, Di
Sheng, Ke
Scalzo, Fabien
Marcus, Eric
Papa, Samuele
Teuwen, Jonas
Scholten, Ernst Th.
Schalekamp, Steven
Hendrix, Nils
Jacobs, Colin
Hendrix, Ward
Sanchez, Clara I.
Murphy, Keelin
Nodule Detection and Generation on Chest X-Rays: NODE21 Challenge
Pulmonary nodules may be an early manifestation of lung cancer, the
leading cause of cancer-related deaths among both men and women.
Numerous studies have established that deep learning methods can yield
high-performance levels in the detection of lung nodules in chest
X-rays. However, the lack of gold-standard public datasets slows down
the progression of the research and prevents benchmarking of methods for
this task. To address this, we organized a public research challenge,
NODE21, aimed at the detection and generation of lung nodules in chest
X-rays. While the detection track assesses state-of-the-art nodule
detection systems, the generation track determines the utility of nodule
generation algorithms to augment training data and hence improve the
performance of the detection systems. This paper summarizes the results
of the NODE21 challenge and performs extensive additional experiments to
examine the impact of the synthetically generated nodule training images
on the detection algorithm performance.
AU Wu, Ruoyou
Li, Cheng
Zou, Juan
Liu, Xinfeng
Zheng, Hairong
Wang, Shanshan
Generalizable Reconstruction for Accelerating MR Imaging via Federated
Learning with Neural Architecture Search.
Heterogeneous data captured by different scanning devices and imaging
protocols can affect the generalization performance of the deep learning
magnetic resonance (MR) reconstruction model. While a centralized
training model is effective in mitigating this problem, it raises
concerns about privacy protection. Federated learning is a distributed
training paradigm that can utilize multi-institutional data for
collaborative training without sharing data. However, existing federated
learning MR image reconstruction methods rely on models designed
manually by experts, which are complex and computationally expensive,
suffering from performance degradation when facing heterogeneous data
distributions. In addition, these methods give inadequate consideration
to fairness issues, namely ensuring that the model's training does not
introduce bias towards any specific dataset's distribution. To this end,
this paper proposes a generalizable federated neural architecture search
framework for accelerating MR imaging (GAutoMRI). Specifically,
automatic neural architecture search is investigated for effective and
efficient neural network representation learning of MR images from
different centers. Furthermore, we design a fairness adjustment approach
that enables the model to learn features fairly from inconsistent
distributions of different devices and centers, and thus helps the
model generalize well to unseen centers. Extensive experiments
show that our proposed GAutoMRI achieves better performance and
generalization ability than seven state-of-the-art federated
learning methods. Moreover, the GAutoMRI model is significantly more
lightweight, making it an efficient choice for MR image reconstruction
tasks. The code will be made available at
https://github.com/ternencewu123/GAutoMRI.
AU Zhou, Chengfeng
Wang, Jun
Xiang, Suncheng
Liu, Feng
Huang, Hefeng
Qian, Dahong
A Simple Normalization Technique Using Window Statistics to Improve the
Out-of-Distribution Generalization on Medical Images
Since data scarcity and data heterogeneity are prevalent in medical
images, well-trained Convolutional Neural Networks (CNNs) using previous
normalization methods may perform poorly when deployed to a new site.
However, a reliable model for real-world clinical applications should
generalize well both on in-distribution (IND) and out-of-distribution
(OOD) data (e.g., the new site data). In this study, we present a novel
normalization technique called window normalization (WIN) to improve the
model generalization on heterogeneous medical images, which offers a
simple yet effective alternative to existing normalization methods.
Specifically, WIN perturbs the normalizing statistics with the local
statistics computed within a window. This feature-level augmentation
technique regularizes the models well and improves their OOD
generalization significantly. Leveraging its advantage, we propose a
novel self-distillation method called WIN-WIN. WIN-WIN can be easily
implemented with two forward passes and a consistency constraint,
serving as a simple extension to existing methods. Extensive
experimental results on 6 tasks and 24 datasets
demonstrate the generality and effectiveness of our methods.
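One way to read "perturbs the normalizing statistics with the local statistics computed within a window" is to normalize each channel with the mean and standard deviation of a randomly placed window instead of the whole feature map. The sketch below follows that reading only; the function name `window_norm`, the `min_ratio` parameter, and the random-window sampling rule are assumptions for illustration and may differ from the authors' implementation.

```python
import numpy as np

def window_norm(x, rng, min_ratio=0.5, eps=1e-5):
    """Normalize a (C, H, W) feature map with statistics from a random window.

    Hypothetical sketch: the window side lengths are drawn uniformly between
    min_ratio * size and the full size, then the window is placed at random.
    """
    c, h, w = x.shape
    win_h = int(rng.integers(max(1, int(h * min_ratio)), h + 1))
    win_w = int(rng.integers(max(1, int(w * min_ratio)), w + 1))
    top = int(rng.integers(0, h - win_h + 1))
    left = int(rng.integers(0, w - win_w + 1))
    window = x[:, top:top + win_h, left:left + win_w]
    # Per-channel statistics of the window replace the global ones.
    mean = window.mean(axis=(1, 2), keepdims=True)
    std = window.std(axis=(1, 2), keepdims=True)
    return (x - mean) / (std + eps)
```

With `min_ratio=1.0` the window covers the whole map and the function reduces to ordinary per-channel instance normalization; smaller windows inject the feature-level perturbation that the abstract describes as an augmentation.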
AU Martinez-Sanchez, Antonio
Lamm, Lorenz
Jasnin, Marion
Phelippeau, Harold
Simulating the cellular context in synthetic datasets for cryo-electron
tomography.
Cryo-electron tomography (cryo-ET) allows visualization of the cellular
context at the macromolecular level. To date, the impossibility of
obtaining a reliable ground truth has limited the application of deep
learning-based image processing algorithms in this field. As a
consequence, there is a growing demand for realistic synthetic datasets
for training deep learning algorithms. In addition, besides assisting
the acquisition and interpretation of experimental data, synthetic
tomograms are used as reference models for cellular organization
analysis from cellular tomograms. Current simulators in cryo-ET focus on
reproducing distortions from image acquisition and tomogram
reconstruction; however, they cannot generate many of the low-order
features present in cellular tomograms. Here we propose several
geometric and organization models to simulate low-order cellular
structures imaged by cryo-ET. Specifically, the models cover clusters of
any known cytosolic or membrane-bound macromolecules, membranes with
different geometries, and different filamentous structures such as
microtubules or actin-like networks. Moreover, we use parametrizable
stochastic models to generate a high diversity of geometries and
organizations to simulate representative and generalized datasets,
including very crowded environments like those observed in native cells.
These models have been implemented in a multiplatform open-source Python
package, including scripts to generate cryo-tomograms with adjustable
sizes and resolutions. In addition, these scripts also provide
distortion-free density maps besides the ground truth in different file
formats for efficient access and advanced visualization. We show that
such a realistic synthetic dataset can be readily used to train
generalizable deep learning algorithms.
AU Li, Zilong
Gao, Qi
Wu, Yaping
Niu, Chuang
Zhang, Junping
Wang, Meiyun
Wang, Ge
Shan, Hongming
Quad-Net: Quad-Domain Network for CT Metal Artifact Reduction
Metal implants and other high-density objects in patients introduce
severe streaking artifacts in CT images, compromising image quality and
diagnostic performance. Although various methods have been developed for CT
metal artifact reduction (MAR) over the past decades, including the latest
dual-domain deep networks, remaining metal artifacts are still
clinically challenging in many cases. Here we extend the
state-of-the-art dual-domain deep network approach into a quad-domain
counterpart so that all the features in the sinogram, image, and their
corresponding Fourier domains are synergized to eliminate metal
artifacts optimally without compromising structural subtleties. Our
proposed quad-domain network for MAR, referred to as Quad-Net, incurs
little additional computational cost since the Fourier transform is
highly efficient, and works across the four receptive fields to learn
both global and local features as well as their relations. Specifically,
we first design a Sinogram-Fourier Restoration Network (SFR-Net) in the
sinogram domain and its Fourier space to faithfully inpaint
metal-corrupted traces. Then, we couple SFR-Net with an Image-Fourier
Refinement Network (IFR-Net) which takes both an image and its Fourier
spectrum to improve a CT image reconstructed from the SFR-Net output
using cross-domain contextual information. Quad-Net is trained on
clinical datasets to minimize a composite loss function. Quad-Net does
not require precise metal masks, which is of great importance in
clinical practice. Our experimental results demonstrate the superiority
of Quad-Net over the state-of-the-art MAR methods quantitatively,
visually, and statistically. The Quad-Net code is publicly available at
https://github.com/longzilicart/Quad-Net.
AU Vray, Guillaume
Tomar, Devavrat
Bozorgtabar, Behzad
Thiran, Jean-Philippe
Distill-SODA: Distilling Self-Supervised Vision Transformer for
Source-Free Open-Set Domain Adaptation in Computational Pathology
Developing computational pathology models is essential for reducing
manual tissue typing from whole slide images, transferring knowledge
from the source domain to an unlabeled, shifted target domain, and
identifying unseen categories. We propose a practical setting by
addressing the above-mentioned challenges in one fell swoop, i.e.,
source-free open-set domain adaptation. Our methodology focuses on
adapting a pre-trained source model to an unlabeled target dataset and
encompasses both closed-set and open-set classes. Beyond addressing the
semantic shift of unknown classes, our framework also deals with a
covariate shift, which manifests as variations in color appearance
between source and target tissue samples. Our method hinges on
distilling knowledge from a self-supervised vision transformer (ViT),
drawing guidance from either robustly pre-trained transformer models or
histopathology datasets, including those from the target domain. In
pursuit of this, we introduce a novel style-based adversarial data
augmentation, serving as hard positives for self-training a ViT,
resulting in highly contextualized embeddings. Following this, we
cluster semantically akin target images, with the source model offering
weak pseudo-labels, albeit with uncertain confidence. To enhance this
process, we present the closed-set affinity score (CSAS), aiming to
correct the confidence levels of these pseudo-labels and to calculate
weighted class prototypes within the contextualized embedding space. Our
approach establishes itself as state-of-the-art across three public
histopathological datasets for colorectal cancer assessment. Notably,
our self-training method seamlessly integrates with open-set detection
methods, resulting in enhanced performance in both closed-set and
open-set recognition tasks.
AU Gui, Shuangchun
Wang, Zhenkun
Chen, Jixiang
Zhou, Xun
Zhang, Chen
Cao, Yi
MT4MTL-KD: A Multi-Teacher Knowledge Distillation Framework for Triplet
Recognition
The recognition of surgical triplets plays a critical role in the
practical application of surgical videos. It involves the sub-tasks of
recognizing instruments, verbs, and targets, while establishing precise
associations between them. Existing methods face two significant
challenges in triplet recognition: 1) the imbalanced class distribution
of surgical triplets may lead to spurious task association learning, and
2) the feature extractors cannot reconcile local and global context
modeling. To overcome these challenges, this paper presents a novel
multi-teacher knowledge distillation framework for multi-task triplet
learning, known as MT4MTL-KD. MT4MTL-KD leverages teacher models trained
on less imbalanced sub-tasks to assist multi-task student learning for
triplet recognition. Moreover, we adopt different categories of
backbones for the teacher and student models, facilitating the
integration of local and global context modeling. To further align the
semantic knowledge between the triplet task and its sub-tasks, we
propose a novel feature attention module (FAM). This module utilizes
attention mechanisms to assign multi-task features to specific
sub-tasks. We evaluate the performance of MT4MTL-KD on both the 5-fold
cross-validation and the CholecTriplet challenge splits of the CholecT45
dataset. The experimental results consistently demonstrate the
superiority of our framework over state-of-the-art methods, achieving
significant improvements of up to 6.4% on the cross-validation split.
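As a generic illustration of multi-teacher distillation (not the MT4MTL-KD loss itself, whose form the abstract does not give), the sketch below sums a temperature-scaled KL term over one teacher per sub-task. The softmax formulation, the temperature value, and the dictionary keys are assumptions; a multi-label setting would more likely use sigmoid outputs.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax along the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0, eps=1e-12):
    """KL(teacher || student) on temperature-softened distributions, scaled by T^2."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = np.sum(p_t * (np.log(p_t + eps) - np.log(p_s + eps)), axis=-1)
    return float(kl.mean() * temperature ** 2)

def multi_teacher_loss(student_heads, teacher_outputs, temperature=2.0):
    """Sum the distillation loss over sub-tasks; both arguments map task name -> logits."""
    return sum(
        distillation_loss(student_heads[task], teacher_outputs[task], temperature)
        for task in teacher_outputs
    )

teachers = {"instrument": [[2.0, 0.0, -1.0]], "verb": [[0.5, 1.5]], "target": [[0.0, 0.0, 3.0]]}
students = {"instrument": [[1.0, 0.5, -1.0]], "verb": [[0.5, 1.5]], "target": [[1.0, 0.0, 2.0]]}
loss = multi_teacher_loss(students, teachers)
```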
AU Ma, Kai
Wen, Xuyun
Zhu, Qi
Zhang, Daoqiang
Ordinal Pattern Tree: A New Representation Method for Brain Network
Analysis
Brain networks, which describe the functional or structural interactions of
the brain using graph theory, have been widely used for brain imaging
analysis. Currently, several network representation methods have been
developed for describing and analyzing brain networks. However, most of
these methods ignore the valuable weighted information of the edges in
brain networks. In this paper, we propose a new representation method
(i.e., ordinal pattern tree) for brain network analysis. Compared with
the existing network representation methods, the proposed ordinal
pattern tree (OPT) can not only leverage the weighted information of the
edges but also express the hierarchical relationships of nodes in brain
networks. In an OPT, nodes are connected by ordinal edges, which are
constructed using the ordinal pattern relationships of weighted
edges. We represent brain networks as OPTs and further develop a new
graph kernel, the optimal transport (OT) based ordinal pattern tree
(OT-OPT) kernel, to measure the similarity between paired brain networks.
In the OT-OPT kernel, OT distances are used to calculate the transport
costs between the nodes of the OPTs. Based on these OT distances, we use
an exponential function to compute the OT-OPT kernel, which is proven to be
positive definite. To evaluate the effectiveness of the proposed method,
we perform classification and regression experiments on the ADHD-200, ABIDE,
and ADNI datasets. The experimental results demonstrate that our
proposed method outperforms the state-of-the-art graph methods in the
classification and regression tasks.
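The step from OT distances to a kernel can be sketched as follows. This assumes a precomputed symmetric matrix of pairwise OT distances and the common form exp(-lambda * d); the exact exponential form and the parameter `lam` are assumptions, since the abstract only says an exponential function is used.

```python
import numpy as np

def exponential_kernel(distances, lam=1.0):
    """Turn a pairwise distance matrix into similarities via exp(-lam * d)."""
    d = np.asarray(distances, dtype=float)
    return np.exp(-lam * d)

# Hypothetical OT distances between three brain networks.
D = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.5],
              [2.0, 1.5, 0.0]])
K = exponential_kernel(D, lam=0.5)
min_eigenvalue = np.linalg.eigvalsh(K).min()  # empirical check for this matrix only
```

An exponential of an arbitrary distance is not guaranteed to be positive definite in general, so inspecting the smallest eigenvalue of `K` is a useful sanity check before passing it to a kernel classifier.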
AU Li, Zihan
Li, Yunxiang
Li, Qingde
Wang, Puyang
Guo, Dazhou
Lu, Le
Jin, Dakai
Zhang, You
Hong, Qingqi
LViT: Language Meets Vision Transformer in Medical Image Segmentation
Deep learning has been widely used in medical image segmentation and
other aspects. However, the performance of existing medical image
segmentation models has been limited by the challenge of obtaining
sufficient high-quality labeled data due to the prohibitive data
annotation cost. To alleviate this limitation, we propose a new
text-augmented medical image segmentation model LViT (Language meets
Vision Transformer). In our LViT model, medical text annotation is
incorporated to compensate for the quality deficiency in image data. In
addition, the text information can guide the generation of pseudo labels of
improved quality in semi-supervised learning. We also propose an
Exponential Pseudo label Iteration mechanism (EPI) to help the
Pixel-Level Attention Module (PLAM) preserve local image features in
the semi-supervised LViT setting. In our model, the LV (Language-Vision) loss is
designed to supervise the training of unlabeled images using text
information directly. For evaluation, we construct three multimodal
medical segmentation datasets (image + text) containing X-rays and CT
images. Experimental results show that our proposed LViT has superior
segmentation performance in both fully-supervised and semi-supervised
settings. The code and datasets are available at
https://github.com/HUANGLIZI/LViT.
AU Zhang, Yinglin
Xi, Ruiling
Zeng, Lingxi
Towey, Dave
Bai, Ruibin
Higashita, Risa
Liu, Jiang
Structural Priors Guided Network for the Corneal Endothelial Cell
Segmentation
The segmentation of blurred cell boundaries in cornea endothelium
microscope images is challenging, which affects the clinical parameter
estimation accuracy. Existing deep learning methods only consider
pixel-wise classification accuracy and do not utilize cell
structure knowledge. Therefore, the segmentation of the blurred cell
boundary is discontinuous. This paper proposes a structural prior guided
network (SPG-Net) for corneal endothelium cell segmentation. We first
employ a hybrid transformer convolution backbone to capture more global
context. Then, we use a Feature Enhancement (FE) module to improve the
representation ability of features and a Local Affinity-based Feature
Fusion (LAFF) module to propagate structural information among
hierarchical features. Finally, we introduce a joint loss based on
cross entropy and the structural similarity index measure (SSIM) to supervise
the training process at the pixel and structure levels. We compare
SPG-Net with various state-of-the-art methods on four corneal
endothelial datasets. The experimental results suggest that SPG-Net
can alleviate the problem of discontinuous cell boundary segmentation
and balance the pixel-wise accuracy and structure preservation. We also
evaluate the agreement of parameter estimation between ground truth and
the prediction of SPG-Net. The statistical analysis results show good
agreement and correlation.
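A joint pixel-level and structure-level loss of the kind described above might look like the sketch below. It is an assumption-laden illustration: it uses binary cross entropy, a single-window (global) SSIM rather than the usual sliding-window version, and a hypothetical balancing `weight`; the paper's exact formulation is not given in the abstract.

```python
import numpy as np

def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean pixel-wise binary cross entropy for probabilities in [0, 1]."""
    p = np.clip(np.asarray(pred, dtype=float), eps, 1.0 - eps)
    t = np.asarray(target, dtype=float)
    return float(-np.mean(t * np.log(p) + (1.0 - t) * np.log(1.0 - p)))

def global_ssim(x, y, data_range=1.0):
    """SSIM computed over the whole image as one window (a simplification)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx ** 2 + my ** 2 + c1) * (x.var() + y.var() + c2)
    return float(numerator / denominator)

def joint_loss(pred, target, weight=1.0):
    """Cross entropy plus weight * (1 - SSIM); weight is a hypothetical hyperparameter."""
    return binary_cross_entropy(pred, target) + weight * (1.0 - global_ssim(pred, target))

boundary = np.array([[0.0, 1.0, 0.0], [1.0, 1.0, 1.0], [0.0, 1.0, 0.0]])
perfect = joint_loss(boundary, boundary)
inverted = joint_loss(1.0 - boundary, boundary)
```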
AU Liu, Zhentao
Fang, Yu
Li, Changjian
Wu, Han
Liu, Yuan
Shen, Dinggang
Cui, Zhiming
Geometry-Aware Attenuation Learning for Sparse-View CBCT Reconstruction.
Cone Beam Computed Tomography (CBCT) plays a vital role in clinical
imaging. Traditional methods typically require hundreds of 2D X-ray
projections to reconstruct a high-quality 3D CBCT image, leading to
considerable radiation exposure. This has led to a growing interest in
sparse-view CBCT reconstruction to reduce radiation doses. While recent
advances, including deep learning and neural rendering algorithms, have
made strides in this area, these methods either produce unsatisfactory
results or suffer from time inefficiency of individual optimization. In
this paper, we introduce a novel geometry-aware encoder-decoder
framework to solve this problem. Our framework starts by encoding
multi-view 2D features from various 2D X-ray projections with a 2D CNN
encoder. Leveraging the geometry of CBCT scanning, it then back-projects
the multi-view 2D features into the 3D space to formulate a
comprehensive volumetric feature map, followed by a 3D CNN decoder to
recover the 3D CBCT image. Importantly, our approach respects the geometric
relationship between the 3D CBCT image and its 2D X-ray projections during
the feature back-projection stage, and benefits from the prior knowledge learned
from the data population. This ensures its adaptability in dealing with
extremely sparse view inputs without individual training, such as
scenarios with only 5 or 10 X-ray projections. Extensive evaluations on
two simulated datasets and one real-world dataset demonstrate
exceptional reconstruction quality and time efficiency of our method.
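The feature back-projection step can be illustrated with a simplified sketch. It assumes each view comes with a hypothetical 3x4 projection matrix mapping homogeneous 3D points to detector pixels, uses nearest-neighbour sampling, and averages across views; the actual sampling and aggregation used by the authors are not stated in the abstract.

```python
import numpy as np

def back_project(features, proj_mats, points):
    """Gather per-view 2D features at the projections of 3D points and average them.

    features: (V, C, H, W), proj_mats: (V, 3, 4), points: (N, 3). Returns (N, C).
    """
    features = np.asarray(features, dtype=float)
    points = np.asarray(points, dtype=float)
    num_views, channels, height, width = features.shape
    homog = np.concatenate([points, np.ones((len(points), 1))], axis=1)  # (N, 4)
    acc = np.zeros((len(points), channels))
    count = np.zeros((len(points), 1))
    for v in range(num_views):
        uvw = homog @ np.asarray(proj_mats[v], dtype=float).T  # (N, 3)
        depth = uvw[:, 2]
        safe = np.where(depth > 0, depth, 1.0)  # avoid division by zero
        col = np.rint(uvw[:, 0] / safe).astype(int)
        row = np.rint(uvw[:, 1] / safe).astype(int)
        valid = (depth > 0) & (col >= 0) & (col < width) & (row >= 0) & (row < height)
        acc[valid] += features[v][:, row[valid], col[valid]].T
        count[valid] += 1
    return acc / np.maximum(count, 1)  # points seen by no view stay zero

feats = np.array([[[[1.0, 2.0], [3.0, 4.0]]]])       # one view, one channel, 2x2
P = np.array([[[1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 0, 1.0]]])  # toy parallel projection
pts = np.array([[0, 0, 5], [1, 1, 0], [1, 0, 2], [5, 5, 0]])
volume_feats = back_project(feats, P, pts)
```

Because the lookup depends only on scanner geometry, the same routine applies to any number of views, which is consistent with the sparse-view setting described above.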
AU Xu, Shicheng
Li, Wei
Li, Zuoyong
Zhao, Tiesong
Zhang, Bob
Facing Differences of Similarity: Intra- and Inter-Correlation
Unsupervised Learning for Chest X-Ray Anomaly Detection.
Anomaly detection can significantly aid doctors in interpreting chest
X-rays. The commonly used strategy involves utilizing a pre-trained
network to extract features from normal data to establish feature
representations. However, when a pre-trained network is applied to more
detailed X-rays, differences of similarity can limit the robustness of
these feature representations. Therefore, we propose an intra- and
inter-correlation learning framework for chest X-ray anomaly detection.
Firstly, to better leverage the similar anatomical structure information
in chest X-rays, we introduce the Anatomical-Feature Pyramid Fusion
Module for feature fusion. This module aims to obtain fusion features
with both local details and global contextual information. These fusion
features are initialized by a trainable feature mapper and stored in a
feature bank to serve as centers for learning. Furthermore, to face the
differences of similarity (FDS) introduced by the pre-trained network,
we propose an intra- and inter-correlation learning strategy: (1) We use
intra-correlation learning to establish intra-correlation between mapped
features of individual images and semantic centers, thereby initially
discovering lesions; (2) We employ inter-correlation learning to
establish inter-correlation between mapped features of different images,
further mitigating the differences of similarity introduced by the
pre-trained network, and achieving effective detection results even in
diverse chest disease environments. Finally, a comparison with 18
state-of-the-art methods on three datasets demonstrates the superiority
and effectiveness of the proposed method across various scenarios.
AU Song, Haofei
Mao, Xintian
Yu, Jing
Li, Qingli
Wang, Yan
I³Net: Inter-Intra-Slice Interpolation Network for Medical
Slice Synthesis
Medical imaging is limited by acquisition time and scanning equipment.
CT and MR volumes, reconstructed with thicker slices, are anisotropic
with high in-plane resolution and low through-plane resolution. We
reveal an intriguing phenomenon that due to the mentioned nature of
data, performing slice-wise interpolation from the axial view can yield
greater benefits than performing super-resolution from other views.
Based on this observation, we propose an Inter-Intra-slice Interpolation
Network (I³Net), which fully explores information from high in-plane
resolution and compensates for low through-plane resolution. The
through-plane branch supplements the limited information contained in
low through-plane resolution from high in-plane resolution and enables
continual and diverse feature learning. The in-plane branch transforms
features to the frequency domain and enforces an equal learning
opportunity for all frequency bands in a global context learning
paradigm. We further propose a cross-view block to take advantage of the
information from all three views online. Extensive experiments on two
public datasets demonstrate the effectiveness of I³Net, which
outperforms state-of-the-art super-resolution, video frame interpolation,
and slice interpolation methods by a large margin. We achieve 43.90 dB in
PSNR, an improvement of at least 1.14 dB, at an upscale factor of ×2 on
the MSD dataset with faster inference. Code is available at
https://github.com/DeepMedLab-ECNU/Medical-Image-Reconstruction.
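For readers less familiar with the PSNR figures quoted above, a minimal sketch of the standard definition follows. The `data_range` default of 1.0 is an assumption about intensity normalisation and is not taken from the paper.

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(data_range^2 / MSE)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))

value = psnr(np.zeros((4, 4)), np.full((4, 4), 0.1))  # MSE = 0.01
```

Since PSNR is logarithmic in the mean squared error, a gain of 1.14 dB corresponds to an MSE that is roughly 23% lower.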
AU Miller, David A.
Grannonico, Marta
Liu, Mingna
Savier, Elise
McHaney, Kara
Erisir, Alev
Netland, Peter A.
Cang, Jianhua
Liu, Xiaorong
Zhang, Hao F.
Visible-Light Optical Coherence Tomography Fibergraphy of the Tree Shrew
Retinal Ganglion Cell Axon Bundles
We seek to develop techniques for high-resolution imaging of the tree
shrew retina for visualizing and parameterizing retinal ganglion cell
(RGC) axon bundles in vivo. We applied visible-light optical coherence
tomography fibergraphy (vis-OCTF) and temporal speckle averaging (TSA)
to visualize individual RGC axon bundles in the tree shrew retina. For
the first time, we quantified individual RGC bundle width, height, and
cross-sectional area and applied vis-OCT angiography (vis-OCTA) to
visualize the retinal microvasculature in tree shrews. Throughout the
retina, as the distance from the optic nerve head (ONH) increased from
0.5 mm to 2.5 mm, bundle width increased by 30%, height decreased by
67%, and cross-sectional area decreased by 36%. We also showed that axon
bundles become vertically elongated as they converge toward the ONH. Ex
vivo confocal microscopy of retinal flat-mounts immunostained with Tuj1
confirmed our in vivo vis-OCTF findings.
AU Zhang, Yuhan
Ma, Xiao
Huang, Kun
Li, Mingchao
Heng, Pheng-Ann
Semantic-Oriented Visual Prompt Learning for Diabetic Retinopathy
Grading on Fundus Images
Diabetic retinopathy (DR) is a serious ocular condition that requires
effective monitoring and treatment by ophthalmologists. However,
constructing a reliable DR grading model remains a challenging and
costly task, heavily reliant on high-quality training sets and adequate
hardware resources. In this paper, we investigate the knowledge
transferability of large-scale pre-trained models (LPMs) to fundus
images based on prompt learning to construct a DR grading model
efficiently. Unlike full-tuning, which fine-tunes all parameters of LPMs,
prompt learning involves only a small number of additional learnable
parameters while achieving performance competitive with full-tuning. Inspired
by visual prompt tuning, we propose Semantic-oriented Visual Prompt
Learning (SVPL) to enhance semantic perception and better
extract task-specific knowledge from LPMs, without any additional
annotations. Specifically, SVPL assigns a group of learnable prompts for
each DR level to fit the complex pathological manifestations and then
aligns each prompt group to task-specific semantic space via a
contrastive group alignment (CGA) module. We also propose a
plug-and-play adapter module, Hierarchical Semantic Delivery (HSD),
which allows the semantic transition of prompt groups from shallow to
deep layers to facilitate efficient knowledge mining and model
convergence. Our extensive experiments on three public DR grading
datasets demonstrate that SVPL achieves superior results compared to
other transfer tuning and DR grading methods. Further analysis suggests
that the generalized knowledge from LPMs is advantageous for
constructing the DR grading model on fundus images.
AU Liu, Aohan
Guo, Yuchen
Yong, Jun-Hai
Xu, Feng
Multi-Grained Radiology Report Generation With Sentence-Level
Image-Language Contrastive Learning
The automatic generation of accurate radiology reports is of great
clinical importance and has drawn growing research interest. However, it
is still a challenging task due to the imbalance between normal and
abnormal descriptions and the multi-sentence and multi-topic nature of
radiology reports. These features result in significant challenges to
generating accurate descriptions for medical images, especially the
important abnormal findings. Previous methods to tackle these problems
rely heavily on extra manual annotations, which are expensive to
acquire. We propose a multi-grained report generation framework
incorporating sentence-level image-sentence contrastive learning, which
does not require any extra labeling but effectively learns knowledge
from the image-report pairs. We first introduce contrastive learning as
an auxiliary task for image feature learning. Different from previous
contrastive methods, we exploit the multi-topic nature of imaging
reports and perform fine-grained contrastive learning by extracting
sentence topics and contents and contrasting between sentence contents
and refined image contents guided by sentence topics. This forces the
model to learn distinct abnormal image features for each specific topic.
During generation, we use two decoders to first generate coarse sentence
topics and then the fine-grained text of each sentence. We directly
supervise the intermediate topics using sentence topics learned by our
contrastive objective. This strengthens the generation constraint and
enables independent fine-tuning of the decoders using reinforcement
learning, which further boosts model performance. Experiments on two
large-scale datasets MIMIC-CXR and IU-Xray demonstrate that our approach
outperforms existing state-of-the-art methods, evaluated by both
language generation metrics and clinical accuracy.
AU Shi, Yongyi
Xia, Wenjun
Wang, Ge
Mou, Xuanqin
Blind CT Image Quality Assessment Using DDPM-derived Content and
Transformer-based Evaluator.
Lowering radiation dose per view and utilizing sparse views per scan are
two common CT scan modes, albeit often leading to distorted images
characterized by noise and streak artifacts. Blind image quality
assessment (BIQA) strives to evaluate perceptual quality in alignment
with what radiologists perceive, which plays an important role in
advancing low-dose CT reconstruction techniques. An intriguing direction
involves developing BIQA methods that mimic the operational
characteristic of the human visual system (HVS). The internal generative
mechanism (IGM) theory reveals that the HVS actively deduces primary
content to enhance comprehension. In this study, we introduce an
innovative BIQA metric that emulates the active inference process of
IGM. Initially, an active inference module, implemented as a denoising
diffusion probabilistic model (DDPM), is constructed to anticipate the
primary content. Then, the dissimilarity map is derived by assessing the
interrelation between the distorted image and its primary content.
Subsequently, the distorted image and dissimilarity map are combined
into a multi-channel image, which is inputted into a transformer-based
image quality evaluator. By leveraging the DDPM-derived primary content,
our approach achieves competitive performance on a low-dose CT dataset.
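The construction of the evaluator input can be sketched as below. The abstract does not say how the interrelation between the distorted image and its primary content is measured, so the absolute difference used here is purely an assumption, as are the function name and the two-channel layout.

```python
import numpy as np

def build_evaluator_input(distorted, primary_content):
    """Stack the distorted image with a dissimilarity map into a (2, H, W) array.

    The dissimilarity map is taken as |distorted - primary_content| (an assumption).
    primary_content stands in for the DDPM output, which is not modelled here.
    """
    distorted = np.asarray(distorted, dtype=float)
    primary_content = np.asarray(primary_content, dtype=float)
    dissimilarity = np.abs(distorted - primary_content)
    return np.stack([distorted, dissimilarity], axis=0)

noisy = np.array([[0.2, 0.8], [0.5, 0.1]])
clean = np.array([[0.0, 1.0], [0.5, 0.0]])
multi_channel = build_evaluator_input(noisy, clean)
```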
AU Xia, Jinqiu
Zhou, Yiwen
Deng, Wenxin
Kang, Jing
Wu, Wangjiang
Qi, Mengke
Zhou, Linghong
Ma, Jianhui
Xu, Yuan
PND-Net: Physics-Inspired Non-Local Dual-Domain Network for Metal
Artifact Reduction
Metal artifacts caused by the presence of metallic implants tremendously
degrade the quality of reconstructed computed tomography (CT) images and
therefore affect the clinical diagnosis or reduce the accuracy of organ
delineation and dose calculation in radiotherapy. Although various deep
learning methods have been proposed for metal artifact reduction (MAR),
most of them aim to restore the corrupted sinogram within the metal
trace, which removes beam hardening artifacts but ignores other
components of metal artifacts. In this paper, based on the physical
property of metal artifacts which is verified via Monte Carlo (MC)
simulation, we propose a novel physics-inspired non-local dual-domain
network (PND-Net) for MAR in CT imaging. Specifically, we design a novel
non-local sinogram decomposition network (NSD-Net) to acquire the
weighted artifact component and develop an image restoration network
(IR-Net) to reduce the residual and secondary artifacts in the image
domain. To facilitate the generalization and robustness of our method on
clinical CT images, we employ a trainable fusion network (F-Net) in the
artifact synthesis path to achieve unpaired learning. Furthermore, we
design an internal consistency loss to ensure the data fidelity of
anatomical structures in the image domain and introduce the linear
interpolation sinogram as prior knowledge to guide sinogram
decomposition. NSD-Net, IR-Net, and F-Net are jointly trained so that
they can benefit from one another. Extensive experiments on simulation
and clinical data demonstrate that our method outperforms
state-of-the-art MAR methods.
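The linear interpolation sinogram used as prior knowledge can be produced with the classic row-wise interpolation across the metal trace. The sketch assumes a sinogram laid out as (views, detector bins) and a boolean metal-trace mask of the same shape; that layout and the function name are assumptions.

```python
import numpy as np

def linear_interpolation_sinogram(sinogram, metal_trace):
    """Replace metal-trace bins in each view by linear interpolation from clean bins."""
    sino = np.asarray(sinogram, dtype=float).copy()
    trace = np.asarray(metal_trace, dtype=bool)
    bins = np.arange(sino.shape[1])
    for view in range(sino.shape[0]):
        bad = trace[view]
        if bad.any() and not bad.all():  # need at least one clean bin to interpolate from
            sino[view, bad] = np.interp(bins[bad], bins[~bad], sino[view, ~bad])
    return sino

corrupted = np.array([[0.0, 1.0, 99.0, 3.0, 4.0]])
mask = np.array([[False, False, True, False, False]])
li_sino = linear_interpolation_sinogram(corrupted, mask)
```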
AU Lin, Jingyin
Xie, Wende
Kang, Li
Wu, Huisi
Dynamic-guided Spatiotemporal Attention for Echocardiography Video
Segmentation.
Left ventricle (LV) endocardium segmentation in echocardiography video
has received much attention as an important step in quantifying LV
ejection fraction. Most existing methods are dedicated to exploiting
temporal information on top of 2D convolutional networks. In addition to
single appearance semantic learning, some research attempted to
introduce motion cues through the optical flow estimation (OFE) task to
enhance temporal consistency modeling. However, OFE in these methods is
tightly coupled to LV endocardium segmentation, resulting in noisy
inter-frame flow prediction, and post-optimization based on these flows
accumulates errors. To address these drawbacks, we propose
dynamic-guided spatiotemporal attention (DSA) for semi-supervised
echocardiography video segmentation. We first fine-tune the
off-the-shelf OFE network RAFT on echocardiography data to provide
dynamic information. Taking inter-frame flows as additional input, we
use a dual-encoder structure to extract motion and appearance features
separately. Based on the connection between dynamic continuity and
semantic consistency, we propose a bilateral feature calibration module
to enhance both features. For temporal consistency modeling, DSA
aggregates neighboring-frame context using deformable
attention realized through offset grid attention. Dynamic
information is introduced into DSA through a bilateral offset estimation
module to effectively combine with appearance semantics and predict
attention offsets, thereby guiding semantic-based spatiotemporal
attention. We evaluated our method on two popular echocardiography
datasets, CAMUS and EchoNet-Dynamic, and achieved state-of-the-art performance.
AU Mandot, Shubham
Zannoni, Elena M.
Cai, Ling
Nie, Xingchen
La Riviere, Patrick J.
Wilson, Matthew D.
Meng, Ling Jian
A High-Sensitivity Benchtop X-Ray Fluorescence Emission Tomography
(XFET) System With a Full-Ring of X-Ray Imaging-Spectrometers and a
Compound-Eye Collimation Aperture
The advent of metal-based drugs and metal nanoparticles as therapeutic
agents in anti-tumor treatment has motivated the advancement of X-ray
fluorescence computed tomography (XFCT) techniques. An XFCT imaging
modality can detect, quantify, and image the biodistribution of metal
elements using the X-ray fluorescence signal emitted upon X-ray
irradiation. However, the majority of XFCT imaging systems and
instrumentation developed so far rely on a single or a small number of
detectors. This work introduces the first full-ring benchtop X-ray
fluorescence emission tomography (XFET) system equipped with 24
solid-state detectors arranged in a hexagonal geometry and a 96-pinhole
compound-eye collimator. We experimentally demonstrate the system's
sensitivity and its capability of multi-element detection and
quantification by performing imaging studies on an animal-sized phantom.
In our preliminary studies, the phantom was irradiated with a pencil
beam of X-rays produced using a low-powered polychromatic X-ray source
(90 kVp, 60 W maximum power). This investigation shows a significant
improvement in the detection limit of gadolinium, down to a concentration
as low as 0.1 mg/mL. The results also illustrate the unique capabilities of
the XFET system to simultaneously determine the spatial distribution and
accurately quantify the concentrations of multiple metal elements.
AU Pang, Yan
Liang, Jiaming
Huang, Teng
Chen, Hao
Li, Yunhao
Li, Dan
Huang, Lin
Wang, Qiong
Slim UNETR: Scale Hybrid Transformers to Efficient 3D Medical Image
Segmentation Under Limited Computational Resources
Hybrid transformer-based segmentation approaches have shown great
promise in medical image analysis. However, they typically require
considerable computational power and resources during both training and
inference stages, posing a challenge for resource-limited medical
applications common in the field. To address this issue, we present an
innovative framework called Slim UNETR, designed to achieve a balance
between accuracy and efficiency by leveraging the advantages of both
convolutional neural networks and transformers. Our method features the
Slim UNETR Block as a core component, which effectively enables
information exchange through self-attention mechanism decomposition and
cost-effective representation aggregation. Additionally, we utilize the
throughput metric as an efficiency indicator to provide feedback on
model resource consumption. Our experiments demonstrate that Slim UNETR
outperforms state-of-the-art models in terms of accuracy, model size,
and efficiency when deployed on resource-constrained devices.
Remarkably, Slim UNETR achieves 92.44% Dice accuracy on BraTS2021 while
being 34.6x smaller and 13.4x faster during inference compared to Swin
UNETR.
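The Dice figure quoted above refers to the usual overlap measure between a predicted and a reference mask. A minimal sketch for binary masks follows; returning 1.0 when both masks are empty is a convention chosen here, not something stated in the abstract.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return float(2.0 * np.logical_and(pred, target).sum() / denom)

score = dice_score([1, 1, 0, 0], [1, 0, 0, 0])  # one shared voxel, sizes 2 and 1
```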
AU Park, Mi-Ae
Zaha, Vlad G.
Badawi, Ramsey D.
Bowen, Spencer L.
Supplemental Transmission Aided Attenuation Correction for Quantitative
Cardiac PET
Quantitative PET attenuation correction (AC) for cardiac PET/CT and
PET/MR is a challenging problem. We propose and evaluate an AC approach
that uses coincidences from a relatively weak and physically fixed
sparse external source, in combination with those from the patient, to
reconstruct $\mu$-maps based on physics principles alone. The small 30
cm$^3$ volume of the source makes it easy to fill and place, and the method
does not use prior image data or attenuation map assumptions. Our
supplemental transmission aided maximum likelihood reconstruction of
attenuation and activity (sTX-MLAA) algorithm contains an attenuation
map update that maximizes the likelihood of terms representing
coincidences originating from tracer in the patient and a weighted
expression of counts segmented from the external source alone. Both
external source and patient scatter and randoms are fully corrected. We
evaluated performance of sTX-MLAA compared to reference standard
CT-based AC with FDG PET/CT phantom studies, including modeling of a
patient with myocardial inflammation. Through an ROI analysis, we
measured $\leq 5\%$ bias in activity concentrations for PET images generated
with sTX-MLAA and a TX source strength $\geq 12.7$ MBq, relative to CT-AC.
PET background variability (from noise and sparse sampling) was
substantially reduced with sTX-MLAA compared to using counts segmented
from the transmission source alone for AC. Results suggest that sTX-MLAA
will enable quantitative PET during cardiac PET/CT and PET/MR of human
patients.
AU Zhang, Yi
Li, Jiayue
Li, Xinyang
Xie, Min
Islam, Md. Tauhidul
Zhang, Haixian
FAOT-Net: A 1.5-Stage Framework for 3D Pelvic Lymph Node Detection With
Online Candidate Tuning
Accurate and automatic detection of pelvic lymph nodes in computed
tomography (CT) scans is critical for diagnosing lymph node metastasis
in colorectal cancer, which in turn plays a crucial role in staging,
treatment planning, surgical guidance, and postoperative follow-up.
However, achieving high detection sensitivity and
specificity poses a challenge due to the small and variable sizes of
these nodes, as well as the presence of numerous similar signals within
the complex pelvic CT image. To tackle these issues, we propose a 3D
feature-aware online-tuning network (FAOT-Net) that introduces a novel
1.5-stage structure to seamlessly integrate detection and refinement via
our online candidate tuning process and takes advantage of multi-level
information through the tailored feature flow. Furthermore, we redesign
the anchor fitting and anchor matching strategies to further improve
detection performance in a nearly hyperparameter-free manner. Our
framework achieves an FROC score of 52.8 and a sensitivity of 91.7%
with 16 false positives per scan on the PLNDataset.
AU Bian, Chenyuan
Xia, Nan
Xie, Anmu
Cong, Shan
Dong, Qian
Adversarially Trained Persistent Homology Based Graph Convolutional
Network for Disease Identification Using Brain Connectivity
Brain disease propagation is associated with characteristic alterations
in the structural and functional connectivity networks of the brain. To
identify disease-specific network representations, graph convolutional
networks (GCNs) have been used because of their powerful graph embedding
ability to characterize the non-Euclidean structure of brain networks.
However, existing GCNs generally focus on learning the discriminative
region of interest (ROI) features, often ignoring important topological
information that enables the integration of connectome patterns of brain
activity. In addition, most methods fail to consider the vulnerability
of GCNs to perturbations in network properties of the brain, which
considerably degrades the reliability of diagnosis results. In this
study, we propose an adversarially trained persistent homology-based
graph convolutional network (ATPGCN) to capture disease-specific brain
connectome patterns and classify brain diseases. First, the brain
functional/structural connectivity is constructed using different
neuroimaging modalities. Then, we develop a novel strategy that
concatenates the persistent homology features from a brain algebraic
topology analysis with readout features of the global pooling layer of a
GCN model to collaboratively learn the individual-level representation.
Finally, we simulate the adversarial perturbations by targeting the risk
ROIs from clinical prior, and incorporate them into a training loop to
evaluate the robustness of the model. The experimental results on three
independent datasets demonstrate that ATPGCN outperforms existing
classification methods in disease identification and is robust to minor
perturbations in network architecture. Our code is available at
https://github.com/CYB08/ATPGCN.
AU Ju, Lie
Yu, Zhen
Wang, Lin
Zhao, Xin
Wang, Xin
Bonnington, Paul
Ge, Zongyuan
Hierarchical Knowledge Guided Learning for Real-World Retinal Disease
Recognition
In the real world, medical datasets often exhibit a long-tailed data
distribution (i.e., a few classes occupy the majority of the data, while
most classes have only a limited number of samples), which results in a
challenging long-tailed learning scenario. Some recently published
datasets in ophthalmology AI consist of more than 40 kinds of retinal
diseases with complex abnormalities and variable morbidity.
Nevertheless, more than 30 conditions are rarely seen in global patient
cohorts. From a modeling perspective, most deep learning models trained
on these datasets may lack the ability to generalize to rare diseases
where only a few available samples are presented for training. In
addition, more than one disease may be present in the
retina, resulting in a challenging label co-occurrence scenario, also
known as multi-label, which can cause problems when some re-sampling
strategies are applied during training. To address the above two major
challenges, this paper presents a novel method that enables a deep
neural network to learn from a long-tailed fundus database to recognize
various retinal diseases. Firstly, we exploit the prior knowledge in
ophthalmology to improve the feature representation using a
hierarchy-aware pre-training. Secondly, we adopt an instance-wise
class-balanced sampling strategy to address the label co-occurrence
issue under the long-tailed medical dataset scenario. Thirdly, we
introduce a novel hybrid knowledge distillation to train a less biased
representation and classifier. We conducted extensive experiments on
four databases, including two public datasets and two in-house databases
with more than one million fundus images. The experimental results
demonstrate the superiority of our proposed methods with recognition
accuracy outperforming the state-of-the-art competitors, especially for
these rare diseases.
AU Lagogiannis, Ioannis
Meissen, Felix
Kaissis, Georgios
Rueckert, Daniel
Unsupervised Pathology Detection: A Deep Dive Into the State of the Art
Deep unsupervised approaches are gathering increased attention for
applications such as pathology detection and segmentation in medical
images since they promise to alleviate the need for large labeled
datasets and are more generalizable than their supervised counterparts
in detecting any kind of rare pathology. As the Unsupervised Anomaly
Detection (UAD) literature continuously grows and new paradigms emerge,
it is vital to continuously evaluate and benchmark new methods in a
common framework, in order to reassess the state-of-the-art (SOTA) and
identify promising research directions. To this end, we evaluate a
diverse selection of cutting-edge UAD methods on multiple medical
datasets, comparing them against the established SOTA in UAD for brain
MRI. Our experiments demonstrate that newly developed feature-modeling
methods from the industrial and medical literature achieve increased
performance compared to previous work and set the new SOTA in a variety
of modalities and datasets. Additionally, we show that such methods are
capable of benefiting from recently developed self-supervised
pre-training algorithms, further increasing their performance. Finally,
we perform a series of experiments in order to gain further insights
into some unique characteristics of selected models and datasets. Our
code can be found under https://github.com/iolag/UPD_study/.
AU Li, Jiajia
Zhang, Pingping
Wang, Teng
Zhu, Lei
Liu, Ruhan
Yang, Xia
Wang, Kaixuan
Shen, Dinggang
Sheng, Bin
DSMT-Net: Dual Self-Supervised Multi-Operator Transformation for
Multi-Source Endoscopic Ultrasound Diagnosis
Pancreatic cancer has the worst prognosis of all cancers. The clinical
application of endoscopic ultrasound (EUS) for the assessment of
pancreatic cancer risk and of deep learning for the classification of
EUS images has been hindered by inter-grader variability and labeling
capability. One of the key reasons for these difficulties is that EUS
images are obtained from multiple sources with varying resolutions,
effective regions, and interference signals, making the distribution of
the data highly variable and negatively impacting the performance of
deep learning models. Additionally, manual labeling of images is
time-consuming and requires significant effort, leading to the desire to
effectively utilize a large amount of unlabeled data for network
training. To address these challenges, this study proposes the Dual
Self-supervised Multi-Operator Transformation Network (DSMT-Net) for
multi-source EUS diagnosis. The DSMT-Net includes a multi-operator
transformation approach to standardize the extraction of regions of
interest in EUS images and eliminate irrelevant pixels. Furthermore, a
transformer-based dual self-supervised network is designed to integrate
unlabeled EUS images for pre-training the representation model, which
can be transferred to supervised tasks such as classification,
detection, and segmentation. A large-scale EUS-based pancreas image
dataset (LEPset) has been collected, including 3,500 pathologically
proven labeled EUS images (from pancreatic and non-pancreatic cancers)
and 8,000 unlabeled EUS images for model development. The
self-supervised method has also been applied to breast cancer diagnosis
and was compared to state-of-the-art deep learning models on both
datasets. The results demonstrate that the DSMT-Net significantly
improves the accuracy of pancreatic and breast cancer diagnosis.
AU Wang, Chong
Chen, Yuanhong
Liu, Fengbei
Elliott, Michael
Kwok, Chun Fung
Pena-Solorzano, Carlos
Frazer, Helen
McCarthy, Davis James
Carneiro, Gustavo
An Interpretable and Accurate Deep-Learning Diagnosis Framework Modeled
With Fully and Semi-Supervised Reciprocal Learning
The deployment of automated deep-learning classifiers in clinical
practice has the potential to streamline the diagnosis process and
improve the diagnosis accuracy, but the acceptance of those classifiers
relies on both their accuracy and interpretability. In general, accurate
deep-learning classifiers provide little model interpretability, while
interpretable models do not have competitive classification accuracy. In
this paper, we introduce a new deep-learning diagnosis framework, called
InterNRL, that is designed to be highly accurate and interpretable.
InterNRL consists of a student-teacher framework, where the student
model is an interpretable prototype-based classifier (ProtoPNet) and the
teacher is an accurate global image classifier (GlobalNet). The two
classifiers are mutually optimised with a novel reciprocal learning
paradigm in which the student ProtoPNet learns from optimal pseudo
labels produced by the teacher GlobalNet, while GlobalNet learns from
ProtoPNet's classification performance and pseudo labels. This
reciprocal learning paradigm enables InterNRL to be flexibly optimised
under both fully- and semi-supervised learning scenarios, reaching
state-of-the-art classification performance in both scenarios for the
tasks of breast cancer and retinal disease diagnosis. Moreover, relying
on weakly-labelled training images, InterNRL also achieves better
breast cancer localisation and brain tumour segmentation results than
other competing methods.
AU Zhi, Shaohua
Wang, Yinghui
Xiao, Haonan
Bai, Ti
Li, Bing
Tang, Yunsong
Liu, Chenyang
Li, Wen
Li, Tian
Ge, Hong
Cai, Jing
Coarse-Super-Resolution-Fine Network (CoSF-Net): A Unified End-to-End
Neural Network for 4D-MRI With Simultaneous Motion Estimation and
Super-Resolution
Four-dimensional magnetic resonance imaging (4D-MRI) is an emerging
technique for tumor motion management in image-guided radiation therapy
(IGRT). However, current 4D-MRI suffers from low spatial resolution and
strong motion artifacts owing to the long acquisition time and patients'
respiratory variations. If not managed properly, these limitations can
adversely affect treatment planning and delivery in IGRT. In this study,
we developed a novel deep learning framework called the
coarse-super-resolution-fine network (CoSF-Net) to achieve simultaneous
motion estimation and super-resolution within a unified model. We
designed CoSF-Net by fully excavating the inherent properties of 4D-MRI,
with consideration of limited and imperfectly matched training datasets.
We conducted extensive experiments on multiple real patient datasets to
assess the feasibility and robustness of the developed network. Compared
with existing networks and three state-of-the-art conventional
algorithms, CoSF-Net not only accurately estimated the deformable vector
fields between the respiratory phases of 4D-MRI but also simultaneously
improved the spatial resolution of 4D-MRI, enhancing anatomical features
and producing 4D-MR images with high spatiotemporal resolution.
AU Zhang, Qixiang
Li, Yi
Xue, Cheng
Wang, Haonan
Li, Xiaomeng
GlandSAM: Injecting Morphology Knowledge into Segment Anything Model for
Label-free Gland Segmentation.
This paper presents a label-free gland segmentation method, GlandSAM,
which achieves performance comparable to supervised methods while
requiring no labels during training or inference. We observe that the
Segment Anything Model (SAM) produces sub-optimal results on gland datasets: it
either over-segments a gland into many fractions or under-segments the
gland regions by confusing many of them with the background, due to the
complex morphology of glands and lack of sufficient labels. To address
this challenge, our GlandSAM innovatively injects two clues about gland
morphology into SAM to guide the segmentation process: (1) Heterogeneity
within glands and (2) Similarity with the background. Initially, we
leverage the clues to decompose the intricate glands by selectively
extracting a proposal for each gland sub-region of heterogeneous
appearances. Then, we inject the morphology clues into SAM in a
fine-tuning manner with a novel morphology-aware semantic grouping
module that explicitly groups the high-level semantics of gland
sub-regions. In this way, GlandSAM can capture comprehensive
knowledge about gland morphology, and produce well-delineated and
complete segmentation results. Extensive experiments conducted on the
GlaS dataset and the CRAG dataset reveal that GlandSAM outperforms
state-of-the-art label-free methods by a significant margin. Notably,
our GlandSAM even surpasses several fully-supervised methods that
require pixel-wise labels for training, which highlights the remarkable
performance and potential of GlandSAM in the realm of gland
segmentation.
AU Huang, Yanyan
Zhao, Weiqin
Fu, Yu
Zhu, Lingting
Yu, Lequan
Unleash the Power of State Space Model for Whole Slide Image with Local
Aware Scanning and Importance Resampling.
Whole slide image (WSI) analysis is gaining prominence within the
medical imaging field. However, previous methods often fall short of
efficiently processing entire WSIs due to their gigapixel size. Inspired
by recent developments in state space models, this paper introduces a
new Pathology Mamba (PAM) for more accurate and robust WSI analysis. PAM
includes three carefully designed components to tackle the challenges of
enormous image size, the utilization of local and hierarchical
information, and the mismatch between the feature distributions of
training and testing during WSI analysis. Specifically, we design a
Bi-directional Mamba Encoder to process the extensive patches present in
WSIs effectively and efficiently, which can handle large-scale
pathological images while achieving high performance and accuracy. To
further harness the local information and inherent hierarchical
structure of WSI, we introduce a novel Local-aware Scanning module,
which employs a local-aware mechanism alongside hierarchical scanning to
adeptly capture both the local information and the overarching structure
within WSIs. Moreover, to alleviate the patch feature distribution
misalignment between training and testing, we propose a Test-time
Importance Resampling module to conduct testing patch resampling to
ensure consistency of feature distribution between the training and
testing phases, and thus enhance model prediction. Extensive evaluation
on nine WSI datasets with cancer subtyping and survival prediction tasks
demonstrates that PAM outperforms current state-of-the-art methods and
has an enhanced capability for modeling discriminative areas within
WSIs. The source code is available at https://github.com/HKU-MedAI/PAM.
AU Thies, Mareike
Wagner, Fabian
Maul, Noah
Yu, Haijun
Meier, Manuela Goldmann
Schneider, Linda-Sophie
Gu, Mingxuan
Mei, Siyuan
Folle, Lukas
Preuhs, Alexander
Manhart, Michael
Maier, Andreas
A gradient-based approach to fast and accurate head motion compensation
in cone-beam CT.
Cone-beam computed tomography (CBCT) systems, with their flexibility,
present a promising avenue for direct point-of-care medical imaging,
particularly in critical scenarios such as acute stroke assessment.
However, the integration of CBCT into clinical workflows faces
challenges, primarily linked to long scan duration resulting in patient
motion during scanning and leading to image quality degradation in the
reconstructed volumes. This paper introduces a novel approach to CBCT
motion estimation using a gradient-based optimization algorithm, which
leverages generalized derivatives of the backprojection operator for
cone-beam CT geometries. Building on that, a fully differentiable target
function is formulated which grades the quality of the current motion
estimate in reconstruction space. We drastically accelerate motion
estimation yielding a 19-fold speed-up compared to existing methods.
Additionally, we investigate the architecture of networks used for
quality metric regression and propose predicting voxel-wise quality
maps, favoring autoencoder-like architectures over contracting ones.
This modification improves gradient flow, leading to more accurate
motion estimation. The presented method is evaluated through realistic
experiments on head anatomy. It achieves a reduction in reprojection
error from an initial average of 3 mm to 0.61 mm after motion
compensation and consistently demonstrates superior performance compared
to existing approaches. The analytic Jacobian for the backprojection
operation, which is at the core of the proposed method, is made publicly
available. In summary, this paper contributes to the advancement of CBCT
integration into clinical workflows by proposing a robust motion
estimation approach that enhances efficiency and accuracy, addressing
critical challenges in time-sensitive scenarios.
AU Weng, Li
Zhu, Zhoule
Dai, Kaixin
Zheng, Zhe
Zhu, Junming
Wu, Hemmings
Reduced-Reference Learning for Target Localization in Deep Brain
Stimulation
This work proposes a supervised machine learning method for target
localization in deep brain stimulation (DBS). DBS is a recognized
treatment for essential tremor. The effects of DBS significantly depend
on the precise implantation of electrodes. Recent research on diffusion
tensor imaging shows that the optimal target for essential tremor is
related to the dentato-rubro-thalamic tract (DRTT), thus DRTT targeting
has become a promising direction. The tractography-based targeting is
more accurate than conventional ones, but still too complicated for
clinical scenarios, where only structural magnetic resonance imaging
(sMRI) data is available. In order to improve efficiency and utility, we
consider target localization as a non-linear regression problem in a
reduced-reference learning framework, and solve it with convolutional
neural networks (CNNs). The proposed method is an efficient two-step
framework, and consists of two image-based networks: one for
classification and the other for localization. We model the basic
workflow as an image retrieval process and define relevant performance
metrics. Using DRTT as pseudo groundtruths, we show that individualized
tractography-based optimal targets can be inferred from sMRI data with
high accuracy. For two datasets of $280 \times 220$ / $272 \times 227$
(0.7/0.8 mm slice thickness) sMRI input, our model achieves an average
posterior localization error of 2.3/1.2 mm, and a median of 1.7/1.02 mm.
The proposed framework is a novel application of reduced-reference
learning, and a first attempt to localize DRTT from sMRI. It
significantly outperforms existing methods using 3D-CNN, anatomical and
DRTT atlas, and may serve as a new baseline for general target
localization problems.
AU Xu, Chi
Xu, Haozheng
Giannarou, Stamatia
Distance Regression Enhanced with Temporal Information Fusion and
Adversarial Training for Robot-Assisted Endomicroscopy.
Probe-based confocal laser endomicroscopy (pCLE) has a role in
characterising tissue intraoperatively to guide tumour resection during
surgery. To capture good quality pCLE data which is important for
diagnosis, the probe-tissue contact needs to be maintained within a
working range of micrometre scale. This can be achieved through
micro-surgical robotic manipulation which requires the automatic
estimation of the probe-tissue distance. In this paper, we propose a
novel deep regression framework composed of the Deep Regression
Generative Adversarial Network (DR-GAN) and a Sequence Attention (SA)
module. The aim of DR-GAN is to train the network using an enhanced
image-based supervision approach. It extends the standard generator by
using a well-defined function for image generation, instead of a
learnable decoder. Also, DR-GAN uses a novel learnable neural perceptual
loss which combines for the first time spatial and frequency domain
features. This effectively suppresses the adverse effects of noise in
the pCLE data. To incorporate temporal information, we've designed the
SA module which is a cross-attention module, enhanced with Radial Basis
Function based encoding (SA-RBF). Furthermore, to train the regression
framework, we designed a multi-step training mechanism. During
inference, the trained network is used to generate data representations
which are fused along time in the SA-RBF module to boost the regression
stability. Our proposed network advances SOTA networks by addressing the
challenge of excessive noise in the pCLE data and enhancing regression
stability. It outperforms SOTA networks applied on the pCLE Regression
dataset (PRD) in terms of accuracy, data quality and stability.
AU Tan, Zuopeng
Zhang, Lihe
Lv, Yanan
Ma, Yili
Lu, Huchuan
GroupMorph: Medical Image Registration via Grouping Network with
Contextual Fusion.
Pyramid-based deformation decomposition is a promising registration
framework, which gradually decomposes the deformation field into
multi-resolution subfields for precise registration. However, most
pyramid-based methods directly produce one subfield per resolution
level, which does not fully depict the spatial deformation. In this
paper, we propose a novel registration model, called GroupMorph.
Different from typical pyramid-based methods, we adopt the
grouping-combination strategy to predict deformation field at each
resolution. Specifically, we perform group-wise correlation calculation
to measure the similarities of grouped features. After that, n groups of
deformation subfields with different receptive fields are predicted in
parallel. By composing these subfields, a deformation field with
multi-receptive field ranges is formed, which can effectively identify
both large and small deformations. Meanwhile, a contextual fusion module
is designed to fuse the contextual features and provide the inter-group
information for the field estimator of the next level. By leveraging the
inter-group correspondence, the synergy among deformation subfields is
enhanced. Extensive experiments on four public datasets demonstrate the
effectiveness of GroupMorph. Code is available at
https://github.com/TVayne/GroupMorph.
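The group-wise correlation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the GroupMorph implementation: it assumes the common definition in which the channels of two feature vectors (taken at the same spatial location) are split into equal groups and each group is scored by the mean of its element-wise products. The function name `groupwise_correlation` and its parameters are hypothetical.

```python
def groupwise_correlation(fixed_vec, moving_vec, num_groups):
    """Return one similarity score per channel group.

    Assumption (not from the paper): similarity is the mean of
    element-wise products within each group of channels.
    """
    if len(fixed_vec) != len(moving_vec):
        raise ValueError("feature vectors must have the same length")
    if num_groups <= 0 or len(fixed_vec) % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")

    group_size = len(fixed_vec) // num_groups
    scores = []
    for g in range(num_groups):
        start = g * group_size
        stop = start + group_size
        # Element-wise products for the channels belonging to this group.
        products = [a * b for a, b in zip(fixed_vec[start:stop],
                                          moving_vec[start:stop])]
        scores.append(sum(products) / group_size)
    return scores


if __name__ == "__main__":
    # Four channels split into two groups of two channels each.
    print(groupwise_correlation([1, 2, 3, 4], [1, 1, 2, 2], 2))
```

In a real network the same computation would run at every spatial location of the feature maps, yielding one correlation map per group.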
EI 1558-254X
DA 2024-05-16
UT MEDLINE:38739510
PM 38739510
ER
AU Ambrosanio, Michele
Bevacqua, Martina Teresa
Lovetri, Joe
Pascazio, Vito
Isernia, Tommaso
In-Vivo Electrical Properties Estimation of Biological Tissues by Means
of a Multi-Step Microwave Tomography Approach
The accurate quantitative estimation of the electromagnetic properties
of tissues can serve important diagnostic and therapeutic medical
purposes. Quantitative microwave tomography is an imaging modality that
can provide maps of the in-vivo electromagnetic properties of the imaged
tissues, i.e. both the permittivity and the electric conductivity. A
multi-step microwave tomography approach is proposed for the accurate
retrieval of such spatial maps of biological tissues. The underlying
idea behind the new imaging approach is to progressively add details to
the maps in a step-wise fashion starting from single-frequency
qualitative reconstructions. Multi-frequency microwave data is utilized
strategically in the final stage. The approach results in improved
accuracy of the reconstructions compared to inversion of the data in a
single step. As a case study, the proposed workflow was tested on an
experimental microwave data set collected for the imaging of the human
forearm. The human forearm is a good test case as it contains several
soft tissues as well as bone, exhibiting a wide range of values for the
electrical properties.
AU Guo, Rui
Lin, Zhichao
Xin, Jingyu
Li, Maokun
Yang, Fan
Xu, Shenheng
Abubakar, Aria
Three Dimensional Microwave Data Inversion in Feature Space for Stroke
Imaging
Microwave imaging is a promising method for the early diagnosis and
monitoring of brain strokes. It is portable, non-invasive, and safe for the
human body. Conventional techniques solve for unknown electrical
properties represented as pixels or voxels, but often result in
inadequate structural information and high computational costs. We
propose to reconstruct the three dimensional (3D) electrical properties
of the human brain in a feature space, where the unknowns are latent
codes of a variational autoencoder (VAE). The decoder of the VAE, with
prior knowledge of the brain, acts as a module of data inversion. The
codes in the feature space are optimized by minimizing the misfit
between measured and simulated data. A dataset of 3D heads characterized
by permittivity and conductivity is constructed to train the VAE.
Numerical examples show that our method increases structural similarity
by 14% and speeds up the solution process by over 3 orders of magnitude
using only 4.8% of the number of unknowns compared to the voxel-based
method. This high-resolution imaging of electrical properties leads to
more accurate stroke diagnosis and offers new insights into the study of
the human brain.
AU Wu, Weiwen
Pan, Jiayi
Wang, Yanyang
Wang, Shaoyu
Zhang, Jianjia
Multi-channel Optimization Generative Model for Stable Ultra-Sparse-View
CT Reconstruction.
The score-based generative model (SGM) has risen to prominence in
sparse-view CT reconstruction due to its impressive generation
capability. The consistency of data is crucial in guiding the
reconstruction process in SGM-based reconstruction methods. However, the
existing data consistency policy exhibits certain limitations. Firstly,
it employs partial data from the reconstructed image of the iteration
process for image updates, which leads to secondary artifacts that
compromise image quality. Moreover, the updates to the SGM and data
consistency are considered as distinct stages, disregarding their
interdependent relationship. Additionally, the reference image used to
compute gradients in the reconstruction process is derived from
an intermediate result rather than the ground truth. Motivated by the fact that
a typical SGM yields distinct outcomes with different random noise
inputs, we propose a Multi-channel Optimization Generative Model (MOGM)
for stable ultra-sparse-view CT reconstruction by integrating a novel
data consistency term into the stochastic differential equation model.
Notably, the unique aspect of this data consistency component is its
exclusive reliance on original data for effectively confining generation
outcomes. Furthermore, we pioneer an inference strategy that traces back
from the current iteration result to ground truth, enhancing
reconstruction stability through foundational theoretical support. We
also establish a multi-channel optimization reconstruction framework,
where conventional iterative techniques are employed to seek the
reconstruction solution. Quantitative and qualitative assessments on
23-view datasets from numerical simulation, clinical cardiac, and sheep
lung data underscore the superiority of MOGM over alternative methods.
Reconstructing from just 10 and 7 views, our method consistently
demonstrates exceptional performance.
AU Liu, Jingyu
Cui, Weigang
Chen, Yipeng
Ma, Yulan
Dong, Qunxi
Cai, Ran
Li, Yang
Hu, Bin
Deep Fusion of Multi-Template Using Spatio-Temporal Weighted
Multi-Hypergraph Convolutional Networks for Brain Disease Analysis
Conventional functional connectivity networks (FCNs) based on
resting-state fMRI (rs-fMRI) can only reflect the relationship between
pairwise brain regions. Thus, the hyper-connectivity network (HCN) has
been widely used to reveal high-order interactions among multiple brain
regions. However, existing HCN models are essentially spatial HCNs, which
reflect the spatial relevance of multiple brain regions, but ignore the
temporal correlation among multiple time points. Furthermore, the
majority of HCN construction and learning frameworks are limited to
using a single template, while multiple templates carry richer
information. To address these issues, we first employ multiple templates
to parcellate the rs-fMRI into different brain regions. Then, based on
the multi-template data, we propose a spatio-temporal weighted HCN
(STW-HCN) to capture more comprehensive high-order temporal and spatial
properties of brain activity. Next, a novel multi-template deep fusion
model called the spatio-temporal weighted multi-hypergraph
convolutional network (STW-MHGCN) is proposed to fuse the STW-HCNs of
multiple templates, which extracts the deep interrelation information
between different templates. Finally, we evaluate our method on the
ADNI-2 and ABIDE-I datasets for mild cognitive impairment (MCI) and
autism spectrum disorder (ASD) analysis. Experimental results
demonstrate that the proposed method is superior to the state-of-the-art
approaches in MCI and ASD classification, and the abnormal
spatio-temporal hyper-edges discovered by our method are highly
relevant to the analysis of brain abnormalities in MCI and ASD.
AU Razavi, Raha
Plonka, Gerlind
Rabbani, Hossein
<i>X</i>-Let's Atom Combinations for Modeling and Denoising of OCT
Images by Modified Morphological Component Analysis
An improved analysis of Optical Coherence Tomography (OCT) images of the
retina is of essential importance for the correct diagnosis of retinal
abnormalities. Unfortunately, OCT images suffer from noise arising from
different sources. In particular, speckle noise caused by the scattering
of light waves strongly degrades the quality of OCT image acquisitions.
In this paper, we employ a Modified Morphological Component Analysis
(MMCA) to provide a new method that separates the image into components
that contain different features such as texture, piecewise smooth parts, and
singularities along curves. Each image component is computed as a sparse
representation in a suitable dictionary. To create these dictionaries,
we use non-data-adaptive multi-scale (X-let) transforms, which have
been shown to be well suited to extracting the special OCT image
features. In this way, we reach two goals at once. On the one hand, we
achieve strongly improved denoising results by applying adaptive local
thresholding techniques separately to each image component. The
denoising performance surpasses that of other state-of-the-art denoising
algorithms in terms of PSNR as well as no-reference image quality
assessments. On the other hand, we obtain a decomposition of the OCT
images into well-interpretable image components that can be exploited for
further image processing tasks, such as classification.
AU Li, Ziyu
Miller, Karla L
Chen, Xi
Chiew, Mark
Wu, Wenchuan
Self-navigated 3D diffusion MRI using an optimized CAIPI sampling and
structured low-rank reconstruction estimated navigator.
3D multi-slab acquisitions are an appealing approach for diffusion MRI
because they are compatible with the imaging regime delivering optimal
SNR efficiency. In conventional 3D multi-slab imaging, shot-to-shot
phase variations caused by motion pose challenges due to the use of
multi-shot k-space acquisition. Navigator acquisition after each imaging
echo is typically employed to correct phase variations, which prolongs
scan time and increases the specific absorption rate (SAR). The aim of
this study is to develop a highly efficient, self-navigated method to
correct for phase variations in 3D multi-slab diffusion MRI without
explicitly acquiring navigators. The sampling of each shot is carefully
designed to intersect with the central kz=0 plane of each slab, and the
multi-shot sampling is optimized for self-navigation performance while
retaining decent reconstruction quality. The kz=0 intersections from all
shots are jointly used to reconstruct a 2D phase map for each shot using
a structured low-rank constrained reconstruction that leverages the
redundancy in shot and coil dimensions. The phase maps are used to
eliminate the shot-to-shot phase inconsistency in the final 3D
multi-shot reconstruction. We demonstrate the method's efficacy using
retrospective simulations and prospectively acquired in-vivo experiments
at 1.22 mm and 1.09 mm isotropic resolutions. Compared to conventional
navigated 3D multi-slab imaging, the proposed self-navigated method
achieves comparable image quality while shortening the scan time by
31.7% and improving the SNR efficiency by 15.5%. The proposed method
produces comparable quality of DTI and white matter tractography to
conventional navigated 3D multi-slab acquisition with a much shorter
scan time.
AU Emre, Taha
Chakravarty, Arunava
Rivail, Antoine
Lachinov, Dmitrii
Leingang, Oliver
Riedl, Sophie
Mai, Julia
Scholl, Hendrik P. N.
Sivaprasad, Sobha
Rueckert, Daniel
Lotery, Andrew
Schmidt-Erfurth, Ursula
Bogunovic, Hrvoje
CA PINNACLE Consortium
3DTINC: Time-Equivariant Non-Contrastive Learning for Predicting Disease
Progression From Longitudinal OCTs
Self-supervised learning (SSL) has emerged as a powerful technique for
improving the efficiency and effectiveness of deep learning models.
Contrastive methods are a prominent family of SSL that extract similar
representations of two augmented views of an image while pushing away
others in the representation space as negatives. However, the
state-of-the-art contrastive methods require large batch sizes and
augmentations designed for natural images that are impractical for 3D
medical images. To address these limitations, we propose a new
longitudinal SSL method, 3DTINC, based on non-contrastive learning. It
is designed to learn perturbation-invariant features for 3D optical
coherence tomography (OCT) volumes, using augmentations specifically
designed for OCT. We introduce a new non-contrastive similarity loss
term that learns temporal information implicitly from intra-patient
scans acquired at different times. Our experiments show that this
temporal information is crucial for predicting progression of retinal
diseases, such as age-related macular degeneration (AMD). After
pretraining with 3DTINC, we evaluated the learned representations and
the prognostic models on two large-scale longitudinal datasets of
retinal OCTs where we predict the conversion to wet-AMD within a
six-month interval. Our results demonstrate that each component of our
contributions is crucial for learning meaningful representations useful
in predicting disease progression from longitudinal volumetric scans.
AU Kong, Youyong
Zhang, Xiaotong
Wang, Wenhan
Zhou, Yue
Li, Yueying
Yuan, Yonggui
Multi-Scale Spatial-Temporal Attention Networks for Functional
Connectome Classification.
Many neuropsychiatric disorders are considered to be associated with
abnormalities in the functional connectivity networks of the brain. The
research on the classification of functional connectivity can therefore
provide new perspectives for understanding the pathology of disorders
and contribute to early diagnosis and treatment. Functional connectivity
changes dynamically over time; however, the
majority of existing methods are unable to jointly reveal the
spatial topology and time-varying characteristics. Furthermore, despite
the efforts of limited spatial-temporal studies to capture rich
information across different spatial scales, they have not delved into
the temporal characteristics among different scales. To address the above
issues, we propose novel Multi-Scale Spatial-Temporal Attention
Networks (MSSTAN) to exploit the multi-scale spatial-temporal
information provided by functional connectome for classification. To
fully extract spatial features of brain regions, we propose a Topology
Enhanced Graph Transformer module to guide the attention calculations in
the learning of spatial features by incorporating topology priors. A
Multi-Scale Pooling Strategy is introduced to obtain representations of
the brain connectome at various scales. Considering the temporal dynamic
characteristics among dynamic functional connectomes, we employ
Locality Sensitive Hashing attention to further capture long-term
dependencies in time dynamics across multiple scales and reduce the
computational complexity of the original attention mechanism.
Experiments on three brain fMRI datasets of MDD and ASD demonstrate the
superiority of our proposed approach. In addition, benefiting from the
attention mechanism in Transformer, our results are interpretable, which
can contribute to the discovery of biomarkers. The code is available at
https://github.com/LIST-KONG/MSSTAN.
EI 1558-254X
DA 2024-08-24
UT MEDLINE:39172603
PM 39172603
ER
AU Chen, Ruifeng
Zhang, Zhongliang
Quan, Guotao
Du, Yanfeng
Chen, Yang
Li, Yinsheng
PRECISION: A Physics-Constrained and Noise-Controlled Diffusion Model
for Photon Counting Computed Tomography.
Recently, the use of photon counting detectors in computed tomography
(PCCT) has attracted extensive attention. It is highly desirable to
improve the quality of material basis images and the quantitative
accuracy of elemental composition, particularly when PCCT data is
acquired at lower radiation dose levels. In this work, we develop a
physics-constrained and noise-controlled diffusion model, PRECISION for
short, to address the degraded quality of material basis images and
inaccurate quantification of elemental composition mainly caused by
imperfect noise models and/or hand-crafted regularization of material
basis images, such as local smoothness and/or sparsity, leveraged in the
existing direct material basis image reconstruction approaches. In stark
contrast, PRECISION learns distribution-level regularization to describe
the features of ideal material basis images by training a
noise-controlled spatial-spectral diffusion model. The optimal material
basis images of each individual subject are sampled from this learned
distribution under the constraint of the physical model of a given PCCT
and the measured data obtained from the subject. PRECISION exhibits the
potential to improve the quality of material basis images and the
quantitative accuracy of elemental composition for PCCT.
AU Majumder, Sharmin
Islam, Md. Tauhidul
Taraballi, Francesca
Righetti, Raffaella
Non-Invasive Imaging of Mechanical Properties of Cancers In Vivo Based
on Transformations of the Eshelby's Tensor Using Compression
Elastography
Knowledge of the mechanical properties is of great clinical significance
for diagnosis, prognosis and treatment of cancers. Recently, a new
method based on Eshelby's theory to simultaneously assess Young's
modulus (YM) and Poisson's ratio (PR) in tissues has been proposed. A
significant limitation of this method is that the accuracy of the
reconstructed YM and PR is affected by the orientation/alignment of the
tumor with the applied stress. In this paper, we propose a new method to
reconstruct YM and PR in cancers that is invariant to the 3D orientation
of the tumor with respect to the axis of applied stress. The novelty of
the proposed method resides in the use of a tensor transformation to
improve the robustness of Eshelby's theory and reconstruct YM and PR of
tumors with high accuracy under realistic experimental conditions. The
method is validated using finite element simulations and controlled
experiments using phantoms with known mechanical properties. The in vivo
feasibility of the developed method is demonstrated in an orthotopic
mouse model of breast cancer. Our results show that the proposed
technique can estimate the YM and PR with an overall accuracy of
(97.06 +/- 2.42)% under all tested tumor orientations. Animal experimental data
demonstrate the potential of the proposed methodology in vivo. The
proposed method can significantly expand the range of applicability of
Eshelby's theory to tumors and provide new means to accurately image
and quantify mechanical parameters of cancers in clinical conditions.
AU Xie, Zhiying
Zeinstra, Nicole
Kirby, Mitchell A.
Le, Nhan Minh
Murry, Charles E.
Zheng, Ying
Wang, Ruikang K.
Quantifying Microvascular Structure in Healthy and Infarcted Rat Hearts
Using Optical Coherence Tomography Angiography
Myocardial infarction (MI) is a life-threatening medical emergency
resulting in coronary microvascular dysregulation and heart muscle
damage. One of the primary characteristics of MI is capillary loss,
which plays a significant role in the progression of this cardiovascular
condition. In this study, we utilized optical coherence tomography
angiography (OCTA) to image coronary microcirculation in fixed rat
hearts, aiming to analyze coronary microvascular impairment
post-infarction. Various angiographic metrics are presented to quantify
vascular features, including the vessel area density, vessel complexity
index, vessel tortuosity index, and flow impairment. Pathological
differences identified from OCTA analysis are corroborated by
histological analysis. The quantitative assessments reveal a significant
decrease in microvascular density in the capillary-sized vessels and an
enlargement of the arteriole/venule-sized vessels. Further,
microvascular tortuosity and complexity exhibit an increase after
myocardial infarction. The results underscore the feasibility of using
OCTA to offer qualitative microvascular details and quantitative
metrics, providing insights into coronary vascular network remodeling
during disease progression and response to therapy.
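As an illustration of one metric named in the abstract, vessel area density is commonly computed as the fraction of pixels in a binarized angiogram that are classified as vessel. The sketch below assumes that definition; the abstract does not state the paper's exact formula, and the function name `vessel_area_density` is hypothetical.

```python
def vessel_area_density(mask):
    """Fraction of pixels marked as vessel in a binary 2D mask.

    `mask` is a list of rows; any truthy pixel counts as vessel.
    Assumption (not from the paper): density = vessel pixels / all pixels.
    """
    total_pixels = sum(len(row) for row in mask)
    if total_pixels == 0:
        raise ValueError("mask must contain at least one pixel")
    vessel_pixels = sum(1 for row in mask for pixel in row if pixel)
    return vessel_pixels / total_pixels


if __name__ == "__main__":
    # A 2x2 mask with a single vessel pixel.
    print(vessel_area_density([[1, 0], [0, 0]]))
```

Comparing this value between healthy and infarcted regions is one way a decrease in microvascular density could be quantified.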
AU Deng, Shiyu
Chen, Yinda
Huang, Wei
Zhang, Ruobing
Xiong, Zhiwei
Unsupervised Domain Adaptation for EM Image Denoising with Invertible
Networks.
Electron microscopy (EM) image denoising is critical for visualization
and subsequent analysis. Despite the remarkable achievements of deep
learning-based non-blind denoising methods, their performance drops
significantly when domain shifts exist between the training and testing
data. To address this issue, unpaired blind denoising methods have been
proposed. However, these methods heavily rely on image-to-image
translation and neglect the inherent characteristics of EM images,
limiting their overall denoising performance. In this paper, we propose
the first unsupervised domain adaptive EM image denoising method, which
is grounded in the observation that EM images from similar samples share
common content characteristics. Specifically, we first disentangle the
content representations and the noise components from noisy images and
establish a shared domain-agnostic content space via domain alignment to
bridge the synthetic images (source domain) and the real images (target
domain). To ensure precise domain alignment, we further incorporate
domain regularization by enforcing that the pseudo-noisy images,
reconstructed using both content representations and noise components,
accurately capture the characteristics of the noisy images from which
the noise components originate, all while maintaining semantic
consistency with the noisy images from which the content representations
originate. To guarantee lossless representation decomposition and image
reconstruction, we introduce disentanglement-reconstruction invertible
networks. Finally, the reconstructed pseudo-noisy images, paired with
their corresponding clean counterparts, serve as valuable training data
for the denoising network. Extensive experiments on synthetic and real
EM datasets demonstrate the superiority of our method in terms of image
restoration quality and downstream neuron segmentation accuracy. Our
code is publicly available at https://github.com/sydeng99/DADn.
AU Wang, Zhenguo
Wu, Yaping
Xia, Zeheng
Chen, Xinyi
Li, Xiaochen
Bai, Yan
Zhou, Yun
Liang, Dong
Zheng, Hairong
Yang, Yongfeng
Wang, Shanshan
Wang, Meiyun
Sun, Tao
Non-Invasive Quantification of the Brain [<SUP>18</SUP>F]FDG-PET Using
Inferred Blood Input Function Learned From Total-Body Data With Physical
Constraint
Full quantification of brain PET requires the blood input function (IF),
which is traditionally achieved through an invasive and time-consuming
arterial catheter procedure, making it infeasible for routine clinical use.
This study presents a deep learning based method to estimate the input
function (DLIF) for a dynamic brain FDG scan. A long short-term memory
network combined with a fully connected network was used. The dataset for
training was generated from 85 total-body dynamic scans obtained on a
uEXPLORER scanner. Time-activity curves from 8 brain regions and the
carotid served as the input of the model, and the labelled IF was generated
from the ascending aorta defined on the CT image. We emphasize the
goodness-of-fit of kinetic modeling as an additional physical loss
to reduce the bias and the need for large training samples. DLIF was
evaluated together with existing methods in terms of RMSE, area under
the curve, regional and parametric image quantifications. The results
revealed that the proposed model can generate IFs that are closer to the
reference ones in terms of shape and amplitude compared with the IFs
generated using existing methods. All regional kinetic parameters
calculated using DLIF agreed with reference values, with the correlation
coefficient being 0.961 (0.913) and relative bias being 1.68 +/- 8.74%
(0.37 +/- 4.93%) for K_i (K_1). In terms of visual appearance and
quantification, parametric images were also highly consistent with the
reference images. In conclusion, our experiments indicate that a trained
model can infer an image-derived IF from dynamic brain PET data, which
enables subsequent reliable kinetic modeling.
C1 Chinese Acad Sci, Paul C Lauterbur Res Ctr Biomed Imaging, Shenzhen Inst
Adv Technol, Shenzhen 518055, Peoples R China
C1 Zhengzhou Univ, Henan Prov Peoples Hosp, Zhengzhou Peoples Hosp,
Zhengzhou 450001, Peoples R China
C1 United Imaging Healthcare Grp Co Ltd, Cent Res Inst, Shanghai 201815,
Peoples R China
C3 United Imaging Healthcare Grp Co Ltd
SN 0278-0062
EI 1558-254X
DA 2024-07-22
UT WOS:001263692100016
PM 38386580
ER
AU Fan, Jiansong
Lv, Tianxu
Wang, Pei
Hong, Xiaoyan
Liu, Yuan
Jiang, Chunjuan
Ni, Jianming
Li, Lihua
Pan, Xiang
DCDiff: Dual-Granularity Cooperative Diffusion Models for Pathology
Image Analysis.
Whole Slide Images (WSIs) are paramount in the medical field, with
extensive applications in disease diagnosis and treatment. Recently,
many deep-learning methods have been used to classify WSIs. However,
these methods are inadequate for accurately analyzing WSIs as they treat
regions in WSIs as isolated entities and ignore contextual information.
To address this challenge, we propose a novel Dual-Granularity
Cooperative Diffusion Model (DCDiff) for the precise classification of
WSIs. Specifically, we first design a cooperative forward and reverse
diffusion strategy, utilizing fine-granularity and coarse-granularity to
regulate each diffusion step and gradually improve context awareness. To
exchange information between granularities, we propose a coupled U-Net
for dual-granularity denoising, which efficiently integrates
dual-granularity consistency information using the designed Fine- and
Coarse-granularity Cooperative Aware (FCCA) model. Ultimately, the
cooperative diffusion features extracted by DCDiff can achieve
cross-sample perception from the reconstructed distribution of training
samples. Experiments on three public WSI datasets show that the proposed
method can achieve superior performance over state-of-the-art methods.
The code is available at https://github.com/hemo0826/DCDiff.
AU Azampour, Mohammad Farid
Mach, Kristina
Fatemizadeh, Emad
Demiray, Beatrice
Westenfelder, Kay
Steiger, Katja
Eiber, Matthias
Wendler, Thomas
Kainz, Bernhard
Navab, Nassir
Multitask Weakly Supervised Generative Network for MR-US Registration.
Registering pre-operative modalities, such as magnetic resonance imaging
or computed tomography, to ultrasound images is crucial for guiding
clinicians during surgeries and biopsies. Recently, deep-learning
approaches have been proposed to increase the speed and accuracy of this
registration problem. However, all of these approaches need expensive
supervision from the ultrasound domain. In this work, we propose a
multitask generative framework that needs weak supervision only from the
pre-operative imaging domain during training. To perform a deformable
registration, the proposed framework translates a magnetic resonance
image to the ultrasound domain while preserving the structural content.
To demonstrate the efficacy of the proposed method, we tackle the
registration problem of pre-operative 3D MR to transrectal
ultrasonography images as necessary for targeted prostate biopsies. We
use an in-house dataset of 600 patients, divided into 540 for training,
30 for validation, and the remaining for testing. An expert manually
segmented the prostate in both modalities for validation and test sets
to assess the performance of our framework. The proposed framework
achieves a 3.58 mm target registration error on the expert-selected
landmarks, 89.2% in the Dice score, and 1.81 mm 95th percentile
Hausdorff distance on the prostate masks in the test set. Our
experiments demonstrate that the proposed generative model successfully
translates magnetic resonance images into the ultrasound domain. The
translated image contains the structural content and fine details due to
an ultrasound-specific two-path design of the generative model. The
proposed framework enables training learning-based registration methods
while only weak supervision from the pre-operative domain is available.
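The evaluation metrics named in this abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' evaluation code: the function names and the array-based inputs are assumptions made for the example, and the 95th percentile Hausdorff distance is omitted.

```python
import numpy as np

def dice_score(mask_a, mask_b):
    """Dice overlap between two binary masks (1.0 means identical masks)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Both masks are empty; treat them as a perfect match by convention.
        return 1.0
    return float(2.0 * np.logical_and(a, b).sum() / total)

def target_registration_error(landmarks_fixed, landmarks_warped):
    """Mean Euclidean distance between paired landmarks (rows are points)."""
    f = np.asarray(landmarks_fixed, dtype=float)
    w = np.asarray(landmarks_warped, dtype=float)
    return float(np.linalg.norm(f - w, axis=1).mean())
```

If the landmark coordinates are given in millimetres, the second function returns an error in millimetres, which is the unit reported in the abstract.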
AU Xu, Mengya
Islam, Mobarakol
Bai, Long
Ren, Hongliang
Privacy-Preserving Synthetic Continual Semantic Segmentation for Robotic
Surgery
Deep neural network (DNN)-based semantic segmentation of robotic
instruments and tissues can enhance the precision of surgical activities
in robot-assisted surgery. However, unlike biological learning, DNNs
cannot learn incremental tasks over time and exhibit catastrophic
forgetting,
which refers to the sharp decline in performance on previously learned
tasks after learning a new one. Specifically, when data scarcity is the
issue, the model shows a rapid drop in performance on previously learned
instruments after learning new data with new instruments. The problem
worsens when privacy concerns restrict the release of the old-instrument
dataset used for the old model and when data for new or updated
versions of the instruments are unavailable to the continual learning
model. To this end, we develop a privacy-preserving synthetic continual
semantic segmentation framework by blending and harmonizing (i)
open-source old-instrument foregrounds with synthesized backgrounds,
without revealing real patient data in public, and (ii) new-instrument
foregrounds with extensively augmented real backgrounds. To boost the
balanced logit distillation from
the old model to the continual learning model, we design overlapping
class-aware temperature normalization (CAT) by controlling model
learning utility. We also introduce multi-scale shifted-feature
distillation (SD) to maintain long and short-range spatial relationships
among the semantic objects where conventional short-range spatial
features with limited information reduce the power of feature
distillation. We demonstrate the effectiveness of our framework on the
EndoVis 2017 and 2018 instrument segmentation dataset with a generalized
continual learning setting. Code is available at
https://github.com/XuMengyaAmy/Synthetic_CAT_SD.
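As background for the logit distillation mentioned in this abstract, here is a minimal sketch of plain temperature-scaled distillation between an old (teacher) model and a continual-learning (student) model. It is not the CAT scheme itself, whose details the abstract does not give. The function names, the single scalar temperature, and the `eps` guard are assumptions made for the example.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis at a given temperature."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_logits, student_logits, temperature=2.0, eps=1e-12):
    """Mean KL(teacher || student) on temperature-softened class probabilities."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    # Scaling by T^2 is the usual convention for keeping gradient magnitudes comparable.
    return float(kl.mean() * temperature ** 2)
```

A higher temperature flattens both distributions, so the student is also penalized for mismatches on low-probability classes.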
AU Liu, Pei
Ji, Luping
Zhang, Xinyu
Ye, Feng
Pseudo-Bag Mixup Augmentation for Multiple Instance Learning-Based Whole
Slide Image Classification
Given the special situation of modeling gigapixel images, multiple
instance learning (MIL) has become one of the most important frameworks
for Whole Slide Image (WSI) classification. In current practice, most
MIL networks face two unavoidable problems in training: i)
insufficient WSI data and ii) the sample memorization inclination
inherent in neural networks. These problems may hinder MIL models from
adequate and efficient training, limiting further performance
gains of classification models on WSIs. Inspired by the basic idea
of Mixup, this paper proposes a new Pseudo-bag Mixup (PseMix) data
augmentation scheme to improve the training of MIL models. This scheme
generalizes the Mixup strategy for general images to special WSIs via
pseudo-bags so as to be applied in MIL-based WSI classification.
With the help of pseudo-bags, PseMix achieves the critical size
alignment and semantic alignment in the Mixup strategy. Moreover, it is
designed as an efficient and decoupled method, neither involving
time-consuming operations nor relying on MIL model predictions.
Comparative experiments and ablation studies are specially designed to
evaluate the effectiveness and advantages of PseMix. Experimental
results show that PseMix often helps state-of-the-art MIL
networks improve their classification performance on WSIs. In addition,
it can boost the generalization performance of MIL models in
special test scenarios and improve their robustness to patch occlusion
and label noise. Our source code is available at
https://github.com/liupei101/PseMix.
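The general idea of mixing two bags at the pseudo-bag level can be sketched as follows. This is a simplified illustration, not the PseMix implementation from the repository above. Random splitting into pseudo-bags, the `Beta(alpha, alpha)` mixing ratio, and all function and parameter names are assumptions made for the example.

```python
import numpy as np

def split_into_pseudo_bags(bag, n_pseudo, rng):
    """Randomly partition the instances (rows) of a bag into n_pseudo groups."""
    order = rng.permutation(len(bag))
    return [bag[idx] for idx in np.array_split(order, n_pseudo)]

def pseudo_bag_mixup(bag_a, label_a, bag_b, label_b, n_pseudo=4, alpha=1.0, rng=None):
    """Build a mixed bag from k pseudo-bags of A and n_pseudo - k pseudo-bags of B."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)        # Mixup-style mixing ratio in [0, 1]
    k = int(round(lam * n_pseudo))      # number of pseudo-bags taken from bag A
    parts_a = split_into_pseudo_bags(bag_a, n_pseudo, rng)
    parts_b = split_into_pseudo_bags(bag_b, n_pseudo, rng)
    mixed = np.concatenate(parts_a[:k] + parts_b[k:], axis=0)
    ratio = k / n_pseudo                # the label is mixed in the same proportion
    return mixed, ratio * label_a + (1.0 - ratio) * label_b
```

Because both bags are cut into the same number of pseudo-bags, bags with different instance counts can be mixed at a common granularity, which is one way to read the size alignment the abstract refers to.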
AU Ma, Yuxi
Wang, Jiacheng
Yang, Jing
Wang, Liansheng
Model-Heterogeneous Semi-Supervised Federated Learning for Medical Image
Segmentation
Medical image segmentation is crucial in clinical diagnosis, helping
physicians identify and analyze medical conditions. However, this task
is often accompanied by challenges like sensitive data, privacy
concerns, and expensive annotations. Current research focuses on
personalized collaborative training of medical segmentation systems,
ignoring that obtaining segmentation annotations is time-consuming and
laborious. Achieving a perfect balance between annotation cost and
segmentation performance while ensuring local model personalization has
become a valuable direction. Therefore, this study introduces a novel
Model-Heterogeneous Semi-Supervised Federated (HSSF) Learning framework.
It proposes Regularity Condensation and Regularity Fusion to transfer
autonomously selective knowledge to ensure the personalization between
sites. In addition, to efficiently utilize unlabeled data and reduce the
annotation burden, it proposes a Self-Assessment (SA) module and a
Reliable Pseudo-Label Generation (RPG) module. The SA module generates
self-assessment confidence in real time based on model performance, and
the RPG module generates reliable pseudo-labels based on SA confidence.
We evaluate our model separately on the Skin Lesion and Polyp Lesion
datasets. The results show that our model performs better than other
methods characterized by heterogeneity. Moreover, it exhibits highly
commendable performance even in homogeneous designs, most notably in
region-based metrics. All resources are available in the designated
repository, HSSF (github.com), on GitHub.
AU Bontempo, Gianpaolo
Bolelli, Federico
Porrello, Angelo
Calderara, Simone
Ficarra, Elisa
A Graph-Based Multi-Scale Approach With Knowledge Distillation for WSI
Classification
The use of Multiple Instance Learning (MIL) for classifying Whole Slide
Images (WSIs) has recently increased. Due to their gigapixel size, the
pixel-level annotation of such data is extremely expensive and
time-consuming, and thus practically infeasible. For this reason, multiple
automatic approaches have been proposed in recent years to support
clinical practice and diagnosis. Unfortunately, most state-of-the-art
proposals apply attention mechanisms without considering the spatial
instance correlation and usually work on a single-scale resolution. To
leverage the full potential of pyramidal structured WSI, we propose a
graph-based multi-scale MIL approach, DAS-MIL. Our model comprises three
modules: i) a self-supervised feature extractor; ii) a graph-based
architecture that precedes the MIL mechanism and aims at creating a more
contextualized representation of the WSI structure by considering the
mutual (spatial) instance correlation both inter- and intra-scale; and
iii) a (self-)distillation loss between resolutions, introduced to
compensate for their informative gap and significantly
improve the final prediction. The effectiveness of the proposed
framework is demonstrated on two well-known datasets, where we
outperform SOTA on WSI classification, gaining a +2.7% AUC and +3.7%
accuracy on the popular Camelyon16 benchmark.
AU Li, Jiawen
Cheng, Junru
Meng, Lingqin
Yan, Hui
He, Yonghong
Shi, Huijuan
Guan, Tian
Han, Anjia
DeepTree: Pathological Image Classification Through Imitating Tree-Like
Strategies of Pathologists
Digitization of pathological slides has promoted the research of
computer-aided diagnosis, in which artificial intelligence analysis of
pathological images deserves attention. Appropriate deep learning
techniques in natural images have been extended to computational
pathology. Still, they seldom take into account prior knowledge in
pathology, especially the analysis process of lesion morphology by
pathologists. Inspired by the diagnosis decision of pathologists, we
design a novel deep learning architecture based on tree-like strategies
called DeepTree. Designed as a binary tree structure, it imitates
pathological diagnosis methods to conditionally learn the correlation
between tissue morphology, and it optimizes branches to further
fine-tune performance. To validate and benchmark DeepTree, we build a dataset of
frozen lung cancer tissues and design experiments on a public dataset of
breast tumor subtypes and our dataset. Results show that the deep
learning architecture based on tree-like strategies makes the
pathological image classification more accurate, transparent, and
convincing. Simultaneously, prior knowledge based on diagnostic
strategies yields superior representation ability compared to
alternative methods. Our proposed methodology helps improve the trust of
pathologists in artificial intelligence analysis and promotes the
practical clinical application of pathology-assisted diagnosis.
AU Zhang, Huimin
Ren, Mingyang
Wang, Yu
Jin, Zhiyuan
Zhang, Shanxiang
Liu, Jiaqian
Fu, Jia
Qin, Huan
In Vivo Microwave-Induced Thermoacoustic Endoscopy for Colorectal Tumor
Detection in Deep Tissue
Optical endoscopy, as one of the common clinical diagnostic modalities,
provides irreplaceable advantages in the diagnosis and treatment of
internal organs. However, the approach is limited to the
characterization of superficial tissues due to the strong optical
scattering properties of tissue. In this work, a microwave-induced
thermoacoustic (TA) endoscope (MTAE) was developed and evaluated. The
MTAE system integrated a homemade monopole sleeve antenna (diameter = 7
mm) for providing homogenized pulsed microwave irradiation to induce a
TA signal in the colorectal cavity and a side-viewing focused ultrasonic
transducer (diameter = 3 mm) for detecting the TA signal in the
ultrasonic spectrum to construct the image. Our MTAE system combined
microwave excitation and acoustic detection, produced images with
dielectric contrast and high spatial resolution at depths of several
centimeters in soft tissues, overcame the current limitations in the
imaging depth of optical endoscopy and the mechanical wave-based imaging
contrast of ultrasound endoscopy, and was able to extract complete
features of deep-seated tumors that could be infiltrating and invading
adjacent structures. The practical feasibility of the MTAE system was
evaluated in vivo in rabbits with colorectal tumors. The results
demonstrated that colorectal tumor progression could be visualized from
the changes in electromagnetic parameters of the tissue via MTAE,
showing its potential clinical application.
C1 South China Normal Univ, Coll Biophoton, MOE Key Lab Laser Life Sci,
Guangzhou Key Lab Spectral Anal & Funct Probes,Gua, Guangzhou 510631,
Peoples R China
C1 South China Normal Univ, Inst Laser Life Sci, Guangzhou 510631, Peoples
R China
SN 0278-0062
EI 1558-254X
DA 2024-07-02
UT WOS:001196733400008
PM 38113149
ER
AU Wu, Huisi
Zhang, Baiming
Chen, Cheng
Qin, Jing
Federated Semi-Supervised Medical Image Segmentation via Prototype-Based
Pseudo-Labeling and Contrastive Learning
Existing federated learning works mainly focus on the fully supervised
training setting. In realistic scenarios, however, most clinical sites
can only provide data without annotations due to the lack of resources
or expertise. In this work, we address the practical yet
challenging problem of federated semi-supervised segmentation (FSSS),
where only some clients hold labeled data and the other clients can
provide only unlabeled data. We make an early attempt to tackle this
problem and propose a novel FSSS method with prototype-based
pseudo-labeling and contrastive learning. First, we transmit a
labeled-aggregated model, which is obtained based on prototype
similarity, to each unlabeled client, where it works together with the
global model to generate debiased pseudo-labels via a consistency- and
entropy-aware selection
strategy. Second, we transfer image-level prototypes from labeled
datasets to unlabeled clients and conduct prototypical contrastive
learning on unlabeled models to enhance their discriminative power.
Finally, we perform the dynamic model aggregation with a designed
consistency-aware aggregation strategy to dynamically adjust the
aggregation weights of each local model. We evaluate our method on
COVID-19 X-ray infected region segmentation, COVID-19 CT infected region
segmentation and colorectal polyp segmentation, and experimental results
consistently demonstrate the effectiveness of our proposed method. Code
is available at https://github.com/zhangbaiming/FedSemiSeg.
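The aggregation step described in this abstract adjusts a weight for each local model. A generic weighted parameter average illustrates the mechanics. This sketch does not reproduce the paper's consistency-aware weighting: `client_scores` is a hypothetical stand-in for whatever per-client score the method computes, and models are represented as plain dictionaries of NumPy arrays.

```python
import numpy as np

def aggregate(client_params, client_scores):
    """Weighted average of client parameters; weights are the normalized scores."""
    scores = np.asarray(client_scores, dtype=float)
    weights = scores / scores.sum()
    keys = client_params[0].keys()
    # For each parameter name, sum the clients' tensors scaled by their weights.
    return {
        k: sum(w * params[k] for w, params in zip(weights, client_params))
        for k in keys
    }
```

With equal scores this reduces to a plain average of the client models. Recomputing the scores every round makes the aggregation dynamic in the sense the abstract describes.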
AU Park, Jungkyu
Chledowski, Jakub
Jastrzebski, Stanislaw
Witowski, Jan
Xu, Yanqi
Du, Linda
Gaddam, Sushma
Kim, Eric
Lewin, Alana
Parikh, Ujas
Plaunova, Anastasia
Chen, Sardius
Millet, Alexandra
Park, James
Pysarenko, Kristine
Patel, Shalin
Goldberg, Julia
Wegener, Melanie
Moy, Linda
Heacock, Laura
Reig, Beatriu
Geras, Krzysztof J.
An Efficient Deep Neural Network to Classify Large 3D Images With Small
Objects
3D imaging enables accurate diagnosis by providing spatial information
about organ anatomy. However, using 3D images to train AI models is
computationally challenging because they consist of 10x or 100x more
pixels than their 2D counterparts. To be trained with high-resolution 3D
images, convolutional neural networks resort to downsampling them or
projecting them to 2D. We propose an effective alternative, a neural
network that enables efficient classification of full-resolution 3D
medical images. Compared to off-the-shelf convolutional neural networks,
our network, 3D Globally-Aware Multiple Instance Classifier (3D-GMIC),
uses 77.98%-90.05% less GPU memory and 91.23%-96.02% less computation.
While it is trained only with image-level labels, without segmentation
labels, it explains its predictions by providing pixel-level saliency
maps. On a dataset collected at NYU Langone Health, including 85,526
patients with full-field 2D mammography (FFDM), synthetic 2D
mammography, and 3D mammography, 3D-GMIC achieves an AUC of 0.831 (95%
CI: 0.769-0.887) in classifying breasts with malignant findings using 3D
mammography. This is comparable to the performance of GMIC on FFDM
(0.816, 95% CI: 0.737-0.878) and synthetic 2D (0.826, 95% CI:
0.754-0.884), which demonstrates that 3D-GMIC successfully classified
large 3D images despite focusing computation on a smaller percentage of
its input compared to GMIC. Therefore, 3D-GMIC identifies and utilizes
extremely small regions of interest from 3D images consisting of
hundreds of millions of pixels, dramatically reducing associated
computational challenges. 3D-GMIC generalizes well to BCS-DBT, an
external dataset from Duke University Hospital, achieving an AUC of
0.848 (95% CI: 0.798-0.896).
AU Zhu, Meilu
Liao, Jing
Liu, Jun
Yuan, Yixuan
FedOSS: Federated Open Set Recognition via Inter-Client Discrepancy and
Collaboration
Open set recognition (OSR) aims to accurately classify known diseases
and recognize unseen diseases as the unknown class in medical scenarios.
However, in existing OSR approaches, gathering data from distributed
sites to construct large-scale centralized training datasets usually
leads to high privacy and security risk, which could be alleviated
elegantly via the popular cross-site training paradigm, federated
learning (FL). To this end, we make the first effort to formulate
federated open set recognition (FedOSR), and meanwhile propose a novel
Federated Open Set Synthesis (FedOSS) framework to address the core
challenge of FedOSR: the unavailability of unknown samples for all
anticipated clients during the training phase. The proposed FedOSS
framework mainly leverages two modules, i.e., Discrete Unknown Sample
Synthesis (DUSS) and Federated Open Space Sampling (FOSS), to generate
virtual unknown samples for learning decision boundaries between known
and unknown classes. Specifically, DUSS exploits inter-client knowledge
inconsistency to recognize known samples near decision boundaries and
then pushes them beyond decision boundaries to synthesize discrete
virtual unknown samples. FOSS unites these generated unknown samples
from different clients to estimate the class-conditional distributions
of open data space near decision boundaries and further samples open
data, thereby improving the diversity of virtual unknown samples.
Additionally, we conduct comprehensive ablation experiments to verify
the effectiveness of DUSS and FOSS. FedOSS shows superior performance on
public medical datasets in comparison with state-of-the-art approaches.
AU Xu, Chenchu
Zhang, Tong
Zhang, Dong
Zhang, Dingwen
Han, Junwei
Deep Generative Adversarial Reinforcement Learning for Semi-Supervised
Segmentation of Low-Contrast and Small Objects in Medical Images
Deep reinforcement learning (DRL) has demonstrated impressive
performance in medical image segmentation, particularly for low-contrast
and small medical objects. However, current DRL-based segmentation
methods face limitations due to the optimization of error propagation in
two separate stages and the need for a significant amount of labeled
data. In this paper, we propose a novel deep generative adversarial
reinforcement learning (DGARL) approach that, for the first time,
enables end-to-end semi-supervised medical image segmentation in the DRL
domain. DGARL ingeniously establishes a pipeline that integrates DRL and
generative adversarial networks (GANs) to optimize both detection and
segmentation tasks holistically while mutually enhancing each other.
Specifically, DGARL introduces two innovative components to facilitate
this integration in semi-supervised settings. First, a task-joint GAN
with two discriminators links the detection results to the GAN's
segmentation performance evaluation, allowing simultaneous joint
evaluation and feedback. This ensures that DRL and GAN can be directly
optimized based on each other's results. Second, a bidirectional
exploration DRL integrates backward exploration and forward exploration
to ensure the DRL agent explores the correct direction when forward
exploration is disabled due to lack of explicit rewards. This mitigates
the issue of unlabeled data being unable to provide rewards and
rendering DRL unexplorable. Comprehensive experiments on three
generalization datasets, comprising a total of 640 patients, demonstrate
that DGARL achieves 85.02% Dice, an improvement of at least 1.91%, for
brain tumors; 73.18% Dice, an improvement of at least 4.28%, for liver
tumors; and 70.85% Dice, an improvement of at least 2.73%, for the
pancreas, compared to the ten most recent advanced methods. These results
attest to the superiority of DGARL. Code is available at GitHub.
AU Liu, Mingxin
Liu, Yunzan
Xu, Pengbo
Cui, Hui
Ke, Jing
Ma, Jiquan
Exploiting Geometric Features via Hierarchical Graph Pyramid Transformer
for Cancer Diagnosis Using Histopathological Images
Cancer is widely recognized as the primary cause of mortality worldwide,
and pathology analysis plays a pivotal role in achieving accurate cancer
diagnosis. The intricate representation of features in histopathological
images encompasses abundant information crucial for disease diagnosis,
regarding cell appearance, tumor microenvironment, and geometric
characteristics. However, recent deep learning methods have not
adequately exploited geometric features for pathological image
classification due to the absence of effective descriptors that can
capture both cell distribution and gathering patterns, which often serve
as potent indicators. In this paper, inspired by clinical practice, a
Hierarchical Graph Pyramid Transformer (HGPT) is proposed to guide
pathological image classification by effectively exploiting a geometric
representation of tissue distribution which was ignored by existing
state-of-the-art methods. First, a graph representation is constructed
according to the morphological features of the input pathological image,
and a geometric representation is learned through the proposed
multi-head graph aggregator. Then, the image and its graph
representation are fed into the transformer encoder layer to model
long-range dependencies. Finally, a locality feature enhancement block
is designed to enhance the 2D local representation of the feature
embedding, which is not well explored in existing vision transformers.
An extensive experimental study is
conducted on Kather-5K, MHIST, NCT-CRC-HE, and GasHisSDB for binary or
multi-category classification of multiple cancer types. The results
demonstrate that our method consistently achieves superior
classification outcomes for histopathological images, providing an
effective diagnostic tool for malignant tumors in clinical
practice.
AU Sun, Jiarui
Li, Qiuxuan
Liu, Yuhao
Liu, Yichuan
Coatrieux, Gouenou
Coatrieux, Jean-Louis
Chen, Yang
Lu, Jie
Pathological Asymmetry-Guided Progressive Learning for Acute Ischemic
Stroke Infarct Segmentation.
Quantitative infarct estimation is crucial for diagnosis, treatment and
prognosis in acute ischemic stroke (AIS) patients. As the early changes
of ischemic tissue are subtle and easily confounded by normal brain
tissue, it remains a very challenging task. However, existing methods
often ignore or confuse the contribution of different types of
anatomical asymmetry caused by intrinsic and pathological changes to
segmentation. Further, inefficient domain knowledge utilization leads to
mis-segmentation for AIS infarcts. Inspired by this idea, we propose a
pathological asymmetry-guided progressive learning (PAPL) method for AIS
infarct segmentation. PAPL mimics the step-by-step learning patterns
observed in humans, including three progressive stages: knowledge
preparation stage, formal learning stage, and examination improvement
stage. First, the knowledge preparation stage accumulates preparatory
domain knowledge of the infarct segmentation task, learning
domain-specific knowledge representations that enhance the
discriminative ability for pathological asymmetries through a
constructed contrastive learning task. Then, the formal learning stage
efficiently performs end-to-end training guided by the learned knowledge
representations, in which the designed feature compensation module (FCM)
leverages the anatomical similarity between adjacent slices of the
volumetric medical image to help aggregate rich anatomical context
information. Finally, the examination improvement stage refines the
infarct prediction from the previous stage, where the proposed
perception refinement strategy (RPRS) further exploits bilateral
difference comparison to correct mis-segmented infarct regions through
adaptive regional shrinking and expansion. Extensive experiments on
public and in-house NCCT datasets demonstrate the superiority of the
proposed PAPL, which shows promise for improving stroke evaluation and
treatment.
AU Wang, Hongqiu
Chen, Jian
Zhang, Shichen
He, Yuan
Xu, Jinfeng
Wu, Mengwan
He, Jinlan
Liao, Wenjun
Luo, Xiangde
Dual-Reference Source-Free Active Domain Adaptation for Nasopharyngeal
Carcinoma Tumor Segmentation across Multiple Hospitals.
Nasopharyngeal carcinoma (NPC) is a prevalent and clinically significant
malignancy that predominantly impacts the head and neck area. Precise
delineation of the Gross Tumor Volume (GTV) plays a pivotal role in
ensuring effective radiotherapy for NPC. Although recent methods have
achieved promising results on GTV segmentation, they are still limited
by the lack of carefully annotated data and by hard-to-access data from
multiple hospitals in clinical practice. Although some unsupervised
domain adaptation (UDA) methods have been proposed to alleviate this
problem, unconditionally mapping the distribution distorts the
underlying structural information, leading to inferior performance. To
address this challenge, we devise a novel Source-Free Active Domain
Adaptation framework to facilitate domain adaptation for the GTV
segmentation task.
Specifically, we design a dual reference strategy to select
domain-invariant and domain-specific representative samples from a
specific target domain for annotation and model fine-tuning without
relying on source-domain data. Our approach not only ensures data
privacy but also reduces the workload for oncologists as it just
requires annotating a few representative samples from the target domain
and does not need to access the source data. We collect a large-scale
clinical dataset comprising 1057 NPC patients from five hospitals to
validate our approach. Experimental results show that our method
outperforms the previous active learning (e.g., AADA and MHPL) and UDA
(e.g., Tent and CPR) methods, and achieves comparable results to the
fully supervised upper bound, even with few annotations, highlighting
the significant medical utility of our approach. In addition, because
there is no public dataset for multi-center NPC segmentation, we will
release the code and dataset for future research (Git).
AU Yang, Yan
Yu, Jun
Fu, Zhenqi
Zhang, Ke
Yu, Ting
Wang, Xianyun
Jiang, Hanliang
Lv, Junhui
Huang, Qingming
Han, Weidong
Token-Mixer: Bind Image and Text in One Embedding Space for Medical
Image Reporting.
Medical image reporting, which focuses on automatically generating
diagnostic reports from medical images, has garnered growing research
attention. In this task, learning cross-modal alignment between images
and reports is crucial. However, the exposure bias problem in
autoregressive text generation poses a notable challenge, as the model
is optimized by a word-level loss function using the teacher-forcing
strategy. To this end, we propose a novel Token-Mixer framework that
learns to bind image and text in one embedding space for medical image
reporting. Concretely, Token-Mixer enhances the cross-modal alignment by
matching image-to-text generation with text-to-text generation that
suffers less from exposure bias. The framework contains an image
encoder, a text encoder and a text decoder. In training, images and
paired reports are first encoded into image tokens and text tokens, and
these tokens are randomly mixed to form the mixed tokens. Then, the text
decoder accepts image tokens, text tokens or mixed tokens as prompt
tokens and conducts text generation for network optimization.
Furthermore, we introduce a tailored text decoder and an alternative
training strategy that integrate well with our Token-Mixer framework.
Extensive experiments across three publicly available datasets
demonstrate that Token-Mixer successfully enhances the image-text
alignment and thereby attains state-of-the-art performance. The code is
available at https://github.com/yangyan22/Token-Mixer.
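The prompt construction described in this abstract (image tokens, text tokens, or a random mixture of the two fed to the text decoder) can be sketched as follows. This is a minimal illustration only: the abstract does not specify the mixing rule, so the per-position coin flip, the `image_ratio` parameter, and the function names `mix_tokens` and `choose_prompt` are all assumptions rather than the authors' implementation.

```python
import random


def mix_tokens(image_tokens, text_tokens, image_ratio=0.5, seed=None):
    """Randomly mix two token sequences position by position.

    Hypothetical rule (not taken from the paper): each prompt slot is filled
    from the image tokens with probability `image_ratio`, otherwise from the
    text tokens at the same position.
    """
    rng = random.Random(seed)
    length = min(len(image_tokens), len(text_tokens))
    mixed = []
    for i in range(length):
        source = image_tokens if rng.random() < image_ratio else text_tokens
        mixed.append(source[i])
    return mixed


def choose_prompt(image_tokens, text_tokens, mode, seed=None):
    """Select the prompt passed to the text decoder: 'image', 'text' or 'mixed'."""
    if mode == "image":
        return list(image_tokens)
    if mode == "text":
        return list(text_tokens)
    if mode == "mixed":
        return mix_tokens(image_tokens, text_tokens, seed=seed)
    raise ValueError(f"unknown prompt mode: {mode}")


# Toy tokens: tuples standing in for embedding vectors.
image_tokens = [("img", i) for i in range(6)]
text_tokens = [("txt", i) for i in range(6)]
mixed_prompt = choose_prompt(image_tokens, text_tokens, "mixed", seed=0)
```

In a real model the tokens would be embedding vectors of equal width, which is what allows the decoder to accept any of the three prompt types interchangeably.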
AU Zhang, Yumin
Li, Hongliu
Gao, Yajun
Duan, Haoran
Huang, Yawen
Zheng, Yefeng
Prototype Correlation Matching and Class-Relation Reasoning for Few-Shot
Medical Image Segmentation.
Few-shot medical image segmentation has achieved great progress in
improving accuracy and efficiency of medical analysis in the biomedical
imaging field. However, most existing methods cannot explore inter-class
relations among base and novel medical classes to reason unseen novel
classes. Moreover, the same kind of medical class has large intra-class
variations brought by diverse appearances, shapes and scales, thus
causing ambiguous visual characterization to degrade generalization
performance of these existing methods on unseen novel classes. To
address the above challenges, in this paper, we propose a Prototype
correlation Matching and Class-relation Reasoning (i.e., PMCR) model.
The proposed model can effectively mitigate false pixel correlation
matches caused by large intra-class variations while reasoning
inter-class relations among different medical classes. Specifically, to
address false pixel correlation matches caused by large intra-class
variations, we propose a prototype correlation matching module that
mines representative prototypes able to characterize the diverse visual
information of different appearances. We explore prototype-level rather
than pixel-level correlation matching between support and query features
via an optimal transport algorithm to tackle false matches caused by
intra-class variations. Meanwhile, in order to
explore inter-class relations, we design a class-relation reasoning
module to segment unseen novel medical objects via reasoning inter-class
relations between base and novel classes. Such inter-class relations can
be well propagated to semantic encoding of local query features to
improve few-shot segmentation performance. Quantitative comparisons
illustrate the large performance improvement of our model over other
baseline methods.
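Prototype-level matching via optimal transport, as mentioned in this abstract, can be illustrated with a small entropy-regularised (Sinkhorn) solver. The abstract does not say which optimal transport algorithm or cost the authors use, so the Sinkhorn iteration, the cosine cost, the uniform prototype masses, and the toy prototype vectors below are assumptions chosen for illustration.

```python
import math


def cosine_cost(a, b):
    """Cost between two prototype vectors: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def sinkhorn(cost, row_mass, col_mass, epsilon=0.1, n_iters=200):
    """Entropy-regularised optimal transport plan via Sinkhorn iterations."""
    n, m = len(cost), len(cost[0])
    kernel = [[math.exp(-cost[i][j] / epsilon) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the target marginals.
        u = [row_mass[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [col_mass[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Hypothetical support and query prototypes (two each, 2-D for readability).
support = [[1.0, 0.0], [0.0, 1.0]]
query = [[0.9, 0.1], [0.1, 0.9]]
cost = [[cosine_cost(s, q) for q in query] for s in support]
plan = sinkhorn(cost, [0.5, 0.5], [0.5, 0.5])
```

Each entry `plan[i][j]` is the mass moved from support prototype `i` to query prototype `j`; most of the mass lands on the pair with the lower cost, which is the sense in which the plan acts as a soft prototype-to-prototype match.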
AU Li, Kang
Zhu, Yu
Yu, Lequan
Heng, Pheng-Ann
A Dual Enrichment Synergistic Strategy to Handle Data Heterogeneity for
Domain Incremental Cardiac Segmentation
Following remarkable progress in cardiac image segmentation,
contemporary studies are dedicated to further upgrading model
functionality toward perfection by progressively exploring sequentially
delivered datasets over time through domain incremental learning.
Existing works mainly concentrate on addressing heterogeneous style
variations but overlook the critical shape variations across domains
hidden behind the sub-disease composition discrepancy. To prevent the
updated model from catastrophically forgetting sub-diseases that were
learned in past domains but are no longer present in subsequent domains,
we propose a dual enrichment synergistic strategy to incrementally
broaden model competence for a growing number of sub-diseases. The
data-enriched
scheme aims to diversify the shape composition of current training data
via displacement-aware shape encoding and decoding, to gradually build
up the robustness against cross-domain shape variations. Meanwhile, the
model-enriched scheme intends to strengthen model capabilities by
progressively appending and consolidating the latest expertise into a
dynamically-expanded multi-expert network, to gradually cultivate the
generalization ability over style-varied domains. The two schemes work
in synergy to upgrade model capabilities in a two-pronged manner. We
have extensively evaluated our network with the
ACDC and M&Ms datasets in single-domain and compound-domain incremental
learning settings. Our approach outperformed other competing methods and
achieved comparable results to the upper bound.
AU Ren, Zhimei
Sidky, Emil Y.
Barber, Rina Foygel
Kao, Chien-Min
Pan, Xiaochuan
Simultaneous Activity and Attenuation Estimation in TOF-PET With
TV-Constrained Nonconvex Optimization
An alternating direction method of multipliers (ADMM) framework is
developed for nonsmooth biconvex optimization for inverse problems in
imaging. In particular, the simultaneous estimation of activity and
attenuation (SAA) problem in time-of-flight positron emission tomography
(TOF-PET) has such a structure when maximum likelihood estimation (MLE)
is employed. The ADMM framework is applied to MLE for SAA in TOF-PET,
resulting in the ADMM-SAA algorithm. This algorithm is extended by
imposing total variation (TV) constraints on both the activity and
attenuation map, resulting in the ADMM-TVSAA algorithm. The performance
of this algorithm is illustrated using the penalized maximum likelihood
activity and attenuation estimation (P-MLAA) algorithm as a reference.
AU Tuccio, Giulia
Afrakhteh, Sajjad
Iacca, Giovanni
Demi, Libertario
Time Efficient Ultrasound Localization Microscopy Based on A Novel
Radial Basis Function 2D Interpolation
Ultrasound localization microscopy (ULM) allows for the generation of
super-resolved (SR) images of the vasculature by precisely localizing
intravenously injected microbubbles. Although SR images may be useful
for diagnosing and treating patients, their use in the clinical context
is limited by the need for prolonged acquisition times and high frame
rates. The primary goal of our study is to relax the requirement of high
frame rates to obtain SR images. To this end, we propose a new
time-efficient ULM (TEULM) pipeline built on a cutting-edge
interpolation method. More specifically, we suggest employing Radial
Basis Functions (RBFs) as interpolators to estimate the missing values
in the 2-dimensional (2D) spatio-temporal structures. To evaluate this
strategy, we first mimic the data acquisition at a reduced frame rate by
applying a down-sampling factor (DS = 2, 4, 8, and 10) to high frame
rate ULM data. Then, we up-sample the data to the original frame rate
using the suggested interpolation to reconstruct the missing frames.
Finally, using both the original high frame rate data and the
interpolated one, we reconstruct SR images using the ULM framework
steps. We evaluate the proposed TEULM using four in vivo datasets, a Rat
brain (dataset A), a Rat kidney (dataset B), a Rat tumor (dataset C) and
a Rat brain bolus (dataset D), interpolating at the in-phase and
quadrature (IQ) level. Results demonstrate the effectiveness of TEULM in
recovering vascular structures, even at a DS rate of 10 (corresponding
to a frame rate of sub-100Hz). In conclusion, the proposed technique is
successful in reconstructing accurate SR images while requiring frame
rates of one order of magnitude lower than standard ULM.
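The down-sample-then-interpolate evaluation described in this abstract can be sketched on a toy 2D spatio-temporal grid. This is an illustration under stated assumptions, not the authors' pipeline: the Gaussian kernel, the shape parameter, the dense linear solve, and the synthetic `signal` function standing in for IQ data are all hypothetical choices, and the abstract does not say which RBF the proposed method uses.

```python
import math


def gaussian_rbf(distance, shape):
    """Gaussian radial basis function of the distance to a centre."""
    return math.exp(-(shape * distance) ** 2)


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_rbf(points, values, shape=0.5):
    """Return a function interpolating `values` given at 2D `points` (x, t)."""
    gram = [[gaussian_rbf(math.dist(p, q), shape) for q in points] for p in points]
    weights = solve_linear(gram, values)

    def interpolant(query):
        return sum(w * gaussian_rbf(math.dist(query, p), shape)
                   for w, p in zip(weights, points))

    return interpolant


def signal(x, t):
    """Toy spatio-temporal signal standing in for IQ data (illustrative only)."""
    return math.sin(0.5 * t) + 0.1 * x


DS = 2  # down-sampling factor: keep every DS-th frame
n_frames, n_pixels = 9, 3
kept = [(x, t) for t in range(0, n_frames, DS) for x in range(n_pixels)]
model = fit_rbf(kept, [signal(x, t) for x, t in kept])

# Up-sample back to the original frame rate by evaluating the dropped frames.
missing = [(x, t) for t in range(n_frames) if t % DS for x in range(n_pixels)]
recovered = {point: model(point) for point in missing}
```

The interpolant reproduces the retained frames exactly (up to round-off) and fills in the dropped ones; a dense solve like this only suits small neighbourhoods, so a practical pipeline would presumably work on local patches.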
AU Wu, Weiwen
Wang, Yanyang
Liu, Qiegen
Wang, Ge
Zhang, Jianjia
Wavelet-Improved Score-Based Generative Model for Medical Imaging
The score-based generative model (SGM) has demonstrated remarkable
performance in addressing challenging under-determined inverse problems
in medical imaging. However, acquiring high-quality training datasets
for these models remains a formidable task, especially in medical image
reconstructions. Prevalent noise perturbations or artifacts in low-dose
Computed Tomography (CT) or under-sampled Magnetic Resonance Imaging
(MRI) hinder the accurate estimation of data distribution gradients,
thereby compromising the overall performance of SGMs when trained with
these data. To alleviate this issue, we propose a wavelet-improved
denoising technique to cooperate with the SGMs, ensuring effective and
stable training. Specifically, the proposed method integrates a wavelet
sub-network and the standard SGM sub-network into a unified framework,
effectively alleviating inaccurate estimation of the data distribution
gradient and enhancing the overall stability. The mutual feedback
mechanism between the wavelet sub-network and the SGM sub-network
empowers the neural network to learn accurate scores even when handling
noisy samples. This combination results in a framework that exhibits
superior stability during the learning process, leading to the
generation of more precise and reliable reconstructed images. During the
reconstruction process, we further enhance the robustness and quality of
the reconstructed images by incorporating a regularization constraint. Our
experiments, which encompass various scenarios of low-dose and
sparse-view CT, as well as MRI with varying under-sampling rates and
masks, demonstrate the effectiveness of the proposed method, which
significantly enhances the quality of the reconstructed images. In
particular, our method with noisy training samples achieves comparable
results to those obtained using clean data.
AU Yang, Yuxuan
Wang, Hao
Wang, Jizhou
Dong, Kai
Ding, Shuai
Semantic-Preserving Surgical Video Retrieval With Phase and Behavior
Coordinated Hashing
Medical professionals rely on surgical video retrieval to discover
relevant content within large numbers of videos for surgical education
and knowledge transfer. However, the existing retrieval techniques often
fail to obtain user-expected results since they ignore valuable
semantics in surgical videos. The incorporation of rich semantics into
video retrieval is challenging in terms of the hierarchical relationship
modeling and coordination between coarse- and fine-grained semantics. To
address these issues, this paper proposes a novel semantic-preserving
surgical video retrieval (SPSVR) framework, which incorporates surgical
phase and behavior semantics using a dual-level hashing module to
capture their hierarchical relationship. This module preserves the
semantics in binary hash codes by transforming the phase and behavior
similarities into high- and low-level similarities in a shared Hamming
space. The binary codes are optimized by performing a reconstruction
task, a high-level similarity preservation task, and a low-level
similarity preservation task, using a coordinated optimization strategy
for efficient learning. A self-supervised learning scheme is adopted to
capture behavior semantics from video clips so that the indexing of
behaviors is unencumbered by fine-grained annotation and recognition.
Experiments on four surgical video datasets for two different
disciplines demonstrate the robust performance of the proposed
framework. In addition, the results of the clinical validation
experiments indicate the ability of the proposed method to retrieve the
results expected by surgeons. The code can be found at
https://github.com/trigger26/SPSVR.
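Retrieval in a shared Hamming space, as described in this abstract, comes down to ranking stored binary codes by the number of differing bits from the query code. The sketch below shows only that ranking step. The 8-bit codes and clip names are made up for illustration, and how SPSVR actually learns its codes is not shown here.

```python
def hamming_distance(code_a, code_b):
    """Number of bit positions in which two integer-encoded hash codes differ."""
    return bin(code_a ^ code_b).count("1")


def retrieve(query_code, database, top_k=2):
    """Return the names of the `top_k` clips closest to the query in Hamming space."""
    ranked = sorted(
        database.items(),
        key=lambda item: (hamming_distance(query_code, item[1]), item[0]),
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical 8-bit codes: clip_a and clip_b differ in one bit (for example,
# same phase with slightly different behaviour), clip_c differs in every bit.
database = {
    "clip_a": 0b10110010,
    "clip_b": 0b10110011,
    "clip_c": 0b01001101,
}
results = retrieve(0b10110010, database, top_k=2)
```

Because the distance is a bitwise XOR followed by a population count, ranking stays cheap even for large video collections, which is the usual motivation for hashing-based retrieval.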
AU Lu, Xu
Cui, Zengzhen
Sun, Yihua
Khor, Hee Guan
Sun, Ao
Ma, Longfei
Chen, Fang
Gao, Shan
Tian, Yun
Zhou, Fang
Lv, Yang
Liao, Hongen
Better Rough Than Scarce: Proximal Femur Fracture Segmentation With
Rough Annotations
Proximal femoral fracture segmentation in computed tomography (CT) is
essential in the preoperative planning of orthopedic surgeons. Recently,
numerous deep learning-based approaches have been proposed for
segmenting various structures within CT scans. Nevertheless,
distinguishing various attributes between fracture fragments and soft
tissue regions in CT scans frequently poses challenges, which have
received comparatively limited research attention. Besides, the
cornerstone of contemporary deep learning methodologies is the
availability of annotated data, while detailed CT annotations remain
scarce. To address the challenge, we propose a novel weakly-supervised
framework, namely Rough Turbo Net (RT-Net), for the segmentation of
proximal femoral fractures. We emphasize the utilization of human
resources to produce rough annotations on a substantial scale, as
opposed to relying on limited fine-grained annotations that demand a
substantial time to create. In RT-Net, rough annotations pose
fractured-region constraints, which have demonstrated significant
efficacy in enhancing the accuracy of the network. Conversely, the fine
annotations can provide more details for recognizing edges and soft
tissues. Besides, we design a spatial adaptive attention module (SAAM)
that adapts to the spatial distribution of the fracture regions and
aligns features in each decoder. Moreover, we propose a fine-edge loss,
applied through an edge discrimination network, to penalize absent or
imprecise edge features. Extensive quantitative and
qualitative experiments demonstrate the superiority of RT-Net to
state-of-the-art approaches. Furthermore, additional experiments show
that RT-Net has the capability to produce pseudo labels for raw CT
images that can further improve fracture segmentation performance and
has the potential to improve segmentation performance on public
datasets.
C1 Tsinghua Univ, Sch Biomed Engn, Beijing 100084, Peoples R China
C1 Tsinghua Univ, Grad Sch Shenzhen, Shenzhen 518055, Peoples R China
C1 Peking Univ Third Hosp, Dept Orthoped, Beijing 100191, Peoples R China
C1 Tsinghua Univ, Sch Biomed Engn, Beijing 100084, Peoples R China
C1 Shanghai Jiao Tong Univ, Sch Biomed Engn, Shanghai 200240, Peoples R
China
C1 Shanghai Jiao Tong Univ, Inst Med Robot, Shanghai 200240, Peoples R
China
SN 0278-0062
EI 1558-254X
DA 2024-09-18
UT WOS:001307429600012
PM 38652607
ER
AU Ruan, Guohui
Wang, Zhaonian
Liu, Chunyi
Xia, Ling
Wang, Huafeng
Qi, Li
Chen, Wufan
Magnetic Resonance Electrical Properties Tomography Based on Modified
Physics-Informed Neural Network and Multiconstraints
This paper presents a novel method based on leveraging physics-informed
neural networks for magnetic resonance electrical property tomography
(MREPT). MREPT is a noninvasive technique that can retrieve the spatial
distribution of electrical properties (EPs) of scanned tissues from
measured transmit radiofrequency (RF) in magnetic resonance imaging
(MRI) systems. The reconstruction of EP values in MREPT is achieved by
solving a partial differential equation derived from Maxwell's equations
that lacks a direct solution. Most conventional MREPT methods suffer
from artifacts caused by the invalidation of the assumption applied for
simplification of the problem and numerical errors caused by numerical
differentiation. Existing deep learning-based (DL-based) MREPT methods
comprise data-driven methods that need to collect massive datasets for
training or model-driven methods that are only validated in trivial
cases. Hence, we propose a model-driven method that learns a mapping
from a measured RF, its spatial gradient, and its Laplacian to EPs using
fully connected networks (FCNNs). The spatial gradient of the EPs can be
computed through the automatic differentiation of the FCNNs and the
chain rule. The FCNNs are optimized using the residual of the central
physical equation of convection-reaction MREPT as the loss function
($\mathcal{L}$). To alleviate the ill-conditioning of the problem, we
add multiple constraints to $\mathcal{L}$, including a similarity
constraint between permittivity and conductivity and the $\ell_1$ norm
of the spatial gradients of permittivity and conductivity. We
demonstrate the proposed method with a three-dimensional realistic head
model, a digital phantom simulation, and a practical phantom experiment
on a 9.4 T animal MRI system.
AU Xu, Jiaxing
Bian, Qingtian
Li, Xinhang
Zhang, Aihu
Ke, Yiping
Qiao, Miao
Zhang, Wei
Sim, Wei Khang Jeremy
Gulyas, Balazs
CA Alzheimers Dis Neuroimaging Initiative
Contrastive Graph Pooling for Explainable Classification of Brain
Networks
Functional magnetic resonance imaging (fMRI) is a commonly used
technique to measure neural activation. Its application has been
particularly important in identifying underlying neurodegenerative
conditions such as Parkinson's, Alzheimer's, and Autism. Recent analysis
of fMRI data models the brain as a graph and extracts features by graph
neural networks (GNNs). However, the unique characteristics of fMRI data
require a special design of GNN. Tailoring GNN to generate effective and
domain-explainable features remains challenging. In this paper, we
propose a contrastive dual-attention block and a differentiable graph
pooling method called ContrastPool to better utilize GNN for brain
networks, meeting fMRI-specific requirements. We apply our method to 5
resting-state fMRI brain network datasets of 3 diseases and demonstrate
its superiority over state-of-the-art baselines. Our case study confirms
that the patterns extracted by our method match the domain knowledge in
neuroscience literature, and disclose direct and interesting insights.
Our contributions underscore the potential of ContrastPool for advancing
the understanding of brain networks and neurodegenerative conditions.
The source code is available at
https://github.com/AngusMonroe/ContrastPool.
AU Kim, Boah
Zhuang, Yan
Mathai, Tejas Sudharshan
Summers, Ronald M
OTMorph: Unsupervised Multi-domain Abdominal Medical Image Registration
Using Neural Optimal Transport.
Deformable image registration is one of the essential processes in
analyzing medical images. In particular, when diagnosing abdominal
diseases such as hepatic cancer and lymphoma, multi-domain images
scanned from different modalities or different imaging protocols are
often used. However, they are not aligned due to scanning times, patient
breathing, movement, etc. Although recent learning-based approaches can
provide deformations in real-time with high performance, multi-domain
abdominal image registration using deep learning is still challenging
since the images in different domains have different characteristics
such as image contrast and intensity ranges. To address this, this paper
proposes a novel unsupervised multi-domain image registration framework
using neural optimal transport, dubbed OTMorph. When moving and fixed
volumes are given as input, a transport module of our proposed model
learns the optimal transport plan to map data distributions from the
moving to the fixed volumes and estimates a domain-transported volume.
Subsequently, a registration module taking the transported volume can
effectively estimate the deformation field, leading to deformation
performance improvement. Experimental results on multi-domain image
registration using multi-modality and multi-parametric abdominal medical
images demonstrate that the proposed method provides superior deformable
registration via the domain-transported image that alleviates the domain
gap between the input images. Also, we attain the improvement even on
out-of-distribution data, which indicates the superior generalizability
of our model for the registration of various medical images. Our source
code is available at https://github.com/boahK/OTMorph.
AU Ali, Rehman
Mitcham, Trevor M.
Brevett, Thurston
Agudo, Oscar Calderon
Martinez, Cristina Duran
Li, Cuiping
Doyley, Marvin M.
Duric, Nebojsa
2-D Slicewise Waveform Inversion of Sound Speed and Acoustic Attenuation
for Ring Array Ultrasound Tomography Based on a Block LU Solver
Ultrasound tomography is an emerging imaging modality that uses the
transmission of ultrasound through tissue to reconstruct images of its
mechanical properties. Initially, ray-based methods were used to
reconstruct these images, but their inability to account for diffraction
often resulted in poor resolution. Waveform inversion overcame this
limitation, providing high-resolution images of the tissue. Most
clinical implementations, often directed at breast cancer imaging,
currently rely on a frequency-domain waveform inversion to reduce
computation time. For ring arrays, ray tomography was long considered a
necessary step prior to waveform inversion in order to avoid cycle
skipping. However, in this paper, we demonstrate that frequency-domain
waveform inversion can reliably reconstruct high-resolution images of
sound speed and attenuation without relying on ray tomography to provide
an initial model. We provide a detailed description of our
frequency-domain waveform inversion algorithm with open-source code and
data that we make publicly available.
AU Lin, Wenjun
Hu, Yan
Fu, Huazhu
Yang, Mingming
Chng, Chin-Boon
Kawasaki, Ryo
Chui, Cheekong
Liu, Jiang
Instrument-Tissue Interaction Detection Framework for Surgical Video
Understanding
The instrument-tissue interaction detection task, which helps understand
surgical activities, is vital for constructing computer-assisted surgery
systems but poses many challenges. Firstly, most models represent
instrument-tissue interaction in a coarse-grained way that focuses only
on classification and lacks the ability to automatically detect
instruments and tissues. Secondly, existing works do not fully consider
the intra- and inter-frame relations of instruments and tissues. In
this paper, we propose to represent instrument-tissue interaction as a
quintuple (instrument class, instrument bounding box, tissue class,
tissue bounding box, action class) and present an Instrument-Tissue
Interaction Detection Network (ITIDNet) to detect the quintuple for
surgical video understanding. Specifically, we propose a Snippet
Consecutive Feature
(SCF) Layer to enhance features by modeling relationships of proposals
in the current frame using global context information in the video
snippet. We also propose a Spatial Corresponding Attention (SCA) Layer
to incorporate features of proposals between adjacent frames through
spatial encoding. To reason about relationships between instruments and
tissues, a Temporal Graph (TG) Layer is proposed with intra-frame
connections to exploit relationships between instruments and tissues in
the same frame and inter-frame connections to model the temporal
information for the same instance. For evaluation, we build a cataract
surgery video (PhacoQ) dataset and a cholecystectomy surgery video
(CholecQ) dataset. Experimental results demonstrate the promising
performance of our model, which outperforms other state-of-the-art
models on both datasets.
AU Wang, Kang
Zheng, Feiyang
Cheng, Lan
Dai, Hong-Ning
Dou, Qi
Qin, Jing
Breast Cancer Classification From Digital Pathology Images via
Connectivity-Aware Graph Transformer
Automated classification of breast cancer subtypes from digital
pathology images has been an extremely challenging task due to the
complicated spatial patterns of cells in the tissue micro-environment.
While newly proposed graph transformers are able to capture more
long-range dependencies to enhance accuracy, they largely ignore the
topological connectivity between graph nodes, which is nevertheless
critical for extracting more representative features to address this
difficult task. In this paper, we propose a novel connectivity-aware
graph transformer (CGT) for phenotyping the topological connectivity of
the tissue graph constructed from digital pathology images for breast
cancer classification. Our CGT seamlessly integrates the connectivity
embedding into node features at every graph transformer layer by using local
connectivity aggregation, in order to yield more comprehensive graph
representations to distinguish different breast cancer subtypes. In
light of the realistic intercellular communication mode, we then encode
the spatial distance between two arbitrary nodes as connectivity bias in
self-attention calculation, thereby allowing the CGT to distinctively
harness the connectivity embedding based on the distance of two nodes.
We extensively evaluate the proposed CGT on a large cohort of breast
carcinoma digital pathology images stained by Haematoxylin & Eosin.
Experimental results demonstrate the effectiveness of our CGT, which
outperforms state-of-the-art methods by a large margin. Code is
released on https://github.com/wang-kang-6/CGT.
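As a rough illustration of encoding pairwise spatial distance as a bias in self-attention, the sketch below subtracts a distance-dependent term from the attention logits before the softmax, so that nearby nodes attend to each other more strongly. The bias form (`-lam * distance`), the function name `distance_biased_attention`, and the toy inputs are assumptions made for illustration; the abstract does not state how CGT actually computes its connectivity bias.

```python
import math

def distance_biased_attention(queries, keys, values, coords, lam=1.0):
    """Single-head attention whose logits are penalised by node distance.

    Hypothetical sketch: bias = -lam * Euclidean distance between nodes i and j.
    """
    dim = len(queries[0])
    outputs = []
    for i, q in enumerate(queries):
        logits = []
        for j, k in enumerate(keys):
            score = sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            logits.append(score - lam * math.dist(coords[i], coords[j]))
        # Numerically stable softmax over the biased logits.
        peak = max(logits)
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value vectors.
        outputs.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return outputs
```

With `lam=0` the function reduces to plain scaled dot-product attention, which makes the effect of the distance term easy to isolate.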
AU Zhou, Lei
Zhang, Yuzhong
Zhang, Jiadong
Qian, Xuejun
Gong, Chen
Sun, Kun
Ding, Zhongxiang
Wang, Xing
Li, Zhenhui
Liu, Zaiyi
Shen, Dinggang
Prototype Learning Guided Hybrid Network for Breast Tumor Segmentation
in DCE-MRI.
Automated breast tumor segmentation on the basis of dynamic
contrast-enhanced magnetic resonance imaging (DCE-MRI) has shown
great promise in clinical practice, particularly for identifying the
presence of breast disease. However, accurate segmentation of breast
tumors is a challenging task, often necessitating the development of
complex networks. To strike an optimal tradeoff between computational
cost and segmentation performance, we propose a hybrid network that
combines convolutional neural network (CNN) and transformer layers.
Specifically, the hybrid network consists of an encoder-decoder
architecture built by stacking convolution and deconvolution layers. Effective
3D transformer layers are then implemented after the encoder
subnetworks, to capture global dependencies between the bottleneck
features. To improve the efficiency of the hybrid network, two parallel
encoder sub-networks are designed for the decoder and the transformer
layers, respectively. To further enhance the discriminative capability
of the hybrid network, a prototype learning guided prediction module is
proposed, where the category-specific prototypical features are
calculated through online clustering. All learned prototypical features
are finally combined with the features from the decoder for tumor mask
prediction. The experimental results on private and public DCE-MRI
datasets demonstrate that the proposed hybrid network outperforms
state-of-the-art (SOTA) methods while maintaining a balance between
segmentation accuracy and computation cost. Moreover, we demonstrate
that automatically generated tumor masks can be effectively applied to
distinguish the HER2-positive subtype from the HER2-negative subtype
with accuracy similar to that of analysis based on manual tumor
segmentation. The source code is available at
https://github.com/ZhouL-lab/PLHN.
AU Cai, Zhiyuan
Lin, Li
He, Huaqing
Cheng, Pujin
Tang, Xiaoying
Uni4Eye++: A General Masked Image Modeling Multi-modal Pre-training
Framework for Ophthalmic Image Classification and Segmentation.
A large-scale labeled dataset is a key factor for the success of
supervised deep learning in most ophthalmic image analysis scenarios.
However, limited annotated data is very common in ophthalmic image
analysis, since manual annotation is time-consuming and labor-intensive.
Self-supervised learning (SSL) methods bring huge opportunities for
better utilizing unlabeled data, as they do not require massive
annotations. To utilize as many unlabeled ophthalmic images as possible,
it is necessary to break the dimension barrier, simultaneously making
use of both 2D and 3D images as well as alleviating the issue of
catastrophic forgetting. In this paper, we propose a universal
self-supervised Transformer framework named Uni4Eye++ to discover
intrinsic image characteristics and capture domain-specific feature
embeddings in ophthalmic images. Uni4Eye++ can serve as a global feature
extractor, which builds its basis on a Masked Image Modeling task with a
Vision Transformer architecture. On the basis of our previous work
Uni4Eye, we further employ an image entropy guided masking strategy to
reconstruct more-informative patches and a dynamic head generator module
to alleviate modality confusion. We evaluate the performance of our
pre-trained Uni4Eye++ encoder by fine-tuning it on multiple downstream
ophthalmic image classification and segmentation tasks. The superiority
of Uni4Eye++ is successfully established through comparisons to other
state-of-the-art SSL pre-training methods. Our code is available at
GitHub.
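To make the idea of entropy-guided masking concrete, the sketch below scores each patch by the Shannon entropy of its intensity histogram and selects the highest-entropy patches for masking. The deterministic top-k rule, the function names, and the flat-list patch representation are assumptions for illustration only; the abstract does not describe how Uni4Eye++ samples its masks.

```python
import math
from collections import Counter

def patch_entropy(patch):
    """Shannon entropy (in bits) of the intensity histogram of one patch."""
    counts = Counter(patch)
    n = len(patch)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def entropy_guided_mask(patches, mask_ratio=0.5):
    """Return sorted indices of the patches to mask.

    Hypothetical rule: always mask the top `mask_ratio` fraction by entropy.
    """
    n_mask = int(len(patches) * mask_ratio)
    ranked = sorted(
        range(len(patches)),
        key=lambda i: patch_entropy(patches[i]),
        reverse=True,
    )
    return sorted(ranked[:n_mask])
```

A stochastic variant could instead sample patches with probability proportional to entropy; the deterministic rule is used here only because it is easy to check by hand.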
AU De Marco, Fabio
Andrejewski, Jana
Urban, Theresa
Willer, Konstantin
Gromann, Lukas
Koehler, Thomas
Maack, Hanns-Ingo
Herzen, Julia
Pfeiffer, Franz
X-Ray Dark-Field Signal Reduction Due to Hardening of the Visibility
Spectrum
X-ray dark-field imaging enables a spatially-resolved visualization of
ultra-small-angle X-ray scattering. Using phantom measurements, we
demonstrate that a material's effective dark-field signal may be reduced
by modification of the visibility spectrum by other dark-field-active
objects in the beam. This is the dark-field equivalent of conventional
beam-hardening, and is distinct from related, known effects, where the
dark-field signal is modified by attenuation or phase shifts. We present
a theoretical model for this group of effects and verify it by
comparison to the measurements. These findings have significant
implications for the interpretation of dark-field signal strength in
polychromatic measurements.
AU Rong, Dingyi
Zhao, Zhongyin
Wu, Yue
Ke, Bilian
Ni, Binging
Prediction of Myopia Eye Axial Elongation With Orthokeratology Treatment
via Dense I2I Based Corneal Topography Change Analysis
While orthokeratology (OK) has been shown to effectively slow the
progression of myopia, it remains unknown how spatially distributed
structural stress/tension applied to different regions affects the
change of corneal geometry, and consequently the outcome of myopia
control, at fine-grained detail. Acknowledging that the underlying
working mechanism of the OK lens is essentially mechanics-induced
refractive parameter reshaping, in this study we develop a novel
mechanics-rule-guided deep image-to-image learning framework, which
densely predicts a patient's corneal topography change according to
treatment parameters (lens geometry, wearing time, physiological
parameters, etc.), and subsequently predicts the influence on eye axial
length change after OK
treatment. Encapsulated in a U-shaped multi-resolution map-to-map
architecture, the proposed model features two major components. First,
geometric and wearing parameters of the OK lens are spatially encoded with
convolutions to form a multi-channel input volume/tensor for latent
encodings of external stress/tension applied to different regions of
the cornea. Second, these external latent force maps are progressively
down-sampled and injected into this multi-scale architecture for
predicting the change of corneal topography map. At each feature
learning layer, we formally derive a mathematical framework that simulates
the physical process of corneal deformation induced by lens-to-cornea
interaction and corneal internal tension, which is reformulated into
parameter-learnable cross-attention/self-attention modules in the
context of the transformer architecture. A total of 1854 eyes of myopia
patients are included in the study, and the results show that the
proposed model precisely predicts corneal topography change with a high
PSNR of 28.45 dB, as well as a significant accuracy gain for axial
elongation prediction (i.e., 0.0276 in MSE). It is also demonstrated
that our method provides interpretable associations between various OK
treatment parameters and the final control effect.
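For readers unfamiliar with the two metrics quoted above, the sketch below shows how MSE and PSNR are conventionally computed, using PSNR = 10 log10(MAX^2 / MSE). The peak value `max_value=1.0` and the toy inputs are assumptions; the abstract does not give the dynamic range used for the topography maps. Note also that the reported 28.45 dB (topography maps) and 0.0276 MSE (axial elongation) refer to different quantities and are not related to each other by this formula.

```python
import math

def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB; `max_value` is the assumed peak."""
    err = mse(pred, target)
    if err == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / err)
```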
AU Cheung, Chim-Lee
Wu, Mengjie
Fang, Ge
Ho, Justin D. L.
Liang, Liyuan
Tan, Kel Vin
Lin, Fa-Hsuan
Chang, Hing-Chiu
Kwok, Ka-Wai
Omnidirectional Monolithic Marker for Intra-Operative MR-Based
Positional Sensing in Closed MRI
We present a design of an inductively coupled radio frequency (ICRF)
marker for magnetic resonance (MR)-based positional tracking, enabling
the robust increase of tracking signal at all scanning orientations in
quadrature-excited closed MR imaging (MRI). The marker employs three
curved resonant circuits fully covering a cylindrical surface that
encloses the signal source. Each resonant circuit is a planar spiral
inductor with parallel plate capacitors fabricated monolithically on
flexible printed circuit board (FPC) and bent to achieve the curved
structure. The constructed marker measures Φ3 mm × 5 mm with a
quality factor > 22, and its tracking performance was validated with a
1.5 T MRI scanner. As a result, the marker remains a high positive
contrast spot under 360° rotations in 3 axes. The marker can be
accurately localized with a maximum error of 0.56 mm under a
displacement of 56 mm from the isocenter, along with an inherent
standard deviation of 0.1 mm. Owing to the high image contrast,
the presented marker enables automatic and real-time tracking in 3D
without dependency on its orientation with respect to the MRI scanner
receive coil. In combination with its small form-factor, the presented
marker would facilitate robust and wireless MR-based tracking for
intervention and clinical diagnosis. This method targets applications
that can involve rotational changes in all axes (X-Y-Z).
AU Yue, Guanghui
Zhang, Lixin
Du, Jingfeng
Zhou, Tianwei
Zhou, Wei
Lin, Weisi
Subjective and Objective Quality Assessment of Colonoscopy Videos.
Captured colonoscopy videos usually suffer from multiple real-world
distortions, such as motion blur, low brightness, abnormal exposure, and
object occlusion, which impede visual interpretation. However, existing
works mainly investigate the impacts of synthesized distortions, which
differ greatly from real-world distortions. This research aims to carry
out an in-depth study of colonoscopy Video Quality Assessment (VQA). In
this study, we advance this topic by establishing both subjective and
objective solutions. Firstly, we collect 1,000 colonoscopy videos with
typical visual quality degradation conditions in practice and construct
a multi-attribute VQA database. The quality of each video is annotated
by subjective experiments from five distortion attributes (i.e.,
temporal-spatial visibility, brightness, specular reflection, stability,
and utility), as well as an overall perspective. Secondly, we propose a
Distortion Attribute Reasoning Network (DARNet) for automatic VQA.
DARNet includes two streams to extract features related to spatial and
temporal distortions, respectively. It adaptively aggregates the
attribute-related features through a multi-attribute association module
to predict the quality score of each distortion attribute. Motivated by
the observation that the rating behaviors for all attributes are
different, a behavior guided reasoning module is further used to fuse
the attribute-aware features, resulting in the overall quality.
Experimental results on the constructed database show that our DARNet
correlates well with subjective ratings and is superior to nine
state-of-the-art methods.
AU Mineo, Raffaele
Salanitri, F. Proietto
Bellitto, G.
Kavasidis, I.
De Filippo, O.
Millesimo, M.
De Ferrari, G. M.
Aldinucci, M.
Giordano, D.
Palazzo, S.
D'Ascenzo, F.
Spampinato, C.
A Convolutional-Transformer Model for FFR and iFR Assessment From
Coronary Angiography
The quantification of stenosis severity from X-ray catheter angiography
is a challenging task. Indeed, it requires fully understanding the
lesion's geometry by analyzing the dynamics of the contrast material,
relying only on visual observation by clinicians. To support decision making
for cardiac intervention, we propose a hybrid CNN-Transformer model for
the assessment of angiography-based non-invasive fractional flow-reserve
(FFR) and instantaneous wave-free ratio (iFR) of intermediate coronary
stenosis. Our approach predicts whether a coronary artery stenosis is
hemodynamically significant and provides direct FFR and iFR estimates.
This is achieved through a combination of regression and classification
branches that forces the model to focus on the cut-off region of FFR
(around 0.8 FFR value), which is highly critical for decision-making. We
also propose a spatio-temporal factorization mechanism that redesigns
the transformer's self-attention mechanism to capture both local spatial
and temporal interactions between vessel geometry, blood flow dynamics,
and lesion morphology. The proposed method achieves state-of-the-art
performance on a dataset of 778 exams from 389 patients. Unlike existing
methods, our approach employs a single angiography view and does not
require knowledge of the key frame; supervision at training time is
provided by a classification loss (based on a threshold of the FFR/iFR
values) and a regression loss for direct estimation. Finally, the
analysis of model interpretability and calibration shows that, in spite
of the complexity of angiographic imaging data, our method can robustly
identify the location of the stenosis and correlate prediction
uncertainty to the provided output scores.
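The supervision scheme described above (a classification loss derived from a threshold on the FFR/iFR value plus a regression loss on the value itself) can be sketched as follows. The specific loss functions (binary cross-entropy and squared error), the `weight` parameter, the function name, and the convention that values at or below the 0.8 cut-off count as significant are assumptions for illustration; the abstract does not specify them.

```python
import math

def joint_ffr_loss(pred_ffr, pred_prob, true_ffr, threshold=0.8, weight=1.0):
    """Mean of squared-error regression loss plus weighted binary cross-entropy.

    pred_prob is the predicted probability that the stenosis is significant.
    The label is 1 when the measured value is <= threshold (assumed convention).
    """
    eps = 1e-7
    total = 0.0
    for f_hat, p, f in zip(pred_ffr, pred_prob, true_ffr):
        label = 1.0 if f <= threshold else 0.0
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep the logs finite
        bce = -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))
        total += (f_hat - f) ** 2 + weight * bce
    return total / len(true_ffr)
```

Because both terms depend on the same measured value, errors near the cut-off are penalised twice, once for the numeric estimate and once for landing on the wrong side of the threshold.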
AU Li, Xibao
Ouyang, Xi
Zhang, Jiadong
Ding, Zhongxiang
Zhang, Yuyao
Xue, Zhong
Shi, Feng
Shen, Dinggang
Carotid Vessel Wall Segmentation Through Domain Aligner, Topological
Learning, and Segment Anything Model for Sparse Annotation in MR Images.
Medical image analysis poses significant challenges due to limited
availability of clinical data, which is crucial for training accurate
models. This limitation is further compounded by the specialized and
labor-intensive nature of the data annotation process. For example,
despite the popularity of computed tomography angiography (CTA) in
diagnosing atherosclerosis with an abundance of annotated datasets,
magnetic resonance (MR) images stand out with better visualization for
soft plaque and vessel wall characterization. However, the higher cost
and limited accessibility of MR, as well as the time-consuming nature of
manual labeling, contribute to fewer annotated datasets. To address
these issues, we formulate a multi-modal transfer learning network,
named MT-Net, designed to learn from unpaired CTA and sparsely-annotated
MR data. Additionally, we harness the Segment Anything Model (SAM) to
synthesize additional MR annotations, enriching the training process.
Specifically, our method first segments vessel lumen regions followed by
precise characterization of carotid artery vessel walls, thereby
ensuring both segmentation accuracy and clinical relevance. Validation
of our method involved rigorous experimentation on publicly available
datasets from COSMOS and CARE-II challenge, demonstrating its superior
performance compared to existing state-of-the-art techniques.
AU Wang, Jian
Qiao, Liang
Zhou, Shichong
Zhou, Jin
Wang, Jun
Li, Juncheng
Ying, Shihui
Chang, Cai
Shi, Jun
Weakly Supervised Lesion Detection and Diagnosis for Breast Cancers With
Partially Annotated Ultrasound Images
Deep learning (DL) has proven highly effective for ultrasound-based
computer-aided diagnosis (CAD) of breast cancers. In an automatic CAD
system, lesion detection is critical for the subsequent diagnosis.
However, existing DL-based methods generally require voluminous
manually-annotated region of interest (ROI) labels and class labels to
train both the lesion detection and diagnosis models. In clinical
practice, the ROI labels, i.e., ground truths, may not always be optimal
for the classification task owing to the individual experience of
sonologists, resulting in coarse annotations that limit the diagnostic
performance of a CAD model. To address this issue, a novel Two-Stage
Detection and Diagnosis Network (TSDDNet) is proposed based on weakly
supervised learning to improve diagnostic accuracy of the
ultrasound-based CAD for breast cancers. In particular, all the initial
ROI-level labels are considered as coarse annotations before model
training. In the first training stage, a candidate selection mechanism
is then designed to refine manual ROIs in the fully annotated images and
generate accurate pseudo-ROIs for the partially annotated images under
the guidance of class labels. The training set is updated with more
accurate ROI labels for the second training stage. A fusion network is
developed to integrate detection network and classification network into
a unified end-to-end framework as the final CAD model in the second
training stage. A self-distillation strategy is designed on this model
for joint optimization to further improve its diagnostic performance.
The proposed TSDDNet is evaluated on three B-mode ultrasound datasets,
and the experimental results indicate that it achieves the best
performance on both lesion detection and diagnosis tasks, suggesting
promising application potential.
AU Liu, Yuedong
Zhou, Xuan
Wei, Cunfeng
Xu, Qiong
Sparse-view Spectral CT Reconstruction and Material Decomposition based
on Multi-channel SGM.
In medical applications, the diffusion of contrast agents in tissue can
reflect the physiological function of organisms, so it is valuable to
quantify the distribution and content of contrast agents in the body
over a period. Spectral CT has the advantages of multi-energy projection
acquisition and material decomposition, which can quantify K-edge
contrast agents. However, multiple repetitive spectral CT scans can
cause excessive radiation doses. Sparse-view scanning is commonly used
to reduce dose and scan time, but its reconstructed images are usually
accompanied by streaking artifacts, which leads to inaccurate
quantification of the contrast agents. To solve this problem, an
unsupervised sparse-view spectral CT reconstruction and material
decomposition algorithm based on the multi-channel score-based
generative model (SGM) is proposed in this paper. First, multi-energy
images and tissue images are used as multi-channel input data for SGM
training. Second, the organism is scanned multiple times in sparse views,
and the trained SGM is utilized to generate multi-energy images and
tissue images driven by sparse-view projections. After that, a material
decomposition algorithm using tissue images generated by SGM as prior
images for solving contrast agent images is established. Finally, the
distribution and content of the contrast agents are obtained. The
comparison and evaluation of this method are given in this paper, and a
series of mouse scanning experiments are carried out to verify the
effectiveness of the method.
EI 1558-254X
DA 2024-06-18
UT MEDLINE:38865221
PM 38865221
ER
AU Billot, Benjamin
Dey, Neel
Moyer, Daniel
Hoffmann, Malte
Turk, Esra Abaci
Gagoski, Borjan
Ellen Grant, P
Golland, Polina
SE(3)-Equivariant and Noise-Invariant 3D Rigid Motion Tracking in Brain
MRI.
Rigid motion tracking is paramount in many medical imaging applications
where movements need to be detected, corrected, or accounted for. Modern
strategies rely on convolutional neural networks (CNN) and pose this
problem as rigid registration. Yet, CNNs do not exploit natural
symmetries in this task, as they are equivariant to translations (their
outputs shift with their inputs) but not to rotations. Here we propose
EquiTrack, the first method that uses recent steerable SE(3)-equivariant
CNNs (E-CNN) for motion tracking. While steerable E-CNNs can extract
corresponding features across different poses, testing them on noisy
medical images reveals that they do not have enough learning capacity to
learn noise invariance. Thus, we introduce a hybrid architecture that
pairs a denoiser with an E-CNN to decouple the processing of
anatomically irrelevant intensity features from the extraction of
equivariant spatial features. Rigid transforms are then estimated in
closed-form. EquiTrack outperforms state-of-the-art learning and
optimisation methods for motion tracking in adult brain MRI and fetal
MRI time series. Our code is available at
https://github.com/BBillot/EquiTrack.
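As an illustration of what estimating a rigid transform "in closed form" can look like, the sketch below recovers the least-squares rotation and translation between two sets of corresponding points without any iterative optimisation. It is a simplified 2-D analogue chosen to stay dependency-free; the function name and the use of point correspondences are assumptions, and in 3-D the same step is commonly solved with an SVD-based (Kabsch-style) procedure. The abstract does not say which closed-form solution EquiTrack uses.

```python
import math

def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src onto dst."""
    n = len(src)
    cx_s = sum(p[0] for p in src) / n
    cy_s = sum(p[1] for p in src) / n
    cx_d = sum(p[0] for p in dst) / n
    cy_d = sum(p[1] for p in dst) / n
    dot = cross = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        # Centre both point sets so that only the rotation remains.
        xs, ys, xd, yd = xs - cx_s, ys - cy_s, xd - cx_d, yd - cy_d
        dot += xs * xd + ys * yd
        cross += xs * yd - ys * xd
    theta = math.atan2(cross, dot)  # optimal rotation angle
    c, s = math.cos(theta), math.sin(theta)
    # Translation aligns the rotated source centroid with the target centroid.
    tx = cx_d - (c * cx_s - s * cy_s)
    ty = cy_d - (s * cx_s + c * cy_s)
    return theta, (tx, ty)
```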
AU Chen, Fang
Han, Haojie
Wan, Peng
Chen, Lingyu
Kong, Wentao
Liao, Hongen
Wen, Baojie
Liu, Chunrui
Zhang, Daoqiang
Do as Sonographers Think: Contrast-enhanced Ultrasound for Thyroid
Nodules Diagnosis via Microvascular Infiltrative Awareness.
Dynamic contrast-enhanced ultrasound (CEUS) imaging can reflect the
microvascular distribution and blood flow perfusion, thereby holding
clinical significance in distinguishing between malignant and benign
thyroid nodules. Notably, CEUS offers a meticulous visualization of the
microvascular distribution surrounding the nodule, leading to an
apparent increase in tumor size compared to gray-scale ultrasound (US).
In the dual-image obtained, the lesion size enlarged from gray-scale US
to CEUS, as the microvasculature appeared to be continuously infiltrating
the surrounding tissue. Although the infiltrative dilatation of
microvasculature remains ambiguous, sonographers believe it may promote
the diagnosis of thyroid nodules. We propose a deep learning model
designed to emulate the diagnostic reasoning process employed by
sonographers. This model integrates the observation of microvascular
infiltration on dynamic CEUS, leveraging the additional insights
provided by gray-scale US for enhanced diagnostic support. Specifically,
temporal projection attention is applied along the time dimension of
dynamic CEUS to represent the microvascular perfusion. Additionally, we
employ a group of confidence maps with flexible Sigmoid Alpha Functions
to perceive and describe the infiltrative dilatation process. Moreover, a
self-adaptive integration mechanism is introduced to dynamically
integrate the assisted gray-scale US and the confidence maps of CEUS for
individual patients, ensuring a trustworthy diagnosis of thyroid
nodules. In this retrospective study, we collected a thyroid nodule
dataset of 282 CEUS videos. The method achieves a superior diagnostic
accuracy and sensitivity of 89.52% and 93.75%, respectively. These
results suggest that imitating the diagnostic thinking of sonographers,
encompassing dynamic microvascular perfusion and infiltrative expansion,
proves beneficial for CEUS-based thyroid nodule diagnosis.
EI 1558-254X
DA 2024-05-31
UT MEDLINE:38801692
PM 38801692
ER
AU Khan, M Owais
Seresti, Anahita A
Menon, Karthik
Marsden, Alison L
Nieman, Koen
Quantification and Visualization of CT Myocardial Perfusion Imaging to
Detect Ischemia-Causing Coronary Arteries.
Coronary computed tomography angiography (cCTA) has poor specificity for
identifying coronary stenoses that limit blood flow to the myocardial
tissue. Integration of dynamic CT myocardial perfusion imaging (CT-MPI)
can potentially improve the diagnostic accuracy. We propose a method
that integrates cCTA and CT-MPI to identify culprit coronary lesions
that limit blood flow to the myocardium. Coronary arteries and left
ventricle surfaces were segmented from cCTA and registered to CT-MPI.
Myocardial blood flow (MBF) was derived from CT-MPI. A ray-casting
approach was developed to project volumetric MBF onto the left ventricle
surface. The MBF volume was divided into coronary-specific territories
based on proximity to the nearest coronary artery. MBF and normalized
MBF were computed for the myocardium and each coronary artery.
Projection of MBF onto cCTA allowed for direct visualization of
perfusion defects. Normalized MBF had a higher correlation with ischemic
myocardial territory than MBF did (MBF: R^2 = 0.81; Index MBF:
R^2 = 0.90). There were 18 vessels that showed angiographic disease
(stenosis >50%); however, normalized MBF demonstrated only 5 coronary
territories to be ischemic. These findings demonstrate that cCTA and
CT-MPI can be integrated to visualize myocardial defects and detect
culprit coronary arteries responsible for perfusion defects. These
methods can allow for non-invasive detection of ischemia-causing
coronary lesions and ultimately help guide clinicians to deliver more
targeted coronary interventions.
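The territory step described above (assigning myocardium to the nearest coronary artery, then summarizing MBF per territory) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: points are plain 3D coordinates, distance is Euclidean, and "normalized MBF" is taken here as each territory's mean divided by the highest territory mean, which the abstract does not specify. All names and numbers are hypothetical.

```python
import numpy as np

def assign_territories(myocardium_pts, artery_pts, artery_labels):
    """Label each myocardial point with the label of its nearest artery point."""
    diffs = myocardium_pts[:, None, :] - artery_pts[None, :, :]
    dist = np.linalg.norm(diffs, axis=2)          # (n_myocardium, n_artery)
    return artery_labels[np.argmin(dist, axis=1)]

def territory_mean_mbf(mbf, territories):
    """Mean MBF per territory label."""
    return {int(lab): float(mbf[territories == lab].mean())
            for lab in np.unique(territories)}

# Hypothetical toy data: two arteries (labels 0 and 1), four myocardial points
artery_pts = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
artery_labels = np.array([0, 1])
myocardium_pts = np.array([[1.0, 0.0, 0.0], [2.0, 1.0, 0.0],
                           [9.0, 0.0, 0.0], [8.0, 1.0, 0.0]])
mbf = np.array([100.0, 120.0, 60.0, 50.0])

territories = assign_territories(myocardium_pts, artery_pts, artery_labels)
mean_mbf = territory_mean_mbf(mbf, territories)
# Assumed normalization: divide by the best-perfused territory
reference = max(mean_mbf.values())
normalized_mbf = {lab: value / reference for lab, value in mean_mbf.items()}
```

In this toy case the second territory has half the mean flow of the first, which is the kind of contrast a normalized index is meant to expose.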
AU Jin, Yifei
Meng, Ling-Jian
Exploration of Coincidence Detection of Cascade Photons to Enhance
Preclinical Multi-Radionuclide SPECT Imaging
We proposed a technique of coincidence detection of cascade photons
(CDCP) to enhance preclinical SPECT imaging of therapeutic radionuclides
emitting cascade photons, such as Lu-177, Ac-225, Ra-223, and In-111. We
have carried out experimental studies to evaluate the proposed
CDCP-SPECT imaging of low-activity radionuclides using a prototype
coincidence detection system constructed with large-volume cadmium zinc
telluride (CZT) imaging spectrometers and a pinhole collimator. With
In-111 in experimental studies, the CDCP technique allows us to improve
the signal-to-contamination ratio in the projection (Projection-SCR) by
~53 times and reduce ~98% of the normalized
contamination. Compared to traditional scatter correction, which
achieves a Projection-SCR of 1.00, our CDCP method boosts it to 15.91,
showing enhanced efficacy in reducing down-scattered contamination,
especially at lower activities. The reconstructed images of a line
source demonstrated the dramatic enhancement of the image quality with
CDCP-SPECT compared to conventional and triple-energy-window-corrected
SPECT data acquisition. We also introduced artificial energy blurring
and Monte Carlo simulation to quantify the impact of detector
performance, especially its energy resolution and timing resolution, on
the enhancement through the CDCP technique. We have further demonstrated
the benefits of the CDCP technique with simulation studies, which show
the potential of improving the signal-to-contamination ratio by 300
times with Ac-225, which emits cascade photons with a decay constant of
~0.1 ns. These results have demonstrated the potential of
CDCP-enhanced SPECT for imaging a super-low level of therapeutic
radionuclides in small animals.
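The core idea of coincidence detection, keeping only event pairs that two detectors record within a short timing window, can be sketched as below. This is a simplified illustration and not the authors' processing chain: timestamps are assumed to be sorted and given in nanoseconds, each event is paired at most once, and the window width and event times are hypothetical.

```python
def find_coincidences(times_a, times_b, window_ns):
    """Pair events from two sorted timestamp lists whose times differ by <= window_ns."""
    pairs = []
    i = j = 0
    while i < len(times_a) and j < len(times_b):
        dt = times_b[j] - times_a[i]
        if abs(dt) <= window_ns:
            pairs.append((times_a[i], times_b[j]))
            i += 1
            j += 1
        elif dt > 0:
            i += 1   # event in A is too early to match anything left in B
        else:
            j += 1   # event in B is too early to match anything left in A
    return pairs

# Hypothetical timestamps (ns) from two detectors
detector_a = [10.0, 250.0, 900.0]
detector_b = [12.0, 600.0, 905.0]
coincident = find_coincidences(detector_a, detector_b, window_ns=10.0)
```

Unpaired events (here 250 ns and 600 ns) are rejected, which is how uncorrelated contamination is suppressed; a narrower window rejects more random pairs but demands better detector timing resolution.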
AU Jung, Wonsik
Jeon, Eunjin
Kang, Eunsong
Suk, Heung-Il
EAG-RS: A Novel Explainability-Guided ROI-Selection Framework for ASD
Diagnosis via Inter-Regional Relation Learning
Deep learning models based on resting-state functional magnetic
resonance imaging (rs-fMRI) have been widely used to diagnose brain
diseases, particularly autism spectrum disorder (ASD). Existing studies
have leveraged the functional connectivity (FC) of rs-fMRI, achieving
notable classification performance. However, they have significant
limitations, including the lack of adequate information while using
linear low-order FC as inputs to the model, not considering individual
characteristics (i.e., different symptoms or varying stages of severity)
among patients with ASD, and the non-explainability of the decision
process. To address these limitations, we propose a novel
explainability-guided region of interest (ROI) selection (EAG-RS)
framework that identifies non-linear high-order functional associations
among brain regions by leveraging an explainable artificial intelligence
technique and selects class-discriminative regions for brain disease
identification. The proposed framework includes three steps: (i)
inter-regional relation learning to estimate non-linear relations
through random seed-based network masking, (ii) explainable
connection-wise relevance score estimation to explore high-order
relations between functional connections, and (iii) non-linear
high-order FC-based diagnosis-informative ROI selection and classifier
learning to identify ASD. We validated the effectiveness of our proposed
method by conducting experiments using the Autism Brain Imaging Database
Exchange (ABIDE) dataset, demonstrating that the proposed method
outperforms other comparative methods in terms of various evaluation
metrics. Furthermore, we qualitatively analyzed the selected ROIs and
identified ASD subtypes linked to previous neuroscientific studies.
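For context, the "linear low-order FC" that the abstract contrasts itself with is commonly computed as the Pearson correlation between ROI time series. The sketch below shows only that baseline, assuming a (timepoints x ROIs) array; it does not implement EAG-RS. The function names and the random data are illustrative.

```python
import numpy as np

def functional_connectivity(time_series):
    """Pearson correlation between ROI time series; input shape (timepoints, rois)."""
    return np.corrcoef(time_series, rowvar=False)

def upper_triangle_features(fc):
    """Flatten the unique off-diagonal connections into a feature vector."""
    rows, cols = np.triu_indices_from(fc, k=1)
    return fc[rows, cols]

# Hypothetical data: 120 timepoints, 4 ROIs
rng = np.random.default_rng(0)
time_series = rng.normal(size=(120, 4))
fc = functional_connectivity(time_series)
features = upper_triangle_features(fc)
```

With 4 ROIs there are 4 * 3 / 2 = 6 unique connections; each captures only a pairwise linear relation, which is the limitation the abstract points to.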
AU Du, Lei
Zhao, Ying
Zhang, Jianting
Shang, Muheng
Zhang, Jin
Han, Junwei
CA Alzheimers Dis Neuroimaging
Identification of Genetic Risk Factors Based on Disease Progression
Derived From Longitudinal Brain Imaging Phenotypes
Neurodegenerative disorders usually happen stage-by-stage rather than
overnight. Thus, cross-sectional brain imaging genetic methods could be
insufficient to identify genetic risk factors. Repeatedly collecting
imaging data over time appears to solve the problem. But most existing
imaging genetic methods only use longitudinal imaging phenotypes
straightforwardly, ignoring the disease progression trajectory which
might be a more stable disease signature. In this paper, we propose a
novel sparse multi-task mixed-effects longitudinal imaging genetic
method (SMMLING). In our model, disease progression fitting and genetic
risk factor identification are conducted jointly. Specifically, SMMLING
models the disease progression using longitudinal imaging phenotypes,
and then associates fitted disease progression with genetic variations.
The baseline status and changing rate, i.e., the intercept and slope, of
the progression trajectory thus shoulder the responsibility to discover
loci of interest, which would have superior and stable performance. To
facilitate interpretation and stability, we employ the $\ell_{2,1}$-norm
and the fused group lasso (FGL) penalty to identify loci at both
the individual level and group level. SMMLING can be solved by an
efficient optimization algorithm which is guaranteed to converge to the
global optimum. We evaluate SMMLING on synthetic data and real
longitudinal neuroimaging genetic data. Both results show that, compared
to existing longitudinal methods, SMMLING can not only decrease the
modeling error but also identify more accurate and relevant genetic
factors. Most risk loci reported by SMMLING are missed by comparison
methods, indicating its superiority in genetic risk factor
identification. Consequently, SMMLING could be a promising computational
method for longitudinal imaging genetics.
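Two ingredients named above can be illustrated in a few lines: the intercept and slope of a progression trajectory, and the $\ell_{2,1}$-norm. This is a simplified sketch, not SMMLING itself: it fits one subject with ordinary least squares rather than a mixed-effects model, and it uses the standard definition of the $\ell_{2,1}$-norm as the sum of row-wise Euclidean norms (assuming one row per locus). Names and numbers are hypothetical.

```python
import numpy as np

def fit_progression(times, values):
    """Least-squares line through one subject's longitudinal measurements."""
    slope, intercept = np.polyfit(times, values, deg=1)
    return float(intercept), float(slope)

def l21_norm(weights):
    """Sum over rows of the Euclidean norm of each row."""
    return float(np.sqrt((weights ** 2).sum(axis=1)).sum())

# Hypothetical phenotype declining by 0.5 units per visit from a baseline of 2.0
visit_times = np.array([0.0, 1.0, 2.0, 3.0])
phenotype = np.array([2.0, 1.5, 1.0, 0.5])
intercept, slope = fit_progression(visit_times, phenotype)

# Hypothetical weight matrix: rows = loci, columns = tasks
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])
penalty = l21_norm(W)   # 5 + 0 + 1
```

Because the penalty sums whole-row norms, it tends to drive entire rows to zero, which is why it is used to select loci jointly across tasks.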
AU Tan, Zhiwei
Shi, Fei
Zhou, Yi
Wang, Jingcheng
Wang, Meng
Peng, Yuanyuan
Xu, Kai
Liu, Ming
Chen, Xinjian
A Multi-Scale Fusion and Transformer Based Registration Guided Speckle
Noise Reduction for OCT Images
Optical coherence tomography (OCT) images are inevitably affected by
speckle noise because OCT is based on low-coherence interference.
Multi-frame averaging is one of the effective methods to reduce speckle
noise. Before averaging, the misalignment between images must be
corrected. In this paper, in order to reduce misalignment between
images caused during the acquisition, a novel multi-scale fusion and
Transformer based (MsFTMorph) method is proposed for deformable retinal
OCT image registration. The proposed method captures global connectivity
and locality with convolutional vision transformer and also incorporates
a multi-resolution fusion strategy for learning the global affine
transformation. Comparative experiments with other state-of-the-art
registration methods demonstrate that the proposed method achieves
higher registration accuracy. Guided by the registration, subsequent
multi-frame averaging shows better results in speckle noise reduction.
The noise is suppressed while the edges can be preserved. In addition,
our proposed method has strong cross-domain generalization, which can be
directly applied to images acquired by different scanners with different
modes.
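Why multi-frame averaging helps can be shown with a small simulation. The sketch assumes perfectly registered frames and models speckle as independent multiplicative noise with an exponential intensity distribution (a common idealization of fully developed speckle); it does not model the registration network. All names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_speckle(image):
    """Multiply by unit-mean exponential noise (idealized fully developed speckle)."""
    return image * rng.exponential(1.0, size=image.shape)

def speckle_contrast(image):
    """Ratio of standard deviation to mean over a uniform region."""
    return float(image.std() / image.mean())

def average_registered_frames(frames):
    """Pixel-wise mean over already-aligned frames; shape (n_frames, h, w)."""
    return np.mean(frames, axis=0)

clean = np.full((64, 64), 100.0)                     # uniform hypothetical tissue
frames = np.stack([add_speckle(clean) for _ in range(16)])
single_contrast = speckle_contrast(frames[0])
averaged_contrast = speckle_contrast(average_registered_frames(frames))
```

For N independent frames the contrast falls roughly as 1/sqrt(N), so 16 frames reduce it to about a quarter. Misregistered frames break the independence-at-the-same-pixel assumption and blur edges instead, which is why accurate registration comes first.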
AU Hooshangnejad, Hamed
China, Debarghya
Huang, Yixuan
Zbijewski, Wojciech
Uneri, Ali
McNutt, Todd
Lee, Junghoon
Ding, Kai
XIOSIS: An X-Ray-Based Intra-Operative Image-Guided Platform for
Oncology Smart Material Delivery
Image-guided interventional oncology procedures can greatly enhance the
outcome of cancer treatment. As an enhancing procedure, oncology smart
material delivery can increase cancer therapy's quality, effectiveness,
and safety. However, the effectiveness of enhancing procedures highly
depends on the accuracy of smart material placement procedures.
Inaccurate placement of smart materials can lead to adverse side effects
and health hazards. Image guidance can considerably improve the safety
and robustness of smart material delivery. In this study, we developed a
novel generative deep-learning platform that highly prioritizes clinical
practicality and provides the most informative intra-operative feedback
for image-guided smart material delivery. XIOSIS generates a
patient-specific 3D volumetric computed tomography (CT) from three
intraoperative radiographs (X-ray images) acquired by a mobile C-arm
during the operation. As the first of its kind, XIOSIS (i) synthesizes
the CT from small field-of-view radiographs; (ii) reconstructs the
intra-operative spacer distribution; (iii) is robust; and (iv) is
equipped with a novel soft-contrast cost function. To demonstrate the
effectiveness of XIOSIS in providing intra-operative image guidance, we
applied XIOSIS to the duodenal hydrogel spacer placement procedure. We
evaluated XIOSIS performance in an image-guided virtual spacer placement
and actual spacer placement in two cadaver specimens. XIOSIS showed
clinically acceptable performance, reconstructing the 3D intra-operative
hydrogel spacer distribution with an average structural similarity of
0.88 and Dice coefficient of 0.63 and with less than 1 cm difference in
spacer location relative to the spinal cord.
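For readers unfamiliar with the Dice coefficient reported above, a minimal sketch of the standard definition for binary masks follows. The function name and the toy masks are illustrative; returning 1.0 when both masks are empty is a convention chosen here, not something stated in the abstract.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumed convention)
    return float(2.0 * np.logical_and(pred, target).sum() / total)

# Hypothetical 1D masks: 3 voxels each, 2 overlapping
predicted_spacer = [1, 1, 1, 0, 0]
reference_spacer = [0, 1, 1, 1, 0]
score = dice_coefficient(predicted_spacer, reference_spacer)   # 2 * 2 / 6
```

A value of 1 means identical masks and 0 means no overlap, so the reported 0.63 indicates partial overlap of the reconstructed and reference spacer volumes.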
AU Wei, Xingyue
Ge, Lin
Huang, Lijie
Luo, Jianwen
Xu, Yan
Unsupervised Non-rigid Histological Image Registration Guided by
Keypoint Correspondences Based on Learnable Deep Features with Iterative
Training.
Histological image registration is a fundamental task in histological
image analysis. It is challenging because of substantial appearance
differences due to multiple staining. Keypoint correspondences, i.e.,
matched keypoint pairs, have been introduced to guide unsupervised deep
learning (DL) based registration methods to handle such a registration
task. This paper proposes an iterative keypoint correspondence-guided
(IKCG) unsupervised network for non-rigid histological image
registration. Fixed deep features and learnable deep features are
introduced as keypoint descriptors to automatically establish keypoint
correspondences, the distance between which is used as a loss function
to train the registration network. Fixed deep features extracted from DL
networks that are pre-trained on natural image datasets are more
discriminative than handcrafted ones, benefiting from the deep and
hierarchical nature of DL networks. The intermediate layer outputs of
the registration networks trained on histological image datasets are
extracted as learnable deep features, which reveal unique information
for histological images. An iterative training strategy is adopted to
train the registration network and optimize learnable deep features
jointly. Benefiting from the excellent matching ability of learnable
deep features optimized with the iterative training strategy, the
proposed method can solve the local non-rigid large displacement
problem, an inevitable problem usually caused by misoperation, such as
tears in producing tissue slices. The proposed method is evaluated on
the Automatic Non-rigid Histology Image Registration (ANHIR) website and
AutomatiC Registration Of Breast cAncer Tissue (ACROBAT) website. It
ranked 1st on both websites as of August 6th, 2024.
AU Zhang, Yue
Peng, Chengtao
Wang, Qiuli
Song, Dan
Li, Kaiyan
Kevin Zhou, S
Unified Multi-Modal Image Synthesis for Missing Modality Imputation.
Multi-modal medical images provide complementary soft-tissue
characteristics that aid in the screening and diagnosis of diseases.
However, limited scanning time, image corruption and various imaging
protocols often result in incomplete multi-modal images, thus limiting
the usage of multi-modal data for clinical purposes. To address this
issue, in this paper, we propose a novel unified multi-modal image
synthesis method for missing modality imputation. Overall, our method
adopts a generative adversarial architecture, which aims to synthesize
missing modalities from any combination of available ones with a single
model. To this end, we specifically design a Commonality- and
Discrepancy-Sensitive Encoder for the generator to exploit both
modality-invariant and specific information contained in input
modalities. The incorporation of both types of information facilitates
the generation of images with consistent anatomy and realistic details
of the desired distribution. Besides, we propose a Dynamic Feature
Unification Module to integrate information from a varying number of
available modalities, which enables the network to be robust to random
missing modalities. The module performs both hard integration and soft
integration, ensuring the effectiveness of feature combination while
avoiding information loss. Verified on two public multi-modal magnetic
resonance datasets, the proposed method is effective in handling various
synthesis tasks and shows superior performance compared to previous
methods.
AU Xiao, Jiayin
Li, Si
Lin, Tongxu
Zhu, Jian
Yuan, Xiaochen
Feng, David Dagan
Sheng, Bin
Multi-Label Chest X-Ray Image Classification with Single Positive
Labels.
Deep learning approaches for multi-label Chest X-ray (CXR) image
classification usually require large-scale datasets. However, acquiring
such datasets with full annotations is costly, time-consuming, and prone
to noisy labels. Therefore, we introduce a weakly supervised learning
problem called Single Positive Multi-label Learning (SPML) into CXR
image classification (abbreviated as SPML-CXR), in which only one
positive label is annotated per image. A simple solution to the SPML-CXR
problem is to assume that all the unannotated pathological labels are
negative; however, this might introduce false negative labels and decrease
the model performance. To this end, we present a Multi-level
Pseudo-label Consistency (MPC) framework for SPML-CXR. First, inspired
by the pseudo-labeling and consistency regularization in semi-supervised
learning, we construct a weak-to-strong consistency framework, where the
model prediction on a weakly-augmented image is treated as the pseudo
label for supervising the model prediction on a strongly-augmented
version of the same image, and define an Image-level Perturbation-based
Consistency (IPC) regularization to recover potentially mislabeled
positive labels. Besides, we incorporate Random Elastic Deformation
(RED) as an additional strong augmentation to enhance the perturbation.
Second, aiming to expand the perturbation space, we design a
perturbation stream to the consistency framework at the feature-level
and introduce a Feature-level Perturbation-based Consistency (FPC)
regularization as a supplement. Third, we design a Transformer-based
encoder module to explore the sample relationship within each mini-batch
by a Batch-level Transformer-based Correlation (BTC) regularization.
Extensive experiments on the CheXpert and MIMIC-CXR datasets have shown
the effectiveness of our MPC framework for solving the SPML-CXR problem.
AU Yang, Yuming
Duan, Huilong
Zheng, Yinfei
Improved Transcranial Plane-Wave Imaging With Learned Speed-of-Sound
Maps
Although transcranial ultrasound plane-wave imaging (PWI) has promising
clinical application prospects, studies have shown that variable
speed-of-sound (SoS) would seriously damage the quality of ultrasound
images. The mismatch between the conventional constant velocity
assumption and the actual SoS distribution leads to the general blurring
of ultrasound images. The optimization scheme for reconstructing
transcranial ultrasound images is often solved using iterative methods
like full-waveform inversion. These iterative methods are
computationally expensive and based on prior magnetic resonance imaging
(MRI) or computed tomography (CT) information. In contrast, the
multi-stencils fast marching (MSFM) method can produce accurate
travel-time maps for the skull with heterogeneous acoustic speed. In this
study, we first propose a convolutional neural network (CNN) to predict
SoS maps of the skull from PWI channel data. We then use these maps to
correct the travel time and reduce transcranial aberration. To validate
the performance of the proposed method, numerical, phantom and intact
human skull studies were conducted using a linear array transducer
(L11-5v, 128 elements, pitch = 0.3 mm). Numerical simulations
demonstrate that for point targets, the lateral resolution of
MSFM-restored images increased by 65%, and the center position shift
decreased by 89%. For the cyst targets, the eccentricity of the fitting
ellipse decreased by 75%, and the center position shift decreased by
58%. In the phantom study, the lateral resolution of MSFM-restored
images was increased by 49%, and the position shift was reduced by 1.72
mm. This pipeline, termed AutoSoS, thus shows the potential to correct
distortions in real-time transcranial ultrasound imaging, as
demonstrated by experiments on the intact human skull.
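The mismatch between a constant-speed assumption and a layered medium can be made concrete with a straight-ray travel-time calculation. This is a deliberately simplified sketch: it ignores refraction and does not implement fast marching or the CNN, and the layer thicknesses and speeds (including a skull speed of 2800 m/s) are illustrative assumptions rather than values from the paper.

```python
def travel_time(thicknesses_m, speeds_m_per_s):
    """One-way straight-ray travel time through stacked layers (seconds)."""
    return sum(d / c for d, c in zip(thicknesses_m, speeds_m_per_s))

# Hypothetical path: soft tissue, skull, brain
thicknesses = [0.010, 0.007, 0.030]        # metres
true_speeds = [1540.0, 2800.0, 1540.0]     # metres per second (assumed values)
assumed_speed = 1540.0                     # conventional constant-speed assumption

true_time = travel_time(thicknesses, true_speeds)
assumed_time = travel_time(thicknesses, [assumed_speed] * len(thicknesses))
timing_error = assumed_time - true_time    # delay error the beamformer would make
```

Under these assumed values the one-way error is about 2 microseconds, several wave periods at typical imaging frequencies of a few MHz, which is enough to defocus delay-and-sum beamforming. Correcting delays with a per-pixel speed map removes this error.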
AU Zhang, Binyu
Meng, Zhu
Li, Hongyuan
Zhao, Zhicheng
Su, Fei
MTCSNet: One-stage learning and two-point labeling are sufficient for
cell segmentation.
Deep convolutional neural networks have been widely used in medical image
analysis, such as lesion identification in whole-slide images, cancer
detection, and cell segmentation. However, researchers often have to
refine annotations painstakingly in order to enhance
model performance, especially for the cell segmentation task. Weakly
supervised learning can greatly reduce the workload of annotations,
while there is still a huge performance gap between the weakly and fully
supervised learning approaches. In this work, we propose a
weakly-supervised cell segmentation method, namely Multi-Task Cell
Segmentation Network (MTCSNet), for multi-modal medical images,
including pathological, brightfield, fluorescent, phase-contrast and
differential interference contrast images. MTCSNet is learnt in a
single-stage training manner, where only two annotated points for each
cell provide supervision information: the first is the cell centroid and
the second lies on its boundary. Additionally, five auxiliary tasks are
elaborately designed to train the network, including two pixel-level
classifications, a pixel-level regression, a local temperature scaling
and an instance-level distance regression task, which is proposed to
regress the distances between the cell centroid and its boundaries in
eight orientations. The experimental results indicate that our method
outperforms all state-of-the-art weakly-supervised cell segmentation
approaches on public multi-modal medical image datasets. The promising
performance also shows that single-stage learning with two-point
labeling is sufficient for cell segmentation, without fine
contour delineation. The code is available at:
https://github.com/binging512/MTCSNet.
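The regression target of the instance-level task, distances from a cell centroid to its boundary in eight orientations, can be illustrated on a binary mask. This sketch assumes the eight orientations are spaced 45 degrees apart and measures distance by stepping pixel by pixel until leaving the cell; the actual definition in MTCSNet may differ. The function name and the toy mask are hypothetical.

```python
import numpy as np

# (row step, column step) for eight orientations, 45 degrees apart (assumed)
DIRECTIONS = [(0, 1), (1, 1), (1, 0), (1, -1),
              (0, -1), (-1, -1), (-1, 0), (-1, 1)]

def boundary_distances(mask, centroid):
    """Distance in pixels from the centroid to the last in-cell pixel along each direction."""
    height, width = mask.shape
    distances = []
    for d_row, d_col in DIRECTIONS:
        row, col = centroid
        steps = 0
        # Walk while the next pixel is inside the image and still part of the cell
        while (0 <= row + d_row < height and 0 <= col + d_col < width
               and mask[row + d_row, col + d_col]):
            row += d_row
            col += d_col
            steps += 1
        distances.append(steps * float(np.hypot(d_row, d_col)))
    return distances

# Hypothetical 5x5 square cell centered in a 7x7 image
cell_mask = np.zeros((7, 7), dtype=bool)
cell_mask[1:6, 1:6] = True
targets = boundary_distances(cell_mask, (3, 3))
```

For this square cell the axis-aligned distances are 2 pixels and the diagonal ones are 2 * sqrt(2), giving an eight-value shape descriptor per cell.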
AU Zhu, Jianjun
Wang, Cheng
Zhang, Yi
Zhan, Meixiao
Zhao, Wei
Teng, Sitong
Lu, Ligong
Teng, Gao-Jun
3D/2D Vessel Registration Based on Monte Carlo Tree Search and Manifold
Regularization
The augmented intra-operative real-time imaging in vascular
interventional surgery, which is generally performed by projecting
preoperative computed tomography angiography images onto intraoperative
digital subtraction angiography (DSA) images, can compensate for the
deficiencies of DSA-based navigation, such as lack of depth information
and excessive use of toxic contrast agents. 3D/2D vessel registration is
the critical step in image augmentation. A 3D/2D registration method
based on vessel graph matching is proposed in this study. For rigid
registration, the matching of vessel graphs can be decomposed into
continuous states, thus 3D/2D vascular registration is formulated as a
search tree problem. The Monte Carlo tree search method is applied to
find the optimal vessel matching associated with the highest rigid
registration score. For nonrigid registration, we propose a novel vessel
deformation model based on manifold regularization. This model
incorporates the smoothness constraint of vessel topology into the
objective function. Furthermore, we derive simplified gradient formulas
that enable fast registration. The proposed technique is evaluated
against seven rigid and three nonrigid methods using a
variety of data (simulated, algorithmically generated, and manually
annotated) across three vascular anatomies: the hepatic artery,
coronary artery, and aorta. Our findings show the proposed method's
resistance to pose variations, noise, and deformations, outperforming
existing methods in terms of registration accuracy and computational
efficiency. The proposed method demonstrates average registration errors
of 2.14 mm and 0.34 mm for rigid and nonrigid registration, and an
average computation time of 0.51 s.
AU Guan, Yu
Yu, Chuanming
Cui, Zhuoxu
Zhou, Huilin
Liu, Qiegen
Correlated and Multi-frequency Diffusion Modeling for Highly
Under-sampled MRI Reconstruction.
Given the obstacle in accentuating the reconstruction accuracy for
diagnostically significant tissues, most existing MRI reconstruction
methods perform targeted reconstruction of the entire MR image without
considering fine details, especially when dealing with highly
under-sampled images. Therefore, considerable effort has
been directed towards surmounting this challenge, as evidenced by the
emergence of numerous methods dedicated to preserving high-frequency
content as well as fine textural details in the reconstructed image. In
this case, exploring the merits associated with each method of mining
high-frequency information and formulating a reasonable principle to
maximize the joint utilization of these approaches will be a more
effective solution to achieve accurate reconstruction. Specifically,
this work constructs an innovative principle named Correlated and
Multi-frequency Diffusion Model (CM-DM) for highly under-sampled MRI
reconstruction. In essence, the rationale underlying the establishment
of such principle lies not in assembling arbitrary models, but in
pursuing the effective combinations and replacement of components. It
also means that the novel principle focuses on forming a correlated and
multi-frequency prior through different high-frequency operators in the
diffusion process. Moreover, multi-frequency prior further constraints
the noise term closer to the target distribution in the frequency
domain, thereby making the diffusion process converge faster.
Experimental results verify that the proposed method achieved superior
reconstruction accuracy, with a notable enhancement of approximately 2dB
in PSNR compared to state-of-the-art methods.
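To put the reported gain of roughly 2 dB in context, the following minimal sketch computes PSNR from its standard definition and converts a dB gain into the equivalent reduction in mean squared error. It is not taken from the CM-DM code; the function names and the peak value of 1.0 are assumptions made for illustration.

```python
import math

def psnr(reference, estimate, peak=1.0):
    """PSNR in dB between two equally sized images given as lists of rows.

    `peak` is the maximum possible pixel value (assumed 1.0 here).
    """
    flat_ref = [v for row in reference for v in row]
    flat_est = [v for row in estimate for v in row]
    mse = sum((r - e) ** 2 for r, e in zip(flat_ref, flat_est)) / len(flat_ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

def mse_reduction_factor(gain_db):
    """Factor by which the MSE shrinks for a PSNR gain of `gain_db` at fixed peak."""
    return 10.0 ** (gain_db / 10.0)

reference = [[0.0, 0.0], [0.0, 0.0]]
estimate = [[0.1, 0.1], [0.1, 0.1]]
example_psnr = psnr(reference, estimate)   # MSE = 0.01, so about 20 dB
factor_2db = mse_reduction_factor(2.0)     # about 1.58
```

Under this definition, a 2 dB improvement at a fixed peak value corresponds to the mean squared error dropping by a factor of about 1.58, that is, to roughly 63% of its previous value.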
AU Huang, Peizhou
Zhang, Chaoyi
Zhang, Xiaoliang
Li, Xiaojuan
Dong, Liang
Ying, Leslie
Self-Supervised Deep Unrolled Reconstruction Using Regularization by
Denoising
Deep learning methods have been successfully used in various computer
vision tasks. Inspired by that success, deep learning has been explored
in magnetic resonance imaging (MRI) reconstruction. In particular,
integrating deep learning and model-based optimization methods has shown
considerable advantages. However, a large amount of labeled training
data is typically needed for high reconstruction quality, which is
challenging for some MRI applications. In this paper, we propose a novel
reconstruction method, named DURED-Net, that enables interpretable
self-supervised learning for MR image reconstruction by combining a
self-supervised denoising network and a plug-and-play method. We aim to
boost the reconstruction performance of Noise2Noise in MR reconstruction
by adding an explicit prior that utilizes imaging physics. Specifically,
a denoising network is leveraged for MRI reconstruction using
Regularization by Denoising (RED). Experimental results demonstrate
that the proposed method requires a reduced amount of training data to
achieve high reconstruction quality among the state-of-the-art
approaches utilizing Noise2Noise.
AU Pei, Yuchen
Zhao, Fenqiang
Zhong, Tao
Ma, Laifa
Liao, Lufan
Wu, Zhengwang
Wang, Li
Zhang, He
Wang, Lisheng
Li, Gang
PETS-Nets: Joint Pose Estimation and Tissue Segmentation of Fetal Brains
Using Anatomy-Guided Networks
Fetal Magnetic Resonance Imaging (MRI) is challenged by fetal movements
and maternal breathing. Although fast MRI sequences allow artifact-free
acquisition of individual 2D slices, motion frequently occurs in the
acquisition of spatially adjacent slices. Motion correction for each
slice is thus critical for the reconstruction of 3D fetal brain MRI. In
this paper, we propose a novel multi-task learning framework that adopts
a coarse-to-fine strategy to jointly learn the pose estimation
parameters for motion correction and tissue segmentation map of each
slice in fetal MRI. In particular, we design a regression-based
segmentation loss as deep supervision to learn anatomically more
meaningful features for pose estimation and segmentation. In the coarse
stage, a U-Net-like network learns the features shared for both tasks.
In the refinement stage, to fully utilize the anatomical information,
signed distance maps constructed from the coarse segmentation are
introduced to guide the feature learning for both tasks. Finally,
iterative incorporation of the signed distance maps further improves the
performance of both regression and segmentation progressively.
Experimental results of cross-validation across two different fetal
datasets acquired with different scanners and imaging protocols
demonstrate the effectiveness of the proposed method in reducing the
pose estimation error and obtaining superior tissue segmentation results
simultaneously, compared with state-of-the-art methods.
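The signed distance maps mentioned in the refinement stage can be illustrated with a small brute-force sketch. This is not the PETS-Nets implementation: the function name is hypothetical, and the sign convention (negative inside the segmented region, positive outside) is an assumption, since the abstract does not state one.

```python
import math

def signed_distance_map(mask):
    """Brute-force signed distance map for a binary 2D mask (list of rows).

    Each pixel receives the Euclidean distance to the nearest pixel of the
    opposite class: negative inside the mask, positive outside (assumed
    convention). Cost is O(N^2), so this is only suitable for tiny examples.
    """
    h, w = len(mask), len(mask[0])
    inside = [(r, c) for r in range(h) for c in range(w) if mask[r][c]]
    outside = [(r, c) for r in range(h) for c in range(w) if not mask[r][c]]
    sdm = [[0.0] * w for _ in range(h)]
    if not inside or not outside:
        return sdm  # distance is undefined when one class is empty
    for r in range(h):
        for c in range(w):
            targets = outside if mask[r][c] else inside
            d = min(math.hypot(r - tr, c - tc) for tr, tc in targets)
            sdm[r][c] = -d if mask[r][c] else d
    return sdm

# A 3x3 foreground block centered in a 5x5 image, standing in for a coarse segmentation.
coarse_mask = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)]
               for r in range(5)]
sdm = signed_distance_map(coarse_mask)
```

Unlike a binary mask, such a map tells every pixel how far it lies from the tissue boundary, which is one plausible reason it is useful as guidance for both the regression and the segmentation task.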
AU van Gogh, Stefano
Mukherjee, Subhadip
Rawlik, Michal
Pereira, Alexandre
Spindler, Simon
Zdora, Marie-Christine
Stauber, Martin
Varga, Zsuzsanna
Stampanoni, Marco
Data-Driven Gradient Regularization for Quasi-Newton Optimization in
Iterative Grating Interferometry CT Reconstruction
Grating interferometry CT (GI-CT) is a promising technology that could
play an important role in future breast cancer imaging. Thanks to its
sensitivity to refraction and small-angle scattering, GI-CT could
augment the diagnostic content of conventional absorption-based CT.
However, reconstructing GI-CT tomographies is a complex task because the
problem is ill-conditioned and noise amplitudes are high. It has previously
been shown that combining data-driven regularization with iterative
reconstruction is promising for tackling challenging inverse problems in
medical imaging. In this work, we present an algorithm that allows
seamless combination of data-driven regularization with quasi-Newton
solvers, which can better deal with ill-conditioned problems compared to
gradient descent-based optimization algorithms. Contrary to most
available algorithms, our method applies regularization in the gradient
domain rather than in the image domain. This comes with a crucial
advantage when applied in conjunction with quasi-Newton solvers: the
Hessian is approximated solely based on denoised data. We apply the
proposed method, which we call GradReg, to both conventional breast CT
and GI-CT and show that both significantly benefit from our approach in
terms of dose efficiency. Moreover, our results suggest that, thanks to
its sharper gradients, which carry more high-spatial-frequency content,
GI-CT can benefit more from GradReg than conventional breast CT.
Crucially, GradReg can be applied to any image reconstruction task which
relies on gradient-based updates.
AU Guo, Pengfei
Mei, Yiqun
Zhou, Jinyuan
Jiang, Shanshan
Patel, Vishal M.
ReconFormer: Accelerated MRI Reconstruction Using Recurrent Transformer
Accelerated magnetic resonance imaging (MRI) reconstruction is a
challenging ill-posed inverse problem because of the heavy
under-sampling in k-space. In this paper, we propose a
recurrent Transformer model, namely ReconFormer, for MRI reconstruction,
which can iteratively reconstruct high-fidelity magnetic resonance
images from highly under-sampled k-space data (e.g., up to 8x
acceleration). In particular, the proposed architecture is built upon
Recurrent Pyramid Transformer Layers (RPTLs). The core design of the
proposed method is Recurrent Scale-wise Attention (RSA), which jointly
exploits intrinsic multi-scale information at every architecture unit as
well as the dependencies of the deep feature correlation through
recurrent states. Moreover, benefiting from its recurrent nature,
ReconFormer is lightweight compared to other baselines and only contains
1.1 M trainable parameters. We validate the effectiveness of ReconFormer
on multiple datasets with different magnetic resonance sequences and
show that it achieves significant improvements over the state-of-the-art
methods with better parameter efficiency. The implementation code and
pre-trained weights are available at
https://github.com/guopengf/ReconFormer.
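The ill-posedness caused by k-space under-sampling can be seen in a one-dimensional toy example. The sketch below is not related to the ReconFormer code: it uses a naive DFT, a regular sampling pattern that keeps every 8th sample, and a zero-filled inverse transform, all of which are simplifying assumptions (practical MRI masks typically also retain a fully sampled low-frequency region).

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform, O(n^2), sufficient for a toy signal."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[m] * cmath.exp(2j * cmath.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]

def undersample(kspace, acceleration):
    """Keep every `acceleration`-th k-space sample and zero out the rest."""
    return [v if m % acceleration == 0 else 0j for m, v in enumerate(kspace)]

# A single impulse at position 3 in a length-16 signal.
signal = [0.0] * 16
signal[3] = 1.0

kspace = dft(signal)
zero_filled = idft(undersample(kspace, 8))   # 2 of 16 samples kept: 8x acceleration
magnitudes = [abs(v) for v in zero_filled]
```

With only 2 of 16 samples retained, the zero-filled result no longer contains a single impulse: copies of it with amplitude 1/8 appear at every odd position. Many different signals are consistent with the retained samples, which is why a learned or hand-crafted prior is needed to select a plausible reconstruction.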
AU Song, Zhiyun
Du, Penghui
Yan, Junpeng
Li, Kailu
Shou, Jianzhong
Lai, Maode
Fan, Yubo
Xu, Yan
Nucleus-Aware Self-Supervised Pretraining Using Unpaired Image-to-Image
Translation for Histopathology Images
Self-supervised pretraining attempts to enhance model performance by
obtaining effective features from unlabeled data, and has demonstrated
its effectiveness in the field of histopathology images. Despite its
success, few works concentrate on the extraction of nucleus-level
information, which is essential for pathologic analysis. In this work,
we propose a novel nucleus-aware self-supervised pretraining framework
for histopathology images. The framework aims to capture the nuclear
morphology and distribution information through unpaired image-to-image
translation between histopathology images and pseudo mask images. The
generation process is modulated by both conditional and stochastic style
representations, ensuring the realism and diversity of the generated
histopathology images for pretraining. Further, an instance segmentation
guided strategy is employed to capture instance-level information. The
experiments on 7 datasets show that the proposed pretraining method
outperforms supervised ones on Kather classification, multiple instance
learning, and 5 dense-prediction tasks with the transfer learning
protocol, and yields better results than other self-supervised
approaches on 8 semi-supervised tasks. Our project is publicly available
at https://github.com/zhiyuns/UNITPathSSL.
AU Fontanella, Alessandro
Mair, Grant
Wardlaw, Joanna
Trucco, Emanuele
Storkey, Amos
Diffusion Models for Counterfactual Generation and Anomaly Detection in
Brain Images.
Segmentation masks of pathological areas are useful in many medical
applications, such as brain tumour and stroke management. Moreover,
healthy counterfactuals of diseased images can be used to enhance
radiologists' training files and to improve the interpretability of
segmentation models. In this work, we present a weakly supervised method
to generate a healthy version of a diseased image and then use it to
obtain a pixel-wise anomaly map. To do so, we start by considering a
saliency map that approximately covers the pathological areas, obtained
with ACAT. Then, we propose a technique that allows targeted
modifications to these regions while preserving the rest of the image.
In particular, we employ a diffusion model trained on healthy samples
and combine Denoising Diffusion Probabilistic Model (DDPM) and Denoising
Diffusion Implicit Model (DDIM) at each step of the sampling process.
DDPM is used to modify the areas affected by a lesion within the
saliency map, while DDIM guarantees reconstruction of the normal anatomy
outside of it. The two parts are also fused at each timestep, to
guarantee the generation of a sample with a coherent appearance and a
seamless transition between edited and unedited parts. We verify that
when our method is applied to healthy samples, the input images are
reconstructed without significant modifications. We compare our approach
with alternative weakly supervised methods on the task of brain lesion
segmentation, achieving the highest mean Dice and IoU scores among the
models considered.
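The per-timestep fusion of the two sampling paths can be pictured as a masked blend. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the two inputs stand in for the DDPM and DDIM outputs at one timestep (no diffusion model is run here), and the absolute difference used for the anomaly map is an assumed choice that the abstract does not specify.

```python
def fuse(edited, preserved, mask):
    """Blend two same-sized images pixel by pixel.

    Where mask is 1 the `edited` value is used (standing in for the DDPM
    output inside the saliency map); where it is 0 the `preserved` value is
    used (standing in for the DDIM reconstruction outside it).
    """
    return [
        [m * e + (1 - m) * p for e, p, m in zip(e_row, p_row, m_row)]
        for e_row, p_row, m_row in zip(edited, preserved, mask)
    ]

def anomaly_map(diseased, healthy):
    """Pixel-wise absolute difference between an image and its healthy version
    (assumed definition, for illustration only)."""
    return [[abs(d - h) for d, h in zip(d_row, h_row)]
            for d_row, h_row in zip(diseased, healthy)]

edited = [[9, 9], [9, 9]]        # hypothetical DDPM output at one timestep
preserved = [[1, 2], [3, 4]]     # hypothetical DDIM output at the same timestep
saliency = [[1, 0], [0, 0]]      # 1 marks the suspected lesion area
fused = fuse(edited, preserved, saliency)
difference = anomaly_map([[5, 2]], [[1, 2]])
```

Applying the blend at every timestep, rather than once at the end, is what the abstract credits for the coherent appearance and seamless transition between edited and unedited parts.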
AU Wu, Jianghao
Guo, Dong
Wang, Guotai
Yue, Qiang
Yu, Huijun
Li, Kang
Zhang, Shaoting
FPL+: Filtered Pseudo Label-Based Unsupervised Cross-Modality
Adaptation for 3D Medical Image Segmentation
Adapting a medical image segmentation model to a new domain is important
for improving its cross-domain transferability. Because annotation is
expensive, Unsupervised Domain Adaptation (UDA) is appealing, as it
needs only unlabeled images for the adaptation. Existing UDA
methods are mainly based on image or feature alignment with adversarial
training for regularization, and they are limited by insufficient
supervision in the target domain. In this paper, we propose an enhanced
Filtered Pseudo Label (FPL+)-based UDA method for 3D medical image
segmentation. It first uses cross-domain data augmentation to translate
labeled images in the source domain to a dual-domain training set
consisting of a pseudo source-domain set and a pseudo target-domain set.
To leverage the dual-domain augmented images to train a pseudo label
generator, domain-specific batch normalization layers are used to deal
with the domain shift while learning the domain-invariant structure
features, generating high-quality pseudo labels for target-domain
images. We then combine labeled source-domain images and target-domain
images with pseudo labels to train a final segmentor, where image-level
weighting based on uncertainty estimation and pixel-level weighting
based on dual-domain consensus are proposed to mitigate the adverse
effect of noisy pseudo labels. Experiments on three public multi-modal
datasets for Vestibular Schwannoma, brain tumor and whole heart
segmentation show that our method surpassed ten state-of-the-art UDA
methods, and it even achieved better results than fully supervised
learning in the target domain in some cases.
AU Yang, Chen
Wang, Kailing
Wang, Yuehao
Dou, Qi
Yang, Xiaokang
Shen, Wei
Efficient Deformable Tissue Reconstruction via Orthogonal Neural Plane
Intraoperative imaging techniques for reconstructing deformable tissues
in vivo are pivotal for advanced surgical systems. Existing methods
either compromise on rendering quality or are excessively
computationally intensive, often demanding dozens of hours to perform,
which significantly hinders their practical application. In this paper,
we introduce Fast Orthogonal Plane (Forplane), a novel, efficient
framework based on neural radiance fields (NeRF) for the reconstruction
of deformable tissues. We conceptualize surgical procedures as 4D
volumes, and break them down into static and dynamic fields comprised of
orthogonal neural planes. This factorization discretizes the
four-dimensional space, leading to decreased memory usage and faster
optimization. A spatiotemporal importance sampling scheme is introduced
to improve performance in regions with tool occlusion as well as large
motions and accelerate training. An efficient ray marching method is
applied to skip sampling among empty regions, significantly improving
inference speed. Forplane accommodates both binocular and monocular
endoscopy videos, demonstrating its extensive applicability and
flexibility. Our experiments, carried out on two in vivo datasets, the
EndoNeRF and Hamlyn datasets, demonstrate the effectiveness of our
framework. In all cases, Forplane substantially accelerates both the
optimization process (by over 100 times) and the inference process (by
over 15 times) while maintaining or even improving the quality across a
variety of non-rigid deformations. This significant performance
improvement promises to be a valuable asset for future intraoperative
surgical applications. The code of our project is now available at
https://github.com/Loping151/ForPlane.
AU Xu, Zhenghua
Liu, Yunxin
Xu, Gang
Lukasiewicz, Thomas
Self-Supervised Medical Image Segmentation Using Deep Reinforced
Adaptive Masking.
Self-supervised learning aims to learn transferable representations from
unlabeled data for downstream tasks. Inspired by masked language
modeling in natural language processing, masked image modeling (MIM) has
achieved some success in the field of computer vision, but its
effectiveness in medical images remains unsatisfactory. This is mainly
due to the high redundancy and small discriminative regions in medical
images compared to natural images. Therefore, this paper proposes an
adaptive hard masking (AHM) approach based on deep reinforcement
learning to expand the application of MIM in medical images. Unlike
predefined random masks, AHM uses an asynchronous advantage actor-critic
(A3C) model to predict reconstruction loss for each patch, enabling the
model to learn where masking is valuable. By optimizing the
non-differentiable sampling process using reinforcement learning, AHM
enhances the understanding of key regions, thereby improving downstream
task performance. Experimental results on two medical image datasets
demonstrate that AHM outperforms state-of-the-art methods. Additional
experiments under various settings validate the effectiveness of AHM in
constructing masked images.
EI 1558-254X
DA 2024-08-03
UT MEDLINE:39088493
PM 39088493
ER
AU Cai, Linqin
Fang, Haodu
Xu, Nuoying
Ren, Bo
Counterfactual Causal-Effect Intervention for Interpretable Medical
Visual Question Answering.
Medical Visual Question Answering (VQA-Med) is a challenging task that
involves answering clinical questions related to medical images.
However, most current VQA-Med methods ignore the causal correlation
between specific lesion or abnormality features and answers, while also
failing to provide accurate explanations for their decisions. To explore
the interpretability of VQA-Med, this paper proposes a novel CCIS-MVQA
model for VQA-Med based on a counterfactual causal-effect intervention
strategy. This model consists of a modified ResNet for image feature
extraction, a GloVe decoder for question feature extraction, a bilinear
attention network for vision and language feature fusion, and an
interpretability generator for producing the interpretability and
prediction results. The proposed CCIS-MVQA introduces a layer-wise
relevance propagation method to automatically generate counterfactual
samples. Additionally, CCIS-MVQA applies counterfactual causal reasoning
throughout the training phase to enhance interpretability and
generalization. Extensive experiments on three benchmark datasets show
that the proposed CCIS-MVQA model outperforms the state-of-the-art
methods. Sufficient visualization results are provided to analyze the
interpretability and performance of CCIS-MVQA.
AU Zhu, Enjun
Feng, Haiyu
Chen, Long
Lai, Yongqiang
Chai, Senchun
MP-Net: A Multi-Center Privacy-Preserving Network for Medical Image
Segmentation
In this paper, we present the Multi-Center Privacy-Preserving Network
(MP-Net), a novel framework designed for secure medical image
segmentation in multi-center collaborations. Our methodology offers a
new approach to multi-center collaborative learning, capable of reducing
the volume of data transmission and enhancing data privacy protection.
Unlike federated learning, which requires the transmission of model data
between the central server and local servers in each round, our method
only necessitates a single transfer of encrypted data. The proposed
MP-Net comprises a three-layer model, consisting of encryption,
segmentation, and decryption networks. We encrypt the image data into
ciphertext using an encryption network and introduce an improved U-Net
for image ciphertext segmentation. Finally, the segmentation mask is
obtained through a decryption network. This architecture enables
ciphertext-based image segmentation through computable image encryption.
We evaluate the effectiveness of our approach on three datasets,
including two cardiac MRI datasets and a CTPA dataset. Our results
demonstrate that the MP-Net can securely utilize data from multiple
centers to establish a more robust and information-rich segmentation
model.
AU Le, Tuan-Anh
Bui, Minh Phu
Hadadian, Yaser
Gadelmowla, Khaled Mohamed
Oh, Seungjun
Im, Chaemin
Hahn, Seungyong
Yoon, Jungwon
Towards human-scale magnetic particle imaging: development of the first
system with superconductor-based selection coils.
Magnetic Particle Imaging (MPI) is an emerging tomographic modality that
allows for precise three-dimensional (3D) mapping of the concentration
and distribution of magnetic nanoparticles (MNPs). Although
significant progress has been made towards improving MPI since its
introduction, scaling it up for human applications has proven
challenging. High-quality images have been obtained in animal-scale MPI
scanners with gradients up to 7 T/m/μ0; however, for MPI systems with
bore diameters around 200 mm, the gradients generated by electromagnets
drop significantly to below 0.5 T/m/μ0. Given the current technological
limitations in image reconstruction and the properties of available
MNPs, these low gradients inherently limit improvements in MPI
resolution for higher-precision medical imaging. Utilizing
superconductors stands out as a promising approach for developing a
human-scale MPI system. In this study, we introduce, for the first time,
a human-scale amplitude-modulated (AM) MPI system with
superconductor-based selection coils. The system achieves an
unprecedented magnetic field gradient of up to 2.5 T/m/μ0 within a 200
mm bore diameter, enabling a large field of view of 100 × 130 × 98 mm³
at 2.5 T/m/μ0 for 3D imaging. While the obtained spatial resolution is on
the order of that of previous animal-scale AM MPI systems, incorporating
superconductors to achieve such high gradients in a 200 mm bore diameter
marks a major step toward clinical MPI.
AU Wang, Yanyang
Li, Zirong
Wu, Weiwen
Time-reversion Fast-sampling Score-based Model for Limited-angle CT
Reconstruction.
The score-based generative model (SGM) has received significant
attention in the field of medical imaging, particularly in the context
of limited-angle computed tomography (LACT). Traditional SGM approaches
achieved robust reconstruction performance by incorporating a
substantial number of sampling steps during the inference phase.
However, these established SGM-based methods incur a large computational
cost to reconstruct a single case. The main challenge lies in achieving
high-quality images with rapid sampling while preserving sharp edges and
small features. In this study, we propose an innovative rapid-sampling
strategy for SGM, which we have aptly named the time-reversion
fast-sampling (TIFA) score-based model for LACT reconstruction. The
entire sampling procedure adheres steadfastly to the principles of
robust optimization theory and is firmly grounded in a comprehensive
mathematical model. TIFA's rapid-sampling mechanism comprises several
essential components, including jump sampling, time-reversion with
re-sampling, and compressed sampling. In the initial jump sampling
stage, multiple sampling steps are bypassed to expedite the attainment
of preliminary results. Subsequently, during the time-reversion process,
the initial results undergo controlled corruption by introducing
small-scale noise. The re-sampling process then diligently refines the
initially corrupted results. Finally, compressed sampling fine-tunes the
refinement outcomes by imposing a regularization term. Quantitative and
qualitative assessments conducted on numerical simulations, a real
physical phantom, and clinical cardiac datasets demonstrate that the
TIFA method (using 200 steps) outperforms other state-of-the-art methods
(using 2000 steps) with projections available from [0°, 90°] and
[0°, 60°]. Furthermore, experimental results underscore that our TIFA
method continues to reconstruct high-quality images even with 10 steps.
Our code is available at https://github.com/tianzhijiaoziA/TIFADiffusion.
AU Guo, Zhanqiang
Tan, Zimeng
Feng, Jianjiang
Zhou, Jie
3D Vascular Segmentation Supervised by 2D Annotation of Maximum
Intensity Projection
Vascular structure segmentation plays a crucial role in medical analysis
and clinical applications. The practical adoption of fully supervised
segmentation models is impeded by the intricacy and time-consuming
nature of annotating vessels in the 3D space. This has spurred the
exploration of weakly-supervised approaches that reduce reliance on
expensive segmentation annotations. Despite this, existing weakly
supervised methods employed in organ segmentation, which encompass
points, bounding boxes, or graffiti, have exhibited suboptimal
performance when handling sparse vascular structures. To alleviate this
issue, we employ maximum intensity projection (MIP) to reduce the
3D volume to a 2D image for efficient annotation, and
the 2D labels are utilized to provide guidance and oversight for
training a 3D vessel segmentation model. Initially, we generate
pseudo-labels for 3D blood vessels using the annotations of 2D
projections. Subsequently, taking into account the acquisition method of
the 2D labels, we introduce a weakly-supervised network that fuses 2D-3D
deep features via MIP to further improve segmentation performance.
Furthermore, we integrate confidence learning and uncertainty estimation
to refine the generated pseudo-labels, followed by fine-tuning the
segmentation network. Our method is validated on five datasets
(including cerebral vessel, aorta and coronary artery), demonstrating
highly competitive performance in segmenting vessels and the potential
to significantly reduce the time and effort required for vessel
annotation. Our code is available at:
https://github.com/gzq17/Weakly-Supervised-by-MIP.
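Maximum intensity projection itself is a simple operation: one axis of the volume is collapsed by keeping the brightest voxel along each ray. The following minimal sketch shows this on nested lists; it is independent of the authors' repository, and the function name and the volume[z][y][x] indexing order are assumptions made for illustration.

```python
def max_intensity_projection(volume, axis=0):
    """Collapse one axis of a 3D volume, indexed volume[z][y][x], by taking
    the maximum intensity along that axis. Returns a 2D list of rows."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    if axis == 0:  # project along z, result indexed [y][x]
        return [[max(volume[z][y][x] for z in range(depth)) for x in range(width)]
                for y in range(height)]
    if axis == 1:  # project along y, result indexed [z][x]
        return [[max(volume[z][y][x] for y in range(height)) for x in range(width)]
                for z in range(depth)]
    if axis == 2:  # project along x, result indexed [z][y]
        return [[max(volume[z][y][x] for x in range(width)) for y in range(height)]
                for z in range(depth)]
    raise ValueError("axis must be 0, 1, or 2")

# A 2x2x2 toy volume; bright voxels play the role of contrast-enhanced vessels.
volume = [[[1, 5], [0, 2]],
          [[3, 4], [7, 1]]]
mip_z = max_intensity_projection(volume, axis=0)
mip_y = max_intensity_projection(volume, axis=1)
mip_x = max_intensity_projection(volume, axis=2)
```

Because vessels are typically brighter than their surroundings in angiographic volumes, they tend to survive the projection, so an annotator can label a single 2D image instead of tracing thin structures through the 3D volume. The projection discards depth, which is why the 2D labels can only supply weak supervision for the 3D model.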
AU Shi, Yongyi
Gao, Yongfeng
Xu, Qiong
Li, Yang
Zhang, Chaoyang
Mou, Xuanqin
Liang, Zhengrong
Learned Tensor Neural Network Texture Prior for Photon-Counting CT
Reconstruction.
Photon-counting computed tomography (PCCT) reconstructs multiple
energy-channel images to describe the same object, where there exists a
strong correlation among different channel images. In addition,
reconstruction of each channel image suffers from photon starvation.
To make full use of the correlation among different channel
images to suppress the data noise and enhance the texture details in
reconstructing each channel image, this paper proposes a tensor neural
network (TNN) architecture to learn a multi-channel texture prior for
PCCT reconstruction. Specifically, we first learn a spatial texture
prior in each individual channel image by modeling the relationship
between each center pixel and its neighboring pixels using a
neural network. Then, we merge the single-channel spatial texture prior
into a multi-channel neural network to learn the spectral local
correlation information among different channel images. Since our
proposed TNN is trained on a series of unpaired small spatial-spectral
cubes which are extracted from one single reference multi-channel image,
the local correlation in the spatial-spectral cubes is considered by
TNN. To boost the TNN performance, a low-rank representation is also
employed to consider the global correlation among different channel
images. Finally, we integrate the learned TNN and the low-rank
representation as priors into a Bayesian reconstruction framework. To
evaluate the performance of the proposed method, four references are
considered: simulated images from ultra-high-resolution CT, spectral
images from dual-energy CT, and animal tissue and preclinical mouse
images from a custom-made PCCT system. Our TNN-prior Bayesian
reconstruction demonstrated better performance than other
state-of-the-art competing algorithms, in terms of not only preserving
texture features but also suppressing image noise in each channel image.
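The training data described in this abstract (small spatial-spectral cubes taken from a single multi-channel reference image, with each center pixel related to its neighbors) can be sketched as follows. This is only an illustration of the data layout: the cube size `k`, the function names, and the (height, width, channel) array order are assumptions, and no network is trained here.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def extract_cubes(image, k=3):
    """Cut every k x k x C cube out of an (H, W, C) multi-channel image."""
    h, w, c = image.shape
    windows = sliding_window_view(image, (k, k, c))  # (H-k+1, W-k+1, 1, k, k, C)
    return windows.reshape(-1, k, k, c)

def split_center_neighbors(cubes):
    """Separate each cube into its center pixel (target) and neighbors (input)."""
    n, k, _, c = cubes.shape  # k is assumed to be odd
    flat = cubes.reshape(n, k * k, c)
    center_idx = (k * k) // 2
    centers = flat[:, center_idx, :]
    neighbors = np.delete(flat, center_idx, axis=1)
    return centers, neighbors

# Toy 4 x 4 image with 2 energy channels.
image = np.arange(4 * 4 * 2, dtype=np.float64).reshape(4, 4, 2)
cubes = extract_cubes(image, k=3)
centers, neighbors = split_center_neighbors(cubes)
```

A predictor mapping `neighbors` to `centers` across all channels would then play the role of the learned texture prior, under these assumptions.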
AU Tenditnaya, Anna
Gabriels, Ruben Y
Hooghiemstra, Wouter T R
Klemm, Uwe
Nagengast, Wouter B
Ntziachristos, Vasilis
Gorpas, Dimitris
Performance Assessment and Quality Control of Fluorescence Molecular
Endoscopy with a Multi-Parametric Rigid Standard.
Fluorescence molecular endoscopy (FME) is emerging as a "red-flag"
technique with potential to deliver earlier, faster, and more
personalized detection of disease in the gastrointestinal tract,
including cancer, and to gain insights into novel drug distribution,
dose finding, and response prediction. However, to date, the performance
of FME systems is assessed mainly by endoscopists during a procedure,
leading to arbitrary, potentially biased, and heavily subjective
assessment. This approach significantly affects the repeatability of the
procedures and the interpretation or comparison of the acquired data,
representing a major bottleneck towards the clinical translation of the
technology. Herein, we propose a robust methodology for FME performance
assessment and quality control that is based on a novel multi-parametric
rigid standard. This standard enables the characterization of an FME
system's sensitivity through a single acquisition, performance
comparison of multiple systems, and, for the first time, quality control
of a system as a function of time and number of usages. We show the
photostability of the standard experimentally and demonstrate how it can
be used to characterize the performance of an FME system. Moreover, we
showcase how the standard can be employed for quality control of a
system. In this study, we find that the use of composite fluorescence
standards before endoscopic procedures can ensure that an FME system
meets the performance criteria and that components prone to performance
degradation are replaced in time, avoiding disruption of clinical
endoscopy logistics. This will help overcome a major barrier to the
translation of FME into the clinic.
AU Huang, Xiaofei
Gong, Hongfang
A Dual-Attention Learning Network With Word and Sentence Embedding for
Medical Visual Question Answering
Research in medical visual question answering (MVQA) can contribute to
the development of computer-aided diagnosis. MVQA is a task that aims to
predict accurate and convincing answers based on given medical images
and associated natural language questions. This task requires extracting
feature content rich in medical knowledge and understanding it at a
fine granularity. Therefore, constructing an effective feature
extraction and understanding scheme is key to modeling. Existing MVQA
question extraction schemes mainly focus on word information, ignoring
medical information in the text, such as medical concepts and
domain-specific terms. Meanwhile, some visual and textual feature
understanding schemes cannot effectively capture the correlation between
regions and keywords for reasonable visual reasoning. In this study, a
dual-attention learning network with word and sentence embedding
(DALNet-WSE) is proposed. We design a module, transformer with sentence
embedding (TSE), to extract a double embedding representation of
questions containing keywords and medical information. A dual-attention
learning (DAL) module consisting of self-attention and guided attention
is proposed to model intensive intramodal and intermodal interactions.
With multiple DAL modules (DALs), learning visual and textual
co-attention can increase the granularity of understanding and improve
visual reasoning. Experimental results on the ImageCLEF 2019 VQA-MED
(VQA-MED 2019) and VQA-RAD datasets demonstrate that our proposed method
outperforms previous state-of-the-art methods. According to the ablation
studies and Grad-CAM maps, DALNet-WSE can extract rich textual
information and has strong visual reasoning ability.
C1 Changsha Univ Sci & Technol, Sch Math & Stat, Changsha 410114, Peoples R
China
SN 0278-0062
EI 1558-254X
DA 2024-05-25
UT WOS:001203303400010
PM 37812550
ER
AU Li, Wen
An, Nan
Cao, Fuzhi
Wang, Wenli
Wang, Chunhui
Xu, Weinan
Gao, Yang
Ning, Xiaolin
Source Imaging Method based on Spatial Smoothing and Edge Sparsity
(SISSES) and Its Application to OPM-MEG.
Source estimation in magnetoencephalography (MEG) involves solving a
highly ill-posed problem without a unique solution. Accurate estimation
of the time course and spatial extent of the source is important for
studying the mechanisms of brain activity and preoperative functional
localization. Traditional methods tend to yield small-amplitude diffuse
or large-amplitude focused source estimates. Recently, the structured
sparsity-based source imaging algorithm has emerged as one of the most
promising algorithms for improving source extent estimation. However, it
suffers from a notable amplitude bias. To improve the spatiotemporal
resolution of reconstructed sources, we propose a novel method called
the source imaging method based on spatial smoothing and edge sparsity
(SISSES). In this method, the temporal dynamics of sources are modeled
using a set of temporal basis functions, and the spatial characteristics
of the source are represented by a first-order Markov random field (MRF)
model. In particular, sparse constraints are imposed on the MRF model
residuals in the original and variation domains. Numerical simulations
were conducted to validate the SISSES. The results demonstrate that
SISSES outperforms benchmark methods for estimating the time course,
location, and extent of patch sources. Additionally, auditory and median
nerve stimulation experiments were performed using a 31-channel
optically pumped magnetometer MEG system, and the SISSES was applied to
the source imaging of these data. The results show that SISSES
correctly identified the source regions in which brain responses
occurred at different times, demonstrating its feasibility for various
practical applications.
AU Su, Jianpo
Wang, Bo
Fan, Zhipeng
Zhang, Yifan
Zeng, Ling-Li
Shen, Hui
Hu, Dewen
M2DC: A Meta-Learning Framework for Generalizable Diagnostic
Classification of Major Depressive Disorder.
Psychiatric diseases impose heavy burdens on both individual
health and social stability. Accurate and timely diagnosis of these
diseases is essential for effective treatment and intervention. Thanks
to the rapid development of brain imaging technology and machine
learning algorithms, diagnostic classification of psychiatric diseases
can be achieved based on brain images. However, due to divergences in
scanning machines or parameters, the generalization capability of
diagnostic classification models has always been an issue. We propose
Meta-learning with Meta batch normalization and Distance Constraint
(M2DC) for training diagnostic classification models. The framework can
simulate the train-test domain shift situation and promote intra-class
cohesion, as well as inter-class separation, which can lead to clearer
classification margins and more generalizable models. To better encode
dynamic brain graphs, we propose a concatenated spatiotemporal attention
graph isomorphism network (CSTAGIN) as the backbone. The network is
trained for the diagnostic classification of major depressive disorder
(MDD) based on multi-site brain graphs. Extensive experiments on brain
images from over 3261 subjects show that models trained by M2DC achieve
the best performance on cross-site diagnostic classification tasks
compared to various contemporary domain generalization methods and SOTA
studies. The proposed M2DC is so far the first framework for
multi-source closed-set domain generalizable training of diagnostic
classification models for MDD and the trained models can be applied to
reliable auxiliary diagnosis on novel data.
AU Feng, Rui
Yang, Jingwen
Huang, Hao
Chen, Zelin
Feng, Ruiyan
Farrukh Hameed, N U
Zhang, Xudong
Hu, Jie
Chen, Liang
Lu, Shuo
Spatiotemporal Microstate Dynamics of Spike-free Scalp EEG Offer a
Potential Biomarker for Refractory Temporal Lobe Epilepsy.
Refractory temporal lobe epilepsy (TLE) is one of the most frequently
observed subtypes of epilepsy and endangers more than 50 million people
worldwide. Although electroencephalogram (EEG) has been widely
recognized as a classic tool to screen and diagnose epilepsy, for many
years it has relied heavily on identifying epileptic discharges and
localizing the epileptogenic zone, which, however, limits the
understanding of refractory epilepsy given the network nature of this
disease. This work hypothesizes that microstate dynamics based on
resting-state scalp EEG can offer an additional network depiction of the
disease and provide a potential complementary evaluation tool for TLE even
without detectable epileptic discharges on EEG. We propose a novel
framework for
EEG microstate spatial-temporal dynamics (EEG-MiSTD) analysis based on
machine learning to comprehensively model millisecond-changing
whole-brain network dynamics. With only 100 seconds of resting-state EEG
even without epileptic discharges, this approach successfully
distinguishes TLE patients from healthy controls and is related to the
lateralization of the epileptic focus. In addition, microstate temporal and
spatial features are found to be widely related to clinical parameters,
which further demonstrates that TLE is a network disease. A preliminary
exploration suggests that the spatial topography is sensitive to
subsequent surgical outcomes. From this new perspective, our results
suggest that spatiotemporal microstate dynamics is potentially a
biomarker of the disease. The developed EEG-MiSTD framework can probably
be considered as a general tool to examine dynamical brain network
disruption in a user-friendly way for other types of epilepsy.
AU Cui, Yue
Li, Chengyi
Lu, Yuheng
Ma, Liang
Cheng, Luqi
Cao, Long
Yu, Shan
Jiang, Tianzi
Multimodal Connectivity-Based Individual Parcellation and Analysis for
Humans and Rhesus Monkeys
Individual brains vary greatly in morphology, connectivity and
organization. Individualized brain parcellation is capable of precisely
localizing subject-specific functional regions. However, most
individualization approaches have examined single modalities of data and
have not generalized to nonhuman primates. The present study proposed a
novel multimodal connectivity-based individual parcellation (MCIP)
method, which optimizes within-region homogeneity, spatial continuity
and similarity to a reference atlas with the fusion of personal
functional and anatomical connectivity. Comprehensive evaluation
demonstrated that MCIP outperformed state-of-the-art multimodal
individualization methods in terms of functional and anatomical
homogeneity, predictability of cognitive measures, heritability,
reproducibility and generalizability across species. Comparative
investigation showed higher topographic variability in humans than
in macaques. Therefore, MCIP provides more accurate and reliable
mapping of brain functional regions than existing methods at the
individual level across species, and could facilitate comparative and
translational neuroscience research.
AU Chikontwe, Philip
Kim, Meejeong
Jeong, Jaehoon
Sung, Hyun Jung
Go, Heounjeong
Nam, Soo Jeong
Park, Sang Hyun
FR-MIL: Distribution Re-calibration based Multiple Instance Learning
with Transformer for Whole Slide Image Classification.
In digital pathology, whole slide images (WSI) are crucial for cancer
prognostication and treatment planning. WSI classification is generally
addressed using multiple instance learning (MIL), alleviating the
challenge of processing billions of pixels and curating rich
annotations. Though recent MIL approaches leverage variants of the
attention mechanism to learn better representations, they rarely study
the properties of the data distribution itself, i.e., different staining
and acquisition protocols resulting in intra-patch and inter-slide
variations. In this work, we first introduce a distribution
re-calibration strategy to shift the feature distribution of a WSI bag
(instances) using the statistics of the max-instance (critical) feature.
Second, we enforce class (bag) separation via a metric loss assuming
that positive bags exhibit larger magnitudes than negatives. We also
introduce a generative process leveraging Vector Quantization (VQ) for
improved instance discrimination, i.e., VQ helps model bag latent factors
for improved classification. To model spatial and context information, a
position encoding module (PEM) is employed with transformer-based
pooling by multi-head self-attention (PMSA). Evaluation on popular WSI
benchmark datasets reveals that our approach improves over state-of-the-art
MIL methods. Further, we validate the general applicability of our
method on classic MIL benchmark tasks and for point cloud classification
with limited points. Code: https://github.com/PhilipChicco/FRMIL.
AU Alkan, Cagan
Mardani, Morteza
Liao, Congyu
Li, Zhitao
Vasanawala, Shreyas S
Pauly, John M
AutoSamp: Autoencoding k-space Sampling via Variational Information
Maximization for 3D MRI.
Accelerated MRI protocols routinely involve a predefined sampling
pattern that undersamples the k-space. Finding an optimal pattern can
enhance the reconstruction quality; however, this optimization is a
challenging task. To address this challenge, we introduce a novel deep
learning framework, AutoSamp, based on variational information
maximization that enables joint optimization of sampling pattern and
reconstruction of MRI scans. We represent the encoder as a non-uniform
Fast Fourier Transform that allows continuous optimization of k-space
sample locations on a non-Cartesian plane, and the decoder as a deep
reconstruction network. Experiments on public 3D acquired MRI datasets
show improved reconstruction quality of the proposed AutoSamp method
over the prevailing variable density and variable density Poisson disc
sampling for both compressed sensing and deep learning reconstructions.
We demonstrate that our data-driven sampling optimization method
achieves PSNR improvements of 4.4 dB, 2.0 dB, 0.75 dB, and 0.7 dB over
reconstruction with Poisson disc masks for acceleration factors of R =
5, 10, 15, and 25, respectively. Prospectively accelerated acquisitions with
3D FSE sequences using our optimized sampling patterns exhibit improved
image quality and sharpness. Furthermore, we analyze the characteristics
of the learned sampling patterns with respect to changes in acceleration
factor, measurement noise, underlying anatomy, and coil sensitivities.
We show that all these factors contribute to the optimization result by
affecting the sampling density, k-space coverage and point spread
functions of the learned sampling patterns.
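For readers unfamiliar with the metric behind the dB figures in this abstract, the sketch below shows one common way to compute PSNR and how a PSNR gain translates into a mean-squared-error ratio. The abstract does not say how the peak value is chosen, so using the dynamic range of the reference image is an assumption, as are the function names.

```python
import numpy as np

def psnr(reference, estimate, data_range=None):
    """Peak signal-to-noise ratio in dB between a reference and an estimate."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    if data_range is None:
        # Assumed convention: peak value = dynamic range of the reference.
        data_range = reference.max() - reference.min()
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)

def mse_ratio_from_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain at a fixed peak value."""
    return 10.0 ** (gain_db / 10.0)

reference = np.array([0.0, 1.0])
estimate = np.array([0.0, 0.9])
value_db = psnr(reference, estimate)
```

Under this definition, the reported 4.4 dB gain at R = 5 corresponds to an MSE roughly 2.75 times smaller than the Poisson disc baseline, provided both are measured against the same peak value.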
AU Pei, Jialun
Guo, Diandian
Zhang, Jingyang
Lin, Manxi
Jin, Yueming
Heng, Pheng-Ann
S2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph
Generation in OR.
Scene graph generation (SGG) of surgical procedures is crucial in
enhancing holistically cognitive intelligence in the operating room
(OR). However, previous works have primarily relied on multi-stage
learning, where the generated semantic scene graphs depend on
intermediate processes with pose estimation and object detection. This
pipeline may compromise the flexibility of learning
multimodal representations, consequently constraining the overall
effectiveness. In this study, we introduce a novel single-stage bi-modal
transformer framework for SGG in the OR, termed S2Former-OR, which aims to
leverage multi-view 2D scenes and 3D point clouds complementarily for SGG
in an end-to-end manner. Concretely, our model adopts a View-Sync
Transfusion scheme to encourage multi-view visual information
interaction. Concurrently, a Geometry-Visual Cohesion operation is
designed to integrate the synergic 2D semantic features into 3D point
cloud features. Moreover, based on the augmented feature, we propose a
novel relation-sensitive transformer decoder that embeds dynamic
entity-pair queries and relational trait priors, which enables the
direct prediction of entity-pair relations for graph generation without
intermediate steps. Extensive experiments have validated the superior
SGG performance and lower computational cost of S2Former-OR on the 4D-OR
benchmark compared with current OR-SGG methods, e.g., a 3 percentage
point increase in Precision and a 24.2M reduction in model parameters. We
further compared our method with generic single-stage SGG methods using
broader metrics for a comprehensive evaluation, and it consistently
achieved better performance. Our source code is available at:
https://github.com/PJLallen/S2Former-OR.
AU Zhang, Jingke
Huang, Chengwu
Lok, U-Wai
Dong, Zhijie
Liu, Hui
Gong, Ping
Song, Pengfei
Chen, Shigao
Enhancing Row-column array (RCA)-based 3D ultrasound vascular imaging
with spatial-temporal similarity weighting.
Ultrasound vascular imaging (UVI) is a valuable tool for monitoring
physiological states and evaluating diseases. Advancing
from conventional two-dimensional (2D) to three-dimensional (3D) UVI
would enhance vasculature visualization, thereby improving its
reliability. The row-column array (RCA) has emerged as a promising approach
for cost-effective ultrafast 3D imaging with a low channel count.
However, ultrafast RCA imaging is often hampered by high-level sidelobe
artifacts and low signal-to-noise ratio (SNR), which makes RCA-based UVI
challenging. In this study, we propose a spatial-temporal similarity
weighting (St-SW) method to overcome these challenges by exploiting the
incoherence of sidelobe artifacts and noise between datasets acquired
using orthogonal transmissions. Simulation, in vitro blood flow phantom,
and in vivo experiments were conducted to compare the proposed method
with existing orthogonal plane wave imaging (OPW), row-column-specific
frame-multiply-and-sum beamforming (RC-FMAS), and XDoppler techniques.
Qualitative and quantitative results demonstrate the superior
performance of the proposed method. In simulations, the proposed method
reduced the sidelobe level by 31.3 dB, 20.8 dB, and 14.0 dB, compared to
OPW, XDoppler, and RC-FMAS, respectively. In the blood flow phantom
experiment, the proposed method significantly improved the
contrast-to-noise ratio (CNR) of the tube by 26.8 dB, 25.5 dB, and 19.7
dB, compared to OPW, XDoppler, and RC-FMAS methods, respectively. In the
human submandibular gland experiment, it not only reconstructed a more
complete vasculature but also improved the CNR by more than 15 dB,
compared to OPW, XDoppler, and RC-FMAS methods. In summary, the proposed
method effectively suppresses the sidelobe artifacts and noise in
images collected using an RCA under low SNR conditions, leading to
improved visualization of 3D vasculatures.
AU Khan, Md Hadiur Rahman
Righetti, Raffaella
A Novel Poroelastography Method for High-quality Estimation of Lateral
Strain, Solid Stress and Fluid Pressure In Vivo.
Assessment of mechanical and transport properties of tissues using
ultrasound elasticity imaging requires accurate estimations of the
spatiotemporal distribution of volumetric strain. Due to physical
constraints such as pitch limitation and the lack of phase information
in the lateral direction, the quality of lateral strain estimation is
typically significantly lower than the quality of axial strain
estimation. In this paper, a novel lateral strain estimation technique
based on the physics of compressible porous media is developed, tested
and validated. This technique is referred to as "Poroelastography-based
Ultrasound Lateral Strain Estimation" (PULSE). PULSE differs from
previously proposed lateral strain estimators in that it uses the underlying
physics of internal fluid flow within a local region of the tissue as
its theoretical foundation. PULSE establishes a relation between
spatiotemporal changes in the axial strains and corresponding
spatiotemporal changes in the lateral strains, effectively allowing
assessment of lateral strains with quality comparable to that of axial
strain estimators. We demonstrate that PULSE can also be used to accurately
track compression-induced solid stresses and fluid pressure in cancers
using ultrasound poroelastography (USPE). In this study, we report the
theoretical formulation for PULSE and validation using finite element
(FE) and ultrasound simulations. PULSE-generated results exhibit less
than 5% percentage relative error (PRE) and greater than 90% structural
similarity index (SSIM) compared to ground truth simulations.
Experimental results are included to qualitatively assess the
performance of PULSE in vivo. The proposed method can be used to
overcome the inherent limitations of non-axial strain imaging and
improve clinical translatability of USPE.
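The abstract reports accuracy as a percentage relative error (PRE) against ground-truth simulations without giving the formula. A minimal sketch of one plausible definition, the mean absolute error relative to the ground-truth magnitude, is shown below; the exact formula used in the paper may differ, and the function name and `eps` guard are assumptions.

```python
import numpy as np

def percent_relative_error(estimate, truth, eps=1e-12):
    """Mean of |estimate - truth| / |truth|, expressed in percent."""
    estimate = np.asarray(estimate, dtype=np.float64)
    truth = np.asarray(truth, dtype=np.float64)
    # eps avoids division by zero where the ground truth vanishes.
    return 100.0 * np.mean(np.abs(estimate - truth) / (np.abs(truth) + eps))

truth = np.array([1.0, 2.0])
estimate = np.array([1.02, 1.96])
pre = percent_relative_error(estimate, truth)
```

With this definition, a strain map whose values deviate from the simulated ground truth by 2% everywhere gives a PRE of about 2, which would fall within the "less than 5%" range quoted above.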
AU Wen, Zhijie
Wu, Haixia
Ying, Shihui
Histopathology Image Classification With Noisy Labels via The Ranking
Margins
Clinically, histopathology images serve as the gold standard for
disease diagnosis. With the development of artificial intelligence,
digital histopathology significantly improves the efficiency of
diagnosis. Nevertheless, noisy labels are inevitable in histopathology
images, which lead to poor algorithm efficiency. Curriculum learning is
one of the typical methods to solve such problems. However, existing
curriculum learning methods either fail to measure the training priority
between difficult samples and noisy ones or need an extra clean dataset
to establish a valid curriculum scheme. Therefore, a new curriculum
learning paradigm is designed based on a proposed ranking function,
which is named The Ranking Margins (TRM). The ranking function measures
the 'distances' between samples and decision boundaries, which helps
distinguish difficult samples and noisy ones. The proposed method
includes three stages: the warm-up stage, the main training stage and
the fine-tuning stage. In the warm-up stage, the margin of each sample
is obtained through the ranking function. In the main training stage,
samples are progressively fed into the networks for training, starting
from those with larger margins to those with smaller ones. Label
correction is also performed in this stage. In the fine-tuning stage,
the networks are retrained on the samples with corrected labels. In
addition, we provide theoretical analysis to guarantee the feasibility
of TRM. Experiments on two representative histopathology image
datasets show that the proposed method achieves substantial improvements
over the latest Label Noise Learning (LNL) methods.
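The curriculum ordering described in this abstract (feed samples from larger margins to smaller ones) can be sketched as below. The margin used here, the predicted probability of the given label minus the largest probability among the other classes, is a stand-in chosen for illustration; it is not the paper's ranking function, and the function names and cumulative staging scheme are likewise assumptions.

```python
import numpy as np

def label_margins(probs, labels):
    """Assumed margin: p(given label) minus the best competing class probability."""
    probs = np.asarray(probs, dtype=np.float64)
    labels = np.asarray(labels)
    rows = np.arange(len(labels))
    p_label = probs[rows, labels]
    others = probs.copy()
    others[rows, labels] = -np.inf  # exclude the labelled class from the max
    return p_label - others.max(axis=1)

def curriculum_stages(margins, n_stages):
    """Cumulative index sets, ordered from the largest margin to the smallest."""
    order = np.argsort(-np.asarray(margins), kind="stable")
    n = len(order)
    return [order[: int(np.ceil(n * (k + 1) / n_stages))] for k in range(n_stages)]

probs = [[0.9, 0.1], [0.4, 0.6], [0.55, 0.45]]
labels = [0, 0, 0]
margins = label_margins(probs, labels)
stages = curriculum_stages(margins, n_stages=3)
```

In this toy case the second sample has a negative margin, so it enters training last; whether such a sample is merely difficult or carries a noisy label is the distinction the paper's ranking function is designed to make.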
C1 Shanghai Univ, Coll Sci, Dept Math, Shanghai 200444, Peoples R China
SN 0278-0062
EI 1558-254X
DA 2024-08-18
UT WOS:001285367200006
PM 38526889
ER
AU Muller, Philip
Meissen, Felix
Kaissis, Georgios
Rueckert, Daniel
Weakly Supervised Object Detection in Chest X-Rays with Differentiable
ROI Proposal Networks and Soft ROI Pooling.
Weakly supervised object detection (WSup-OD) increases the usefulness
and interpretability of image classification algorithms without
requiring additional supervision. The successes of multiple instance
learning in this task for natural images, however, do not translate well
to medical images due to the very different characteristics of their
objects (i.e. pathologies). In this work, we propose Weakly Supervised
ROI Proposal Networks (WSRPN), a new method for generating bounding box
proposals on the fly using a specialized region of interest-attention
(ROI-attention) module. WSRPN integrates well with classic backbone-head
classification algorithms and is end-to-end trainable with only
image-label supervision. We experimentally demonstrate that our new
method outperforms existing methods in the challenging task of disease
localization in chest X-ray images. Code:
https://anonymous.4open.science/r/WSRPN-DCA1.
AU He, Jie
Zhang, Haoran
Li, Yimeng
Li, Guanghui
Lei, Siao
Qian, Zhumei
Xiong, Fei
Feng, Yuan
Zhu, Tao
An, Yu
Tian, Jie
Sequential Scan-Based Single-Dimension Multi-Voxel System Matrix
Calibration for Open-Sided Magnetic Particle Imaging.
Open-sided magnetic particle imaging (OS-MPI) has garnered significant
interest due to its potential for interventional applications. However,
the system matrix calibration in OS-MPI using sequential scans is a
time-consuming task and susceptible to the low signal-to-noise ratio
(SNR) resulting from the small calibration sample size. These challenges
have hindered the practical implementation of system matrix-based
reconstruction for sequentially scanned OS-MPI. To address these issues,
we propose a novel calibration method, named sequential scan-based
single-dimension multi-voxel calibration (SS-SDMVC), to efficiently
obtain a high-SNR system matrix. This method was implemented in a
cylindrical field of view (FOV), where a bar calibration sample parallel
to the field-free line (FFL) was shifted along a fixed radial direction.
A standard image reconstruction process was also introduced to verify
the feasibility of SS-SDMVC. Through simulations, we analyzed the
effects of noise levels and scanner imperfections on the SS-SDMVC-based
reconstruction and demonstrated its robustness. In experiments, we
compared the imaging performance of SS-SDMVC and the sequential
scan-based traditional cubic-FOV SMC. The results showed that SS-SDMVC
reduced the number of measurements by a factor of 210.94 and achieved
higher reconstruction quality. Therefore, SS-SDMVC is expected to
improve the reconstruction quality of human-scale or high-gradient FFL
MPI scanners.
AU Luo, Xiang
Li, Zhongyu
Xu, Canhua
Zhang, Bite
Zhang, Liangliang
Zhu, Jihua
Huang, Peng
Wang, Xin
Yang, Meng
Chang, Shi
Semi-Supervised Thyroid Nodule Detection in Ultrasound Videos
Deep learning techniques have been investigated for the computer-aided
diagnosis of thyroid nodules in ultrasound images. However, most
existing thyroid nodule detection methods were simply based on static
ultrasound images, which cannot fully exploit the spatial and temporal
information available during the clinical examination process. In this paper,
we propose a novel video-based semi-supervised framework for ultrasound
thyroid nodule detection. Specifically, considering that clinical examinations
need to detect thyroid nodules at the ultrasonic probe positions,
we first construct an adjacent frame guided detection backbone network
by using adjacent supporting reference frames. To further reduce the
labour-intensive thyroid nodule annotation in ultrasound videos, we
extend the video-based detection in a semi-supervised manner by using
both labeled and unlabeled videos. Based on the detection consistency in
sequential neighbouring frames, a pseudo label adaptation strategy is
proposed for the refinement of unpredicted frames. The proposed
framework is validated on 996 transverse viewed and 1088 longitudinal
viewed ultrasound videos. Experimental results demonstrated the superior
performance of our proposed method in the ultrasound video-based
detection of thyroid nodules.
AU Pei, Chenhao
Wu, Fuping
Yang, Mingjing
Pan, Lin
Ding, Wangbin
Dong, Jinwei
Huang, Liqin
Zhuang, Xiahai
Multi-Source Domain Adaptation for Medical Image Segmentation
Unsupervised domain adaptation (UDA) aims to mitigate the performance
drop of models tested on the target domain, which is caused by the domain
shift between the sources and the target. Most UDA segmentation methods
focus on the scenario of a single source domain. However, in practical
situations data with a gold standard could be available from multiple
sources (domains), and the multi-source training data could provide more
information for knowledge transfer. How to utilize them to achieve
better domain adaptation remains to be further explored. This work
investigates multi-source UDA and proposes a new framework for medical
image segmentation. Firstly, we employ a multi-level adversarial
learning scheme to adapt features at different levels between each of
the source domains and the target, to improve the segmentation
performance. Then, we propose a multi-model consistency loss to transfer
the learned multi-source knowledge to the target domain simultaneously.
Finally, we validated the proposed framework on two applications, i.e.,
multi-modality cardiac segmentation and cross-modality liver
segmentation. The results showed our method delivered promising
performance and compared favorably to state-of-the-art approaches.
AU Qu, Gang
Orlichenko, Anton
Wang, Junqi
Zhang, Gemeng
Xiao, Li
Zhang, Kun
Wilson, Tony W.
Stephen, Julia M.
Calhoun, Vince D.
Wang, Yu-Ping
Interpretable Cognitive Ability Prediction: A Comprehensive Gated Graph
Transformer Framework for Analyzing Functional Brain Networks
Graph convolutional deep learning has emerged as a promising method to
explore the functional organization of the human brain in neuroscience
research. This paper presents a novel framework that utilizes the gated
graph transformer (GGT) model to predict individuals' cognitive ability
based on functional connectivity (FC) derived from fMRI. Our framework
incorporates prior spatial knowledge and uses a random-walk diffusion
strategy that captures the intricate structural and functional
relationships between different brain regions. Specifically, our
approach employs learnable structural and positional encodings (LSPE) in
conjunction with a gating mechanism to efficiently disentangle the
learning of positional encoding (PE) and graph embeddings. Additionally,
we utilize the attention mechanism to derive multi-view node feature
embeddings and dynamically distribute propagation weights between each
node and its neighbors, which facilitates the identification of
significant biomarkers from functional brain networks and thus enhances
the interpretability of the findings. To evaluate our proposed model in
cognitive ability prediction, we conduct experiments on two large-scale
brain imaging datasets: the Philadelphia Neurodevelopmental Cohort (PNC)
and the Human Connectome Project (HCP). The results show that our
approach not only outperforms existing methods in prediction accuracy
but also provides superior explainability, which can be used to identify
important FCs underlying cognitive behaviors.
AU Yang, Han
Wang, Qiuli
Zhang, Yue
An, Zhulin
Liu, Chen
Zhang, Xiaohong
Zhou, S. Kevin
Lung Nodule Segmentation and Uncertain Region Prediction With an
Uncertainty-Aware Attention Mechanism
Radiologists possess diverse training and clinical experiences, leading
to variations in the segmentation annotations of lung nodules and
resulting in segmentation uncertainty. Conventional methods typically
select a single annotation as the learning target or attempt to learn a
latent space comprising multiple annotations. However, these approaches
fail to leverage the valuable information inherent in the consensus and
disagreements among the multiple annotations. In this paper, we propose
an Uncertainty-Aware Attention Mechanism (UAAM) that utilizes consensus
and disagreements among multiple annotations to facilitate better
segmentation. To this end, we introduce the Multi-Confidence Mask (MCM),
which combines a Low-Confidence (LC) Mask and a High-Confidence (HC)
Mask. The LC mask indicates regions with low segmentation confidence,
where radiologists may have different segmentation choices. Following
UAAM, we further design an Uncertainty-Guide Multi-Confidence
Segmentation Network (UGMCS-Net), which contains three modules: a
Feature Extracting Module that captures a general feature of a lung
nodule, an Uncertainty-Aware Module that produces three features for the
annotations' union, intersection, and annotation set, and an
Intersection-Union Constraining Module that uses distances between the
three features to balance the predictions of final segmentation and MCM.
To comprehensively demonstrate the performance of our method, we propose
a Complex-Nodule Validation on LIDC-IDRI, which tests UGMCS-Net's
segmentation performance on lung nodules that are difficult to segment
using common methods. Experimental results demonstrate that our method
can significantly improve the segmentation performance on nodules that
are difficult to segment using conventional methods.
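The role of consensus and disagreement among annotations can be illustrated with a small sketch. It assumes, for illustration only, that the High-Confidence mask is the intersection of all annotations and the Low-Confidence mask is the union minus the intersection; the abstract does not state these definitions, and `multi_confidence_masks` is a hypothetical helper.

```python
import numpy as np

def multi_confidence_masks(annotations):
    """Return (high_conf, low_conf) boolean masks from a stack of binary annotations.

    Assumption: high confidence = every annotator marked the pixel;
    low confidence = at least one annotator, but not all, marked the pixel.
    """
    stack = np.asarray(annotations, dtype=bool)
    union = stack.any(axis=0)
    intersection = stack.all(axis=0)
    low_conf = union & ~intersection
    return intersection, low_conf

# Three annotators labelling the same four pixels (1 = nodule).
annotations = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
high_conf, low_conf = multi_confidence_masks(annotations)
```

Here only the third pixel is marked by all annotators, while the second and fourth are disputed and fall into the low-confidence region.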
AU Ling, Yating
Wang, Yuling
Dai, Wenli
Yu, Jie
Liang, Ping
Kong, Dexing
MTANet: Multi-Task Attention Network for Automatic Medical Image
Segmentation and Classification
Medical image segmentation and classification are two key
steps in computer-aided clinical diagnosis. Regions of interest are
usually segmented in a proper manner to extract useful features for
further disease classification. However, these methods are
computationally complex and time-consuming. In this paper, we propose a
one-stage multi-task attention network (MTANet) which efficiently
classifies objects in an image while generating a high-quality
segmentation mask for each medical object. A reverse addition attention
module was designed in the segmentation task to fuse areas in the global
map and boundary cues in high-resolution features, and an attention
bottleneck module was used in the classification task for image feature
and clinical feature fusion. We evaluated the performance of MTANet with
CNN-based and transformer-based architectures across three imaging
modalities for different tasks: CVC-ClinicDB dataset for polyp
segmentation, ISIC-2018 dataset for skin lesion segmentation, and our
private ultrasound dataset for liver tumor segmentation and
classification. Our proposed model outperformed state-of-the-art models
on all three datasets and was superior to all 25 radiologists for liver
tumor diagnosis.
AU Chen, Zeyuan
Zheng, Yuanjie
Gee, James C.
TransMatch: A Transformer-Based Multilevel Dual-Stream Feature Matching
Network for Unsupervised Deformable Image Registration
Feature matching, which refers to establishing the correspondence of
regions between two images (usually voxel features), is a crucial
prerequisite of feature-based registration. For deformable image
registration tasks, traditional feature-based registration methods
typically use an iterative matching strategy for interest region
matching, where feature selection and matching are explicit, but
specific feature selection schemes are often useful in solving
application-specific problems and require several minutes for each
registration. In the past few years, the feasibility of learning-based
methods, such as VoxelMorph and TransMorph, has been proven, and their
performance has been shown to be competitive compared to traditional
methods. However, these methods are usually single-stream, where the two
images to be registered are concatenated into a 2-channel whole, and
then the deformation field is output directly. The transformation of
image features into interimage matching relationships is implicit. In
this paper, we propose a novel end-to-end dual-stream unsupervised
framework, named TransMatch, where each image is fed into a separate
stream branch, and each branch performs feature extraction
independently. Then, we implement explicit multilevel feature matching
between image pairs via the query-key matching idea of the
self-attention mechanism in the Transformer model. Comprehensive
experiments are conducted on three 3D brain MR datasets, LPBA40, IXI,
and OASIS, and the results show that the proposed method achieves
state-of-the-art performance in several evaluation metrics compared to
the commonly utilized registration methods, including SyN, NiftyReg,
VoxelMorph, CycleMorph, ViT-V-Net, and TransMorph, demonstrating the
effectiveness of our model in deformable medical image registration.
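The query-key matching idea mentioned above can be sketched with plain scaled dot-product attention, where queries come from one image's features and keys and values come from the other's. This is a generic illustration, not the TransMatch architecture: learned projections, multiple heads and multilevel features are omitted, and `cross_match` is a hypothetical name.

```python
import numpy as np

def cross_match(feat_a, feat_b):
    """Match each feature of image A against all features of image B.

    feat_a: (n_a, d) query features; feat_b: (n_b, d) key/value features.
    Returns the (n_a, n_b) matching weights and the (n_a, d) matched features.
    """
    d = feat_a.shape[-1]
    scores = feat_a @ feat_b.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights, weights @ feat_b

# Toy features: image A holds the same three features as image B, permuted.
feat_b = 10.0 * np.eye(3)
feat_a = feat_b[[2, 0, 1]]
weights, matched = cross_match(feat_a, feat_b)
best = weights.argmax(axis=1)   # index in B that each feature of A matches best
```

The weight matrix makes the inter-image correspondence explicit, in contrast to a single-stream network that receives the two images concatenated as channels.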
AU Lin, Zefan
Quan, Guotao
Qu, Haixian
Du, Yanfeng
Zhao, Jun
LOQUAT: Low-Rank Quaternion Reconstruction for Photon-Counting CT.
Photon-counting computed tomography (PCCT) may dramatically benefit
clinical practice due to its versatility such as dose reduction and
material characterization. However, the limited number of photons
detected in each individual energy bin can induce severe noise
contamination in the reconstructed image. Fortunately, the notable
low-rank prior inherent in the PCCT image can guide the reconstruction
to a denoised outcome. To fully excavate and leverage the intrinsic
low-rankness, we propose a novel reconstruction algorithm based on
quaternion representation (QR), called low-rank quaternion
reconstruction (LOQUAT). First, we organize a group of nonlocal similar
patches into a quaternion matrix. Then, an adjusted weighted Schatten-p
norm (AWSN) is introduced and imposed on the matrix to enforce its
low-rank nature. Subsequently, we formulate an AWSN-regularized model
and devise an alternating direction method of multipliers (ADMM)
framework to solve it. Experiments on simulated and real-world data
substantiate the superiority of the LOQUAT technique over several
state-of-the-art competitors in terms of both visual inspection and
quantitative metrics. Moreover, our QR-based method exhibits lower
computational complexity than some popular tensor representation (TR)
based counterparts. Besides, the global convergence of LOQUAT is
theoretically established under a mild condition. These properties
bolster the robustness and practicality of LOQUAT, facilitating its
application in PCCT clinical scenarios. The source code will be
available at https://github.com/linzf23/LOQUAT.
EI 1558-254X
DA 2024-09-04
UT MEDLINE:39226197
PM 39226197
ER
AU Wang, Sen
Yang, Yirong
Stevens, Grant M
Yin, Zhye
Wang, Adam S
Emulating Low-Dose PCCT Image Pairs with Independent Noise for
Self-Supervised Spectral Image Denoising.
Photon counting CT (PCCT) acquires spectral measurements and enables
generation of material decomposition (MD) images that provide distinct
advantages in various clinical situations. However, noise amplification
is observed in MD images, and denoising is typically applied. Clean or
high-quality references are rare in clinical scans, often making
supervised learning (Noise2Clean) impractical. Noise2Noise is a
self-supervised counterpart, using noisy images and corresponding noisy
references with zero-mean, independent noise. PCCT counts transmitted
photons separately, and raw measurements are assumed to follow a Poisson
distribution in each energy bin, providing the possibility to create
noise-independent pairs. The approach is to use binomial selection to
split the counts into two low-dose scans with independent noise. We
prove that the reconstructed spectral images inherit the noise
independence from the counts domain through noise propagation analysis and
also validate it in numerical simulation and experimental phantom
scans. The method offers the flexibility to split measurements into
desired dose levels while ensuring the reconstructed images share
identical underlying features, thereby strengthening the model's
robustness for input dose levels and capability of preserving fine
details. In both numerical simulation and experimental phantom scans, we
demonstrated that Noise2Noise with binomial selection outperforms other
common self-supervised learning methods based on different presumptive
conditions.
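The binomial selection step can be sketched in a few lines. It relies on the standard thinning property of the Poisson distribution: if a count is Poisson with mean λ and each photon is assigned to the first scan with probability p, the two resulting counts are independent Poisson variables with means pλ and (1 − p)λ. The function name, the mean of 1000 counts and the 50/50 split are illustrative assumptions, not values from the paper.

```python
import numpy as np

def binomial_split(counts, p=0.5, rng=None):
    """Split integer photon counts into two lower-dose count sets.

    Each detected photon goes to the first set with probability p;
    the remainder forms the second set, so the two sets sum to `counts`.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts)
    first = rng.binomial(counts, p)
    return first, counts - first

rng = np.random.default_rng(0)
counts = rng.poisson(lam=1000.0, size=100_000)   # simulated counts in one energy bin
low_a, low_b = binomial_split(counts, p=0.5, rng=rng)
correlation = np.corrcoef(low_a, low_b)[0, 1]    # close to 0 for independent noise
```

Changing `p` changes the dose split between the two emulated scans, which is the flexibility the abstract refers to.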
AU Dabrowski, Oscar
Falcone, Jean-Luc
Klauser, Antoine
Songeon, Julien
Kocher, Michel
Chopard, Bastien
Lazeyras, Francois
Courvoisier, Sebastien
SISMIK for brain MRI: Deep-learning-based motion estimation and
model-based motion correction in k-space.
MRI, a widespread non-invasive medical imaging modality, is highly
sensitive to patient motion. Despite many attempts over the years,
motion correction remains a difficult problem and there is no general
method applicable to all situations. We propose a retrospective method
for motion estimation and correction to tackle the problem of in-plane
rigid-body motion, apt for classical 2D Spin-Echo scans of the brain,
which are regularly used in clinical practice. Due to the sequential
acquisition of k-space, motion artifacts are well localized. The method
leverages the power of deep neural networks to estimate motion
parameters in k-space and uses a model-based approach to restore
degraded images to avoid "hallucinations". A notable advantage is its
ability to estimate motion occurring in high spatial frequencies without
the need for a motion-free reference. The proposed method operates on the
whole k-space dynamic range and is moderately affected by the lower SNR
of higher harmonics. As a proof of concept, we provide models trained
using supervised learning on 600k motion simulations based on
motion-free scans of 43 different subjects. Generalization performance
was tested with simulations as well as in-vivo. Qualitative and
quantitative evaluations are presented for motion parameter estimations
and image reconstruction. Experimental results show that our approach is
able to obtain good generalization performance on simulated data and
in-vivo acquisitions. We provide a Python implementation at
https://gitlab.unige.ch/Oscar.Dabrowski/sismik_mri/.
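Why sequential k-space acquisition localizes motion artifacts, and how a model-based correction can undo them once the motion is known, can be illustrated for the translation part of rigid motion. By the Fourier shift theorem, an in-plane translation multiplies the affected k-space lines by a linear phase ramp. The sketch below is not the SISMIK code: it assumes the motion parameters are already known (in the paper they are estimated by a network), handles translation only (no rotation), and uses an arbitrary split at row 16.

```python
import numpy as np

def shift_phase(shape, dy, dx):
    """Phase ramp that a translation by (dy, dx) pixels imposes on k-space."""
    ky = np.fft.fftfreq(shape[0])[:, None]
    kx = np.fft.fftfreq(shape[1])[None, :]
    return np.exp(-2j * np.pi * (ky * dy + kx * dx))

rng = np.random.default_rng(0)
image = rng.random((32, 32))
kspace = np.fft.fft2(image)
ramp = shift_phase(image.shape, dy=2, dx=3)

# Simulate motion: k-space rows 16 and later are acquired after the shift.
corrupted = kspace.copy()
corrupted[16:] = kspace[16:] * ramp[16:]

# Model-based correction: remove the known phase ramp from the affected rows.
corrected = corrupted.copy()
corrected[16:] = corrupted[16:] * np.conj(ramp[16:])
restored = np.fft.ifft2(corrected).real
```

Only the rows acquired after the motion carry the extra phase, which is why the artifact is well localized in k-space even though it spreads over the whole image.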
AU Zhang, Mengliang
Hu, Xinyue
Gu, Lin
Liu, Liangchen
Kobayashi, Kazuma
Harada, Tatsuya
Yan, Yan
Summers, Ronald M
Zhu, Yingying
A New Benchmark: Clinical Uncertainty and Severity Aware Labeled Chest
X-Ray Images with Multi-Relationship Graph Learning.
Chest radiography, commonly known as CXR, is frequently utilized in
clinical settings to detect cardiopulmonary conditions. However, even
seasoned radiologists might offer different evaluations regarding the
seriousness and uncertainty associated with observed abnormalities.
Previous research has attempted to utilize clinical notes to extract
abnormal labels for training deep-learning models in CXR image
diagnosis. However, these methods often neglected the varying degrees of
severity and uncertainty linked to different labels. In our study, we
initially assembled a comprehensive new dataset of CXR images based on
clinical textual data, which incorporated radiologists' assessments of
uncertainty and severity. Using this dataset, we introduced a
multi-relationship graph learning framework that leverages spatial and
semantic relationships while addressing expert uncertainty through a
dedicated loss function. Our research showcases a notable enhancement in
CXR image diagnosis and the interpretability of the diagnostic model,
surpassing existing state-of-the-art methodologies. The dataset of
disease severity and uncertainty labels that we extracted is available at:
https://physionet.org/content/cad-chest/1.0/.
AU Liu, Jinduo
Han, Lu
Ji, Junzhong
MCAN: Multimodal Causal Adversarial Networks for Dynamic Effective
Connectivity Learning From fMRI and EEG Data
Dynamic effective connectivity (DEC) is the accumulation of effective
connectivity in the time dimension, which can describe the continuous
neural activities in the brain. Recently, learning DEC from functional
magnetic resonance imaging (fMRI) and electroencephalography (EEG) data
has attracted the attention of neuroinformatics researchers. However,
the current methods fail to consider the gap between the fMRI and EEG
modalities and therefore cannot precisely learn the DEC network from multimodal
data. In this paper, we propose a multimodal causal adversarial network
for DEC learning, named MCAN. MCAN contains two modules: a multimodal
causal generator and a multimodal causal discriminator. First, MCAN
employs a multimodal causal generator with an attention-guided layer to
produce a posterior signal and output a set of DEC networks. Then, the
proposed method uses a multimodal causal discriminator to calculate the
joint gradient in an unsupervised manner, which directs the update of the whole
network. The experimental results on simulated data sets show that MCAN
is superior to other state-of-the-art methods in learning the network
structure of DEC and can effectively estimate the brain states. The
experimental results on real data sets show that MCAN can better reveal
abnormal patterns of brain activity and has good application potential
in brain network analysis.
AU Zhu, Cheng
Tan, Ying
Yang, Shuqi
Miao, Jiaqing
Zhu, Jiayi
Huang, Huan
Yao, Dezhong
Luo, Cheng
Temporal Dynamic Synchronous Functional Brain Network for Schizophrenia
Classification and Lateralization Analysis.
Available evidence suggests that dynamic functional connectivity can
capture time-varying abnormalities in brain activity in resting-state
cerebral functional magnetic resonance imaging (rs-fMRI) data and has a
natural advantage in uncovering mechanisms of abnormal brain activity in
schizophrenia (SZ) patients. Hence, an advanced dynamic brain network
analysis model called the temporal brain category graph convolutional
network (Temporal-BCGCN) was employed. Firstly, a unique dynamic brain
network analysis module, DSF-BrainNet, was designed to construct dynamic
synchronization features. Subsequently, a revolutionary graph
convolution method, TemporalConv, was proposed based on the synchronous
temporal properties of features. Finally, the first modular test tool
for abnormal hemispherical lateralization in deep learning based on
rs-fMRI data, named CategoryPool, was proposed. This study was validated
on COBRE and UCLA datasets and achieved 83.62% and 89.71% average
accuracies, respectively, outperforming the baseline model and other
state-of-the-art methods. The ablation results also demonstrate the
advantages of TemporalConv over the traditional edge feature graph
convolution approach and the improvement of CategoryPool over the
classical graph pooling approach. Interestingly, this study showed that
the lower-order perceptual system and higher-order network regions in
the left hemisphere are more severely dysfunctional than in the right
hemisphere in SZ, reaffirming the importance of the left medial
superior frontal gyrus in SZ. Our code is available at:
https://github.com/swfen/Temporal-BCGCN.
AU Gros, Romane
Rodriguez-Nunez, Omar
Felger, Leonard
Moriconi, Stefano
McKinley, Richard
Pierangelo, Angelo
Novikova, Tatiana
Vassella, Erik
Schucht, Philippe
Hewer, Ekkehard
Maragkou, Theoni
Characterization of Polarimetric Properties in Various Brain Tumor Types
Using Wide-Field Imaging Mueller Polarimetry.
Neuro-oncological surgery is the primary brain cancer treatment, yet it
faces challenges with gliomas due to their invasiveness and the need to
preserve neurological function. Hence, radical resection is often
unfeasible, highlighting the importance of precise tumor margin
delineation to prevent neurological deficits and improve prognosis.
Imaging Mueller polarimetry, an effective modality in various organ
tissues, seems a promising approach for tumor delineation in
neurosurgery. To further assess its use, we characterized the
polarimetric properties by analysing 45 polarimetric measurements of 27
fresh brain tumor samples, including different tumor types with a strong
focus on gliomas. Our study integrates a wide-field imaging Mueller
polarimetric system and a novel neuropathology protocol, correlating
polarimetric and histological data for accurate tissue identification.
An image processing pipeline facilitated the alignment and overlay of
polarimetric images and histological masks. Variations in depolarization
values were observed for grey and white matter of brain tumor tissue,
while differences in linear retardance were seen only within white
matter of brain tumor tissue. Notably, we identified pronounced optical
axis azimuth randomization within tumor regions. This study lays the
foundation for machine learning-based brain tumor segmentation
algorithms using polarimetric data, facilitating intraoperative
diagnosis and decision making.
神经肿瘤手术是主要的脑癌治疗方法,但由于神经胶质瘤的侵袭性和保留神经功能的需要,它面临着挑战。因此,根治性切除通常是不可行的,这凸显了精确的肿瘤边缘勾画对于预防神经功能缺损和改善预后的重要性。成像穆勒偏振测定法是多种器官组织中的一种有效方法,似乎是神经外科肿瘤描绘的一种有前途的方法。为了进一步评估其用途,我们通过分析 27 个新鲜脑肿瘤样本(包括重点关注神经胶质瘤的不同肿瘤类型)的 45 个偏振测量值来表征其偏振特性。我们的研究集成了宽视场成像穆勒偏振系统和新颖的神经病理学协议,将偏振和组织学数据关联起来以进行准确的组织识别。图像处理管道促进了偏振图像和组织学掩模的对齐和叠加。观察到脑肿瘤组织的灰质和白质的去极化值的变化,而仅在脑肿瘤组织的白质内观察到线性延迟的差异。值得注意的是,我们在肿瘤区域内发现了明显的光轴方位角随机化。这项研究为使用极化数据的基于机器学习的脑肿瘤分割算法奠定了基础,促进术中诊断和决策。
AU Kuang, Hulin
Wang, Yahui
Liu, Jin
Wang, Jie
Cao, Quanliang
Hu, Bo
Qiu, Wu
Wang, Jianxin
Hybrid CNN-Transformer Network With Circular Feature Interaction for
Acute Ischemic Stroke Lesion Segmentation on Non-Contrast CT Scans
具有圆形特征交互的混合 CNN-Transformer 网络,用于非对比 CT 扫描上的急性缺血性中风病变分割
Lesion segmentation is a fundamental step for the diagnosis of acute
ischemic stroke (AIS). Non-contrast CT (NCCT) is still a mainstream
imaging modality for AIS lesion measurement. However, AIS lesion
segmentation on NCCT is challenging due to low contrast, noise and
artifacts. To achieve accurate AIS lesion segmentation on NCCT, this
study proposes a hybrid convolutional neural network (CNN) and
Transformer network with circular feature interaction and bilateral
difference learning. It consists of parallel CNN and Transformer
encoders, a circular feature interaction module, and a shared CNN
decoder with a bilateral difference learning module. A new Transformer
block is particularly designed to solve the weak inductive bias problem
of the traditional Transformer. To effectively combine features from CNN
and Transformer encoders, we first design a multi-level feature
aggregation module to combine multi-scale features in each encoder and
then propose a novel feature interaction module containing circular
CNN-to-Transformer and Transformer-to-CNN interaction blocks. In addition, a
bilateral difference learning module is placed at the bottom level of
the decoder to learn the differences between the ischemic and
contralateral sides of the brain. The proposed method is evaluated on
three AIS datasets: the public AISD, a private dataset and an external
dataset. Experimental results show that the proposed method achieves
Dice scores of 61.39% and 46.74% on the AISD and the private dataset,
respectively, outperforming 17 state-of-the-art segmentation methods.
In addition, volumetric analysis of the segmented lesions and external
validation results suggest that the proposed method has the potential to
provide supporting information for AIS diagnosis.
病灶分割是诊断急性缺血性卒中(AIS)的基本步骤。非增强 CT (NCCT) 仍然是 AIS 病变测量的主流成像方式。然而,由于对比度低、噪声和伪影,NCCT 上的 AIS 病灶分割具有挑战性。为了在 NCCT 上实现准确的 AIS 病灶分割,本研究提出了一种具有循环特征交互和双边差分学习的混合卷积神经网络(CNN)和 Transformer 网络。它由并行 CNN 和 Transformer 编码器、循环特征交互模块以及带有双边差分学习模块的共享 CNN 解码器组成。新的 Transformer 模块专门为解决传统 Transformer 的弱感应偏置问题而设计。为了有效地结合 CNN 和 Transformer 编码器的特征,我们首先设计了一个多级特征聚合模块来结合每个编码器中的多尺度特征,然后提出一种包含循环 CNN-to-Transformer 和 Transformer-to-CNN 的新型特征交互模块交互块。此外,在解码器的底层提出了双边差异学习模块,以学习缺血侧大脑和对侧大脑之间的差异信息。所提出的方法在三个 AIS 数据集上进行评估:公共 AISD、私有数据集和外部数据集。实验结果表明,该方法在 AISD 和私有数据集上的 Dices 分别达到 61.39% 和 46.74%,优于 17 种最先进的分割方法。此外,分段病灶的体积分析和外部验证结果表明该方法有可能为 AIS 诊断提供支持信息。
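The Dice scores reported above measure the overlap between a predicted lesion mask and a reference mask. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `dice_coefficient`, the flat 0/1 mask layout, and the convention for two empty masks are assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    # Pixels marked as lesion in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total lesion pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Convention (an assumption): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical toy masks: 2 shared pixels, 3 foreground pixels in each mask.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(pred, target)
```

A score of 1.0 means perfect overlap and 0.0 means none, so a reported Dice of 61.39% corresponds to a value of about 0.61 on this scale.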
AU Lin, Yi
Wang, Zeyu
Zhang, Dong
Cheng, Kwang-Ting
Chen, Hao
BoNuS: Boundary Mining for Nuclei Segmentation With Partial Point Labels
BoNuS:使用部分点标签进行核分割的边界挖掘
Nuclei segmentation is a fundamental prerequisite in the digital
pathology workflow. The development of automated methods for nuclei
segmentation enables quantitative analysis of the wide existence and
large variances in nuclei morphometry in histopathology images. However,
manual annotation of tens of thousands of nuclei is tedious and
time-consuming, requiring a significant amount of human effort and
domain-specific expertise. To alleviate this problem, in this paper, we
propose a weakly-supervised nuclei segmentation method that only
requires partial point labels of nuclei. Specifically, we propose a
novel boundary mining framework for nuclei segmentation, named BoNuS,
which simultaneously learns nuclei interior and boundary information
from the point labels. To achieve this goal, we propose a novel boundary
mining loss, which guides the model to learn the boundary information by
exploring the pairwise pixel affinity in a multiple-instance learning
manner. Then, we consider a more challenging problem, i.e., partial
point label, where we propose a nuclei detection module with curriculum
learning to detect the missing nuclei with prior morphological
knowledge. The proposed method is validated on three public datasets:
MoNuSeg, CPM, and CoNIC. Experimental results demonstrate the
superior performance of our method over state-of-the-art
weakly-supervised nuclei segmentation methods. Code:
https://github.com/hust-linyi/bonus.
细胞核分割是数字病理工作流程的基本先决条件。细胞核分割自动化方法的发展使得能够对组织病理学图像中细胞核形态测量的广泛存在和巨大差异进行定量分析。然而,对数以万计的细胞核进行手动注释既繁琐又耗时,需要大量的人力和特定领域的专业知识。为了缓解这个问题,在本文中,我们提出了一种弱监督的核分割方法,仅需要核的部分点标签。具体来说,我们提出了一种用于核分割的新型边界挖掘框架,名为BoNuS,它同时从点标签中学习核内部和边界信息。为了实现这一目标,我们提出了一种新颖的边界挖掘损失,它引导模型通过以多实例学习方式探索成对像素亲和力来学习边界信息。然后,我们考虑一个更具挑战性的问题,即部分点标签,我们提出了一个具有课程学习的核检测模块,以利用先验形态学知识来检测丢失的核。所提出的方法在三个公共数据集 MoNuSeg、CPM 和 CoNIC 数据集上进行了验证。实验结果证明我们的方法比最先进的弱监督核分割方法具有优越的性能。代码:https://github.com/hust-linyi/bonus。
AU Liu, Xiao
Sanchez, Pedro
Thermos, Spyridon
O'Neil, Alison Q.
Tsaftaris, Sotirios A.
Compositionally Equivariant Representation Learning
组成等变表示学习
Deep learning models often need sufficient supervision (i.e., labelled
data) in order to be trained effectively. By contrast, humans can
swiftly learn to identify important anatomy in medical images like MRI
and CT scans, with minimal guidance. This recognition capability easily
generalises to new images from different medical facilities and to new
tasks in different settings. This rapid and generalisable learning
ability is largely due to the compositional structure of image patterns
in the human brain, which are not well represented in current medical
models. In this paper, we study the utilisation of compositionality in
learning more interpretable and generalisable representations for
medical image segmentation. Overall, we propose that the underlying
generative factors that are used to generate the medical images satisfy
a compositional equivariance property, where each factor is compositional
(e.g., corresponds to human anatomy) and also equivariant to the task.
Hence, a good representation that closely approximates the ground truth
factors has to be compositionally equivariant. By modelling the
compositional representations with learnable von-Mises-Fisher (vMF)
kernels, we explore how different design and learning biases can be used
to enforce the representations to be more compositionally equivariant
under un-, weakly-, and semi-supervised settings. Extensive results show
that our methods achieve the best performance over several strong
baselines on the task of semi-supervised domain-generalised medical
image segmentation. Code will be made publicly available upon acceptance
at https://github.com/vios-s.
深度学习模型通常需要足够的监督(即标记数据)才能有效地进行训练。相比之下,人类可以在最少的指导下快速学会识别 MRI 和 CT 扫描等医学图像中的重要解剖结构。这种识别能力可以轻松推广到来自不同医疗机构的新图像以及不同环境中的新任务。这种快速且普遍的学习能力很大程度上归功于人脑图像模式的组成结构,而这种结构在当前的医学模型中并没有得到很好的体现。在本文中,我们研究了组合性在学习医学图像分割的更多可解释和可概括的表示中的利用。总的来说,我们建议用于生成医学图像的底层生成因素满足成分等变性,其中每个因素都是成分性的(例如,对应于人体解剖学)并且也与任务等变。因此,一个很好地近似真实因子的良好表示必须在成分上是等变的。通过使用可学习的 von-Mises-Fisher (vMF) 内核对组合表示进行建模,我们探索了如何使用不同的设计和学习偏差来强制表示在无监督、弱监督和半监督设置下在组合上更加等变。大量结果表明,我们的方法在半监督域广义医学图像分割任务上在几个强基线上实现了最佳性能。代码将在 https://github.com/vios-s 接受后公开发布。
AU Wegierak, Dana
Cooley, Michaela B.
Perera, Reshani
Wulftange, William J.
Gurkan, Umut A.
Kolios, Michael C.
Exner, Agata A.
Decorrelation Time Mapping as an Analysis Tool for Nanobubble-Based
Contrast Enhanced Ultrasound Imaging
去相关时间映射作为基于纳米气泡的对比增强超声成像的分析工具
Nanobubbles (NBs; ~100-500 nm diameter) are preclinical
ultrasound (US) contrast agents that expand applications of contrast
enhanced US (CEUS). Due to their sub-micron size, high particle density,
and deformable shell, NBs in pathological states of heightened vascular
permeability (e.g. in tumors) extravasate, enabling applications not
possible with microbubbles (~1000-10,000 nm diameter). A
method that can separate intravascular versus extravascular NB signal is
needed as an imaging biomarker for improved tumor detection. We present
a demonstration of decorrelation time (DT) mapping for enhanced tumor
NB-CEUS imaging. In vitro models validated the sensitivity of DT to
agent motion. Prostate cancer mouse models validated in vivo imaging
potential and sensitivity to cancerous tissue. Our findings show that DT
is inversely related to NB motion, offering enhanced detail of NB
dynamics in tumors, and highlighting the heterogeneity of the tumor
environment. Average DT was high in tumor regions (~9 s)
compared to surrounding normal tissue (~1 s) with higher
sensitivity to tumor tissue compared to other mapping techniques.
Molecular NB targeting to tumors further extended DT (11 s) over
non-targeted NBs (6 s), demonstrating sensitivity to NB adherence. From
DT mapping of in vivo NB dynamics we demonstrate the heterogeneity of
tumor tissue while quantifying extravascular NB kinetics and delineating
intra-tumoral vasculature. This new NB-CEUS-based biomarker can be
powerful in molecular US imaging, with improved sensitivity and
specificity to diseased tissue and potential for use as an estimator of
vascular permeability and the enhanced permeability and retention (EPR)
effect in tumors.
纳米气泡(NB;直径类似于 100-500 nm)是临床前超声 (US) 造影剂,可扩展造影增强 US (CEUS) 的应用。由于它们的亚微米尺寸、高颗粒密度和可变形的外壳,处于血管通透性升高的病理状态(例如在肿瘤中)的纳米粒子会外渗,从而实现微泡(类似于1000-10,000 nm直径)不可能的应用。需要一种能够分离血管内和血管外 NB 信号的方法作为成像生物标志物,以改进肿瘤检测。我们展示了增强肿瘤 NB-CEUS 成像的去相关时间 (DT) 映射。体外模型验证了 DT 对药剂运动的敏感性。前列腺癌小鼠模型验证了体内成像潜力和对癌组织的敏感性。我们的研究结果表明,DT 与 NB 运动呈负相关,增强了肿瘤中 NB 动态的细节,并强调了肿瘤环境的异质性。与周围正常组织(大约 1 秒)相比,肿瘤区域的平均 DT 较高(大约 9 秒),与其他绘图技术相比,对肿瘤组织的敏感性更高。靶向肿瘤的分子 NB 比非靶向 NB(6 秒)进一步延长了 DT(11 秒),证明了对 NB 粘附的敏感性。通过体内 NB 动力学的 DT 绘图,我们证明了肿瘤组织的异质性,同时量化了血管外 NB 动力学并描绘了肿瘤内脉管系统。这种基于 NB-CEUS 的新型生物标志物在分子超声成像中具有强大的作用,对病变组织具有更高的敏感性和特异性,并且有可能用作肿瘤中血管通透性和增强通透性和保留 (EPR) 效应的估计器。
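The abstract does not give the exact definition of decorrelation time (DT), so the sketch below only illustrates the general idea for a single pixel: DT is taken as the first time lag at which the normalized autocorrelation of the pixel's intensity series drops below a threshold. The function name `decorrelation_time`, the 0.5 threshold, and the biased autocorrelation estimate are all assumptions, not the authors' method.

```python
def decorrelation_time(signal, frame_interval, threshold=0.5):
    """First lag (in seconds) at which the normalized autocorrelation of
    `signal` falls below `threshold`; None if it never does."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    variance = sum(c * c for c in centered)
    if variance == 0:
        return None  # constant signal: it never decorrelates
    for lag in range(1, n):
        # Biased estimate: always normalized by the zero-lag energy.
        acf = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / variance
        if acf < threshold:
            return lag * frame_interval
    return None


# Hypothetical pixel traces sampled every 0.5 s.
fast_pixel = [1, -1, 1, -1, 1, -1, 1, -1]  # rapidly fluctuating intensity
slow_pixel = [0, 0, 0, 0, 1, 1, 1, 1]      # slowly changing intensity
fast_dt = decorrelation_time(fast_pixel, frame_interval=0.5)
slow_dt = decorrelation_time(slow_pixel, frame_interval=0.5)
```

The rapidly fluctuating trace decorrelates sooner than the slowly changing one, which matches the inverse relation between DT and NB motion described above. Applying the function to every pixel of an image sequence would give a DT map.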
AU Kijanka, Piotr
Urban, Matthew W.
Ultrasound Shear Elastography With Expanded Bandwidth (USEWEB): A Novel
Method for 2D Shear Phase Velocity Imaging of Soft Tissues
扩展带宽超声剪切弹性成像 (USEWEB):软组织二维剪切相速度成像的新方法
Ultrasound shear wave elastography (SWE) is a noninvasive approach for
evaluating mechanical properties of soft tissues. In SWE either group
velocity measured in the time-domain or phase velocity measured in the
frequency-domain can be reported. Frequency-domain methods have the
advantage over time-domain methods in providing a response for a
specific frequency, while time-domain methods average the wave velocity
over the entire frequency band. Current frequency-domain approaches
struggle to reconstruct SWE images over full frequency bandwidth. This
is especially important in the case of viscoelastic tissues, where
tissue viscoelasticity is often studied by analyzing the shear wave
phase velocity dispersion. For characterizing cancerous lesions, it has
been shown that considerable biases can occur with group velocity-based
measurements. However, using phase velocities at higher frequencies can
provide more accurate evaluations. In this paper, we propose a new
method called Ultrasound Shear Elastography with Expanded Bandwidth
(USEWEB) used for two-dimensional (2D) shear wave phase velocity
imaging. We tested the USEWEB method on data from homogeneous
tissue-mimicking liver fibrosis phantoms, custom-made viscoelastic
phantom measurements, phantoms with cylindrical inclusions experiments,
and in vivo renal transplants scanned with a clinical scanner. We
compared results from the USEWEB method with a Local Phase Velocity
Imaging (LPVI) approach over a wide frequency range, i.e., up to
200-2000 Hz. Tests carried out revealed that the USEWEB approach
provides 2D phase velocity images with a coefficient of variation below
5% over a wider frequency band for smaller processing window size in
comparison to LPVI, especially in viscoelastic materials. In addition,
USEWEB can produce correct phase velocity images at much higher
frequencies than LPVI, up to 1800 Hz; these images can be used to
characterize viscoelastic materials and elastic inclusions.
超声剪切波弹性成像(SWE)是一种评估软组织机械性能的无创方法。在 SWE 中,可以报告时域中测量的群速度或频域中测量的相速度。频域方法在提供特定频率的响应方面比时域方法具有优势,而时域方法则对整个频带上的波速进行平均。当前的频域方法很难在全频率带宽上重建 SWE 图像。这对于粘弹性组织来说尤其重要,组织粘弹性通常通过分析剪切波相速度色散来研究。为了表征癌性病变,已经表明基于群速度的测量可能会出现相当大的偏差。然而,在较高频率下使用相速度可以提供更准确的评估。在本文中,我们提出了一种称为扩展带宽超声剪切弹性成像(USEWEB)的新方法,用于二维(2D)剪切波相速度成像。我们使用来自均质组织模拟肝纤维化模型、定制粘弹性模型测量、具有圆柱形包涵体实验的模型以及使用临床扫描仪扫描的体内肾移植物的数据测试了 USEWEB 方法。我们在较宽的频率范围(即高达 200-2000 Hz)内将 USEWEB 方法与局部相速度成像 (LPVI) 方法的结果进行了比较。进行的测试表明,与 LPVI 相比,USEWEB 方法提供的二维相速度图像在更宽的频带内变化系数低于 5%,处理窗口尺寸更小,尤其是在粘弹性材料中。 此外,与 LPVI 相比,USEWEB 可以在更高的频率(高达 1800 Hz)下生成正确的相速度图像,LPVI 可用于表征粘弹性材料和弹性夹杂物。
AU Liu, Han
Xu, Zhoubing
Gao, Riqiang
Li, Hao
Wang, Jianing
Chabin, Guillaume
Oguz, Ipek
Grbic, Sasa
COSST: Multi-Organ Segmentation With Partially Labeled Datasets Using
Comprehensive Supervisions and Self-Training
COSST:使用综合监督和自我训练的部分标记数据集的多器官分割
Deep learning models have demonstrated remarkable success in multi-organ
segmentation but typically require large-scale datasets with all organs
of interest annotated. However, medical image datasets are often
small and only partially labeled, i.e., only a subset of organs
is annotated. Therefore, it is crucial to investigate how to learn a
unified model on the available partially labeled datasets to leverage
their synergistic potential. In this paper, we systematically
investigate the partial-label segmentation problem with theoretical and
empirical analyses on the prior techniques. We revisit the problem from
a perspective of partial label supervision signals and identify two
signals derived from ground truth and one from pseudo labels. We propose
a novel two-stage framework termed COSST, which effectively and
efficiently integrates comprehensive supervision signals with
self-training. Concretely, we first train an initial unified model using
two ground truth-based signals and then iteratively incorporate the
pseudo label signal into the initial model using self-training. To
mitigate performance degradation caused by unreliable pseudo labels, we
assess the reliability of pseudo labels via outlier detection in latent
space and exclude the most unreliable pseudo labels from each
self-training iteration. Extensive experiments are conducted on one
public and three private partial-label segmentation tasks over 12 CT
datasets. Experimental results show that our proposed COSST achieves
significant improvement over the baseline method, i.e., individual
networks trained on each partially labeled dataset. Compared to the
state-of-the-art partial-label segmentation methods, COSST demonstrates
consistently superior performance across various segmentation tasks and
training data sizes.
深度学习模型在多器官分割方面取得了显着的成功,但通常需要注释所有感兴趣器官的大规模数据集。然而,医学图像数据集的样本量通常较小,并且仅进行了部分标记,即仅对器官的子集进行了注释。因此,研究如何在可用的部分标记数据集上学习统一模型以利用其协同潜力至关重要。在本文中,我们通过对现有技术的理论和实证分析,系统地研究了部分标签分割问题。我们从部分标签监督信号的角度重新审视这个问题,并识别出两个来自真实标签的信号和一个来自伪标签的信号。我们提出了一种新颖的两阶段框架,称为 COSST,它有效且高效地将全面的监督信号与自我训练相结合。具体来说,我们首先使用两个基于地面实况的信号训练初始统一模型,然后使用自训练迭代地将伪标签信号合并到初始模型中。为了减轻由不可靠的伪标签引起的性能下降,我们通过潜在空间中的异常值检测来评估伪标签的可靠性,并从每次自训练迭代中排除最不可靠的伪标签。在 12 个 CT 数据集上对一个公共部分标签分割任务和三个私有部分标签分割任务进行了广泛的实验。实验结果表明,我们提出的 COSST 比基线方法(即在每个部分标记的数据集上训练的单独网络)取得了显着的改进。 与最先进的部分标签分割方法相比,COSST 在各种分割任务和不同的训练数据大小上表现出一致的优越性能。
AU Yang, Zhuoyue
Pan, Junjun
Dai, Ju
Sun, Zhen
Xiao, Yi
Self-Supervised Lightweight Depth Estimation in Endoscopy Combining CNN
and Transformer
结合 CNN 和 Transformer 的内窥镜自监督轻量级深度估计
In recent years, an increasing number of medical engineering tasks, such
as surgical navigation, pre-operative registration, and surgical
robotics, rely on 3D reconstruction techniques. Self-supervised depth
estimation has attracted interest in endoscopic scenarios because it
does not require ground truth. Most existing methods depend on increasing
the number of parameters to improve their performance. Therefore, designing a
lightweight self-supervised model that can obtain competitive results is
a hot topic. We propose a lightweight network with a tight coupling of
convolutional neural network (CNN) and Transformer for depth estimation.
Unlike other methods that use CNN and Transformer to extract features
separately and then fuse them on the deepest layer, we utilize the
modules of CNN and Transformer to extract features at different scales
in the encoder. This hierarchical structure leverages the advantages of
CNN in texture perception and Transformer in shape extraction. At
each scale of feature extraction, the CNN is used to acquire local
features while the Transformer encodes global information. Finally, we
add multi-head attention modules to the pose network to improve the
accuracy of predicted poses. Experiments demonstrate that our approach
obtains comparable results on two datasets while effectively reducing
the number of model parameters.
近年来,越来越多的医学工程任务,例如手术导航、术前配准和手术机器人,依赖于 3D 重建技术。自监督深度估计引起了人们对内窥镜场景的兴趣,因为它不需要地面实况。大多数现有方法依赖于扩展参数的大小来提高其性能。在那里,设计一个能够获得有竞争力的结果的轻量级自监督模型是一个热门话题。我们提出了一种将卷积神经网络(CNN)和 Transformer 紧密耦合的轻量级网络,用于深度估计。与其他使用 CNN 和 Transformer 分别提取特征然后在最深层融合的方法不同,我们利用 CNN 和 Transformer 的模块在编码器中提取不同尺度的特征。这种层次结构利用了 CNN 在纹理感知方面和 Transformer 在形状提取方面的优势。在相同尺度的特征提取中,CNN用于获取局部特征,而Transformer则对全局信息进行编码。最后,我们将多头注意力模块添加到姿势网络中,以提高预测姿势的准确性。实验表明,我们的方法在有效压缩两个数据集上的模型参数的同时获得了可比较的结果。
AU Zhang, Yiwen
Li, Chuanpu
Zhong, Liming
Chen, Zeli
Yang, Wei
Wang, Xuetao
DoseDiff: Distance-aware Diffusion Model for Dose Prediction in
Radiotherapy.
DoseDiff:放射治疗剂量预测的距离感知扩散模型。
Treatment planning, which is a critical component of the radiotherapy
workflow, is typically carried out by a medical physicist in a
time-consuming trial-and-error manner. Previous studies have proposed
knowledge-based or deep-learning-based methods for predicting dose
distribution maps to assist medical physicists in improving the
efficiency of treatment planning. However, these dose prediction methods
usually fail to effectively utilize distance information between
surrounding tissues and targets or organs-at-risk (OARs). Moreover, they
are poor at maintaining the distribution characteristics of ray paths in
the predicted dose distribution maps, resulting in a loss of valuable
information. In this paper, we propose a distance-aware diffusion model
(DoseDiff) for precise prediction of dose distribution. We define dose
prediction as a sequence of denoising steps, wherein the predicted dose
distribution map is generated with the conditions of the computed
tomography (CT) image and signed distance maps (SDMs). The SDMs are
obtained by distance transformation from the masks of targets or OARs,
which provide the distance from each pixel in the image to the outline
of the targets or OARs. We further propose a multi-encoder and
multi-scale fusion network (MMFNet) that incorporates multi-scale and
transformer-based fusion modules to enhance information fusion between
the CT image and SDMs at the feature level. We evaluate our model on two
in-house datasets and a public dataset. The results
demonstrate that our DoseDiff method outperforms state-of-the-art dose
prediction methods in terms of both quantitative performance and visual
quality.
治疗计划是放射治疗工作流程的关键组成部分,通常由医学物理学家以耗时的试错方式进行。先前的研究提出了基于知识或基于深度学习的方法来预测剂量分布图,以帮助医学物理学家提高治疗计划的效率。然而,这些剂量预测方法通常无法有效利用周围组织与目标或危及器官(OAR)之间的距离信息。此外,它们很难维持预测剂量分布图中射线路径的分布特征,从而导致有价值的信息丢失。在本文中,我们提出了一种距离感知扩散模型(DoseDiff),用于精确预测剂量分布。我们将剂量预测定义为一系列去噪步骤,其中预测的剂量分布图是根据计算机断层扫描(CT)图像和符号距离图(SDM)的条件生成的。 SDM 是通过目标或 OAR 掩模的距离变换获得的,它提供了图像中每个像素到目标或 OAR 轮廓的距离。我们进一步提出了一种多编码器和多尺度融合网络(MMFNet),它结合了多尺度和基于变压器的融合模块,以增强 CT 图像和 SDM 之间在特征级别的信息融合。我们分别在两个内部数据集和一个公共数据集上评估我们的模型。结果表明,我们的 DoseDiff 方法在定量性能和视觉质量方面均优于最先进的剂量预测方法。
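The signed distance maps (SDMs) described above give, for each pixel, its distance to the outline of a target or OAR mask. The following brute-force sketch shows one way to compute such a map for a small 2D mask. It is not the DoseDiff implementation: the function name `signed_distance_map`, the 4-connected outline definition, and the sign convention (negative inside, positive outside) are assumptions made for illustration.

```python
import math


def signed_distance_map(mask):
    """Signed Euclidean distance from each pixel to the mask outline.

    mask: list of rows of 0/1. Outline pixels are mask pixels with at least
    one 4-connected neighbour that is background or outside the image.
    Sign convention (an assumption): negative inside, positive outside.
    """
    h, w = len(mask), len(mask[0])

    def is_outline(r, c):
        if not mask[r][c]:
            return False
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if not (0 <= rr < h and 0 <= cc < w) or not mask[rr][cc]:
                return True
        return False

    outline = [(r, c) for r in range(h) for c in range(w) if is_outline(r, c)]
    if not outline:
        raise ValueError("mask contains no foreground pixels")

    sdm = []
    for r in range(h):
        row = []
        for c in range(w):
            # Brute-force nearest-outline search; fine for small toy masks.
            d = min(math.hypot(r - rr, c - cc) for rr, cc in outline)
            row.append(-d if mask[r][c] else d)
        sdm.append(row)
    return sdm


# Hypothetical 5x5 mask with a 3x3 target in the centre.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
sdm = signed_distance_map(mask)
```

For clinical-size images a dedicated distance-transform routine would be far faster than this quadratic search; the sketch only shows what the map contains.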
AU Lachinov, Dmitrii
Chakravarty, Arunava
Grechenig, Christoph
Schmidt-Erfurth, Ursula
Bogunovic, Hrvoje
CA ADNI
Learning Spatio-Temporal Model of Disease Progression With NeuralODEs
From Longitudinal Volumetric Data
使用 NeuralODE 从纵向体积数据学习疾病进展的时空模型
Robust forecasting of the future anatomical changes inflicted by an
ongoing disease is an extremely challenging task that is out of grasp
even for experienced healthcare professionals. Such a capability,
however, is of great importance since it can improve patient management
by providing information on the speed of disease progression already at
the admission stage, or it can enrich the clinical trials with fast
progressors and avoid the need for control arms by means of digital
twins. In this work, we develop a deep learning method that models the
evolution of age-related disease by processing a single medical scan and
providing a segmentation of the target anatomy at a requested future
point in time. Our method represents a time-invariant physical process
and solves a large-scale problem of modeling temporal pixel-level
changes utilizing NeuralODEs. In addition, we demonstrate approaches
for incorporating prior domain-specific constraints into our method and
define a temporal Dice loss for learning temporal objectives. To evaluate
the applicability of our approach across different age-related diseases
and imaging modalities, we developed and tested the proposed method on
the datasets with 967 retinal OCT volumes of 100 patients with
Geographic Atrophy and 2823 brain MRI volumes of 633 patients with
Alzheimer's Disease. For Geographic Atrophy, the proposed method
outperformed the related baseline models in the atrophy growth
prediction. For Alzheimer's Disease, the proposed method demonstrated
remarkable performance in predicting the brain ventricle changes induced
by the disease, achieving the state-of-the-art result on TADPOLE
cross-sectional prediction challenge dataset.
对当前疾病造成的未来解剖学变化的稳健预测是一项极具挑战性的任务,即使对于经验丰富的医疗保健专业人员来说也是无法掌握的。然而,这种能力非常重要,因为它可以通过提供入院阶段疾病进展速度的信息来改善患者管理,或者可以丰富快速进展者的临床试验,并避免对控制臂的需要。数字孪生的手段。在这项工作中,我们开发了一种深度学习方法,通过处理单个医学扫描并在请求的未来时间点提供目标解剖结构的分割来模拟与年龄相关的疾病的演变。我们的方法代表了一个时不变的物理过程,并解决了利用 NeuralODE 对时间像素级变化进行建模的大规模问题。此外,我们还演示了将先前的特定领域约束纳入我们的方法中的方法,并定义学习时间目标的时间 Dice 损失。为了评估我们的方法在不同年龄相关疾病和成像模式中的适用性,我们在数据集上开发并测试了所提出的方法,其中包括 100 名地理萎缩症患者的 967 个视网膜 OCT 体积和 633 名阿尔茨海默病患者的 2823 个脑部 MRI 体积。对于地理萎缩,所提出的方法在萎缩增长预测方面优于相关基线模型。对于阿尔茨海默病,所提出的方法在预测疾病引起的脑室变化方面表现出卓越的性能,在 TADPOLE 横截面预测挑战数据集上取得了最先进的结果。
AU Nie, Xinyu
Ruan, Jialiang
Otaduy, Maria Concepcion Garcia
Grinberg, Lea Tenenholz
Ringman, John
Shi, Yonggang
Surface-Based Probabilistic Fiber Tracking in Superficial White Matter
浅表白质中基于表面的概率纤维追踪
The short association fibers or U-fibers travel in the superficial white
matter (SWM) beneath the cortical layer. While the U-fibers play a
crucial role in various brain disorders, there is a lack of effective
tools to reconstruct their highly curved trajectory from diffusion MRI
(dMRI). In this work, we propose a novel surface-based framework for the
probabilistic tracking of fibers on the triangular mesh representation
of the SWM. By deriving a closed-form solution to transform the
spherical harmonics (SPHARM) coefficients of 3D fiber orientation
distributions (FODs) to local coordinate systems on each triangle, we
develop a novel approach to project the FODs onto the tangent space of
the SWM. After that, we utilize parallel transport to realize the
intrinsic propagation of streamlines on SWM following probabilistically
sampled fiber directions. Our intrinsic and surface-based method
eliminates the need to perform the necessary but challenging sharp turns
in 3D compared with conventional volume-based tractography methods.
Using data from the Human Connectome Project (HCP), we performed
quantitative comparisons to demonstrate that the proposed algorithm can more
effectively reconstruct the U-fibers connecting the precentral and
postcentral gyri than previous methods. Quantitative validations were
then performed on post-mortem MRIs to show that the reconstructed U-fibers
from our method follow the SWM more faithfully than those from volume-based
tractography. Finally, we applied our algorithm to study the parietal
U-fiber connectivity changes in autosomal dominant Alzheimer's disease
(ADAD) patients and successfully detected significant associations
between U-fiber connectivity and disease severity.
短联合纤维或 U 纤维在皮质层下方的浅表白质 (SWM) 中传播。虽然 U 型纤维在各种脑部疾病中发挥着至关重要的作用,但缺乏有效的工具来通过扩散 MRI (dMRI) 重建其高度弯曲的轨迹。在这项工作中,我们提出了一种新颖的基于表面的框架,用于在 SWM 的三角网格表示上进行纤维的概率跟踪。通过推导闭式解,将 3D 纤维取向分布 (FOD) 的球谐函数 (SPHARM) 系数变换到每个三角形上的局部坐标系,我们开发了一种将 FOD 投影到 SWM 切线空间的新颖方法。之后,我们利用并行传输来实现 SWM 上流线遵循概率采样纤维方向的固有传播。与传统的基于体积的纤维束成像方法相比,我们的内在和基于表面的方法无需在 3D 中执行必要但具有挑战性的急转弯。利用人类连接组计划 (HCP) 的数据,我们进行了定量比较,证明所提出的算法比以前的方法能够更有效地重建连接中央前回和中央后回的 U 纤维。然后对死后 MRI 进行定量验证,以显示我们的方法重建的 U 纤维比基于体积的纤维束成像更忠实地遵循 SWM。最后,我们应用我们的算法来研究常染色体显性阿尔茨海默病 (ADAD) 患者的顶叶 U 纤维连接变化,并成功检测到 U 纤维连接与疾病严重程度之间的显着关联。
AU Su, Ting
Zhu, Jiongtao
Zhang, Xin
Tan, Yuhang
Cui, Han
Zeng, Dong
Guo, Jinchuan
Zheng, Hairong
Ma, Jianhua
Liang, Dong
Ge, Yongshuai
Super Resolution Dual-Energy Cone-Beam CT Imaging With Dual-Layer
Flat-Panel Detector
采用双层平板探测器的超分辨率双能锥束 CT 成像
In flat-panel detector (FPD) based cone-beam computed tomography (CBCT)
imaging, the native receptor array is usually binned into a smaller
matrix size. By doing so, the signal readout speed could be increased by
4-9 times at the expense of a spatial resolution loss of 50%-67%.
Clearly, such manipulation poses a key bottleneck in generating high
spatial and high temporal resolution CBCT images at the same time. In
addition, it is difficult to generate dual-energy CBCT images with a
conventional FPD. In this paper, we propose an innovative super
resolution dual-energy CBCT imaging method, named suRi, based on a
dual-layer FPD (DL-FPD) to overcome these difficulties at
once. With suRi, specifically, a 1D or 2D sub-pixel (half pixel in this
study) shifted binning is applied instead of the conventionally aligned
binning to double the spatial sampling rate during the dual-energy data
acquisition. As a result, the suRi approach provides a new strategy to
enable high spatial resolution CBCT imaging at high readout speed.
Moreover, a penalized likelihood material decomposition algorithm is
developed to directly reconstruct the high resolution bases from these
dual-energy CBCT projections containing sub-pixel shifts. Numerical and
physical experiments are performed to validate this newly developed suRi
method with phantoms and biological specimens. Results demonstrate that
suRi can significantly improve the spatial resolution of the CBCT image.
We believe the suRi method would greatly enhance the imaging
performance of DL-FPD based dual-energy CBCT systems in the future.
在基于平板探测器 (FPD) 的锥形束计算机断层扫描 (CBCT) 成像中,天然受体阵列通常被合并成较小的矩阵尺寸。通过这样做,信号读出速度可以提高4-9倍,但代价是空间分辨率损失50%-67%。显然,这种操作造成了同时生成高空间和高时间分辨率 CBCT 图像的关键瓶颈。此外,传统的FPD也难以生成双能CBCT图像。在本文中,我们提出了一种基于双层FPD(DL-FPD)的创新超分辨率双能CBCT成像方法,称为suRi,以一次性克服上述困难。具体而言,对于 suRi,应用 1D 或 2D 子像素(本研究中的半像素)移位装箱代替传统的对齐装箱,以在双能数据采集期间使空间采样率加倍。因此,suRi 方法提供了一种新策略,可以在高读出速度的同时实现高空间分辨率 CBCT 成像。此外,还开发了惩罚似然材料分解算法,以直接从这些包含子像素移位的双能 CBCT 投影重建高分辨率基础。通过模型和生物样本进行数值和物理实验来验证这种新开发的 suRi 方法。结果表明suRi可以显着提高CBCT图像的空间分辨率。我们相信,这种开发的 suRi 方法将极大地提高未来基于 DL-FPD 的双能 CBCT 系统的成像性能。
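The half-pixel shifted binning described above can be illustrated on a single detector row. In this toy sketch, native pixels are summed in pairs, once aligned and once shifted by one native pixel (half a binned pixel), and the two binned rows are interleaved so that the sampling positions are twice as dense. It is not the suRi reconstruction: the helper names `bin_row` and `interleave` are hypothetical, and feeding the same signal to both layers ignores the spectral difference between the layers of a real dual-layer detector.

```python
def bin_row(row, factor=2, shift=0):
    """Sum `factor` adjacent native pixels, starting at native index `shift`.

    Incomplete groups at the end of the row are dropped.
    """
    return [sum(row[i:i + factor])
            for i in range(shift, len(row) - factor + 1, factor)]


def interleave(aligned, shifted):
    """Merge two binned rows sample by sample; the shifted grid sits half a
    binned pixel to the right of the aligned grid."""
    merged = []
    for i, value in enumerate(aligned):
        merged.append(value)
        if i < len(shifted):
            merged.append(shifted[i])
    return merged


# Hypothetical native detector row with 8 pixels.
native = [1, 2, 3, 4, 5, 6, 7, 8]
top_layer = bin_row(native, factor=2, shift=0)     # conventionally aligned binning
bottom_layer = bin_row(native, factor=2, shift=1)  # half-binned-pixel shift
combined = interleave(top_layer, bottom_layer)
```

Each layer still reads out at the binned rate, while the interleaved samples sit half a binned pixel apart, which is the doubling of the spatial sampling rate mentioned in the abstract.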
AU Zhong, Liming
Chen, Zeli
Shu, Hai
Zheng, Kaiyi
Li, Yin
Chen, Weicui
Wu, Yuankui
Ma, Jianhua
Feng, Qianjin
Yang, Wei
Multi-Scale Tokens-Aware Transformer Network for Multi-Region and
Multi-Sequence MR-to-CT Synthesis in a Single Model
用于在单一模型中进行多区域和多序列 MR-to-CT 合成的多尺度令牌感知变压器网络
The superiority of magnetic resonance (MR)-only radiotherapy treatment
planning (RTP) has been well demonstrated, benefiting from the synthesis
of computed tomography (CT) images, which supplements electron density
and eliminates the errors of multi-modal image registration. An
increasing number of methods have been proposed for MR-to-CT synthesis.
However, synthesizing CT images of different anatomical regions from MR
images with different sequences using a single model is challenging due
to the large differences between these regions and the limitations of
convolutional neural networks in capturing global context information.
In this paper, we propose a multi-scale tokens-aware Transformer network
(MTT-Net) for multi-region and multi-sequence MR-to-CT synthesis in a
single model. Specifically, we develop a multi-scale image tokens
Transformer to capture multi-scale global spatial information between
different anatomical structures in different regions. Besides, to
address the limited attention areas of tokens in Transformer, we
introduce a multi-shape window self-attention into Transformer to
enlarge the receptive fields for learning the multi-directional spatial
representations. Moreover, we adopt a domain classifier in the generator to
introduce domain knowledge for distinguishing the MR images of
different regions and sequences. The proposed MTT-Net is evaluated on a
multi-center dataset and an unseen region, and remarkable performance
was achieved with MAE of 69.33 +/- 10.39 HU, SSIM of 0.778 +/- 0.028,
and PSNR of 29.04 +/- 1.32 dB in the head and neck region, and MAE of 62.80
+/- 7.65 HU, SSIM of 0.617 +/- 0.058 and PSNR of 25.94 +/- 1.02 dB in
the abdomen region. The proposed MTT-Net outperforms state-of-the-art
methods in both accuracy and visual quality.
得益于计算机断层扫描(CT)图像的合成,补充了电子密度并消除了多模态图像配准的误差,仅磁共振(MR)放射治疗计划(RTP)的优越性已得到充分证明。越来越多的方法被提出用于 MR 到 CT 的合成。然而,使用单一模型从不同序列的 MR 图像合成不同解剖区域的 CT 图像具有挑战性,因为这些区域之间存在巨大差异,并且卷积神经网络在捕获全局上下文信息方面存在局限性。在本文中,我们提出了一种多尺度令牌感知 Transformer 网络(MTT-Net),用于在单个模型中进行多区域和多序列 MR-to-CT 合成。具体来说,我们开发了一个多尺度图像令牌 Transformer 来捕获不同区域的不同解剖结构之间的多尺度全局空间信息。此外,为了解决 Transformer 中 token 的注意力区域有限的问题,我们在 Transformer 中引入了多形状窗口自注意力,以扩大学习多方向空间表示的感受野。此外,我们在生成器中采用领域分类器来引入领域知识来区分不同区域和序列的MR图像。所提出的 MTT-Net 在多中心数据集和未见区域上进行了评估,取得了显着的性能,MAE 为 69.33 +/- 10.39 HU,SSIM 为 0.778 +/- 0.028,PSNR 为 29.04 +/- 1.32 dB头颈部区域的 MAE 为 62.80 +/- 7.65 HU,SSIM 为 0.617 +/- 0.058,腹部区域的 PSNR 为 25.94 +/- 1.02 dB。所提出的 MTT-Net 在准确性和视觉质量方面均优于最先进的方法。
AU Zhang, Jingyang
Pei, Jialun
Xu, Dunyuan
Jin, Yueming
Heng, Pheng-Ann
DC2T: Disentanglement-Guided Consolidation and Consistency Training for
Semi-Supervised Cross-Site Continual Segmentation.
DC2T:半监督跨站点连续分割的解开引导整合和一致性训练。
Continual Learning (CL) is recognized to be a storage-efficient and
privacy-protecting approach for learning from sequentially-arriving
medical sites. However, most existing CL methods assume that each site
is fully labeled, which is impractical due to budget and expertise
constraints. This paper studies the Semi-Supervised Continual Learning
(SSCL) that adopts partially-labeled sites arriving over time, with each
site delivering only limited labeled data while the majority remains
unlabeled. In this regard, it is challenging to effectively utilize
unlabeled data under dynamic cross-site domain gaps, leading to
intractable model forgetting on such unlabeled data. To address this
problem, we introduce a novel Disentanglement-guided Consolidation and
Consistency Training (DC2T) framework, which is rooted in an Online
Semi-Supervised representation Disentanglement (OSSD) perspective to
excavate content representations of partially labeled data from sites
arriving over time. Moreover, these content representations are required
to be consolidated for site-invariance and calibrated for
style-robustness, in order to alleviate forgetting even in the absence
of ground truth. Specifically, for the invariance on previous sites, we
retain historical content representations when learning on a new site,
via a Content-inspired Parameter Consolidation (CPC) method that
prevents altering the model parameters crucial for content preservation.
For the robustness against style variation, we develop a Style-induced
Consistency Training (SCT) scheme that enforces segmentation consistency
over style-related perturbations to recalibrate content encoding. We
extensively evaluate our method on fundus and cardiac image
segmentation, indicating the advantage over existing SSCL methods for
alleviating forgetting on unlabeled data.
AU Han, Bowen
Sun, Luhao
Li, Chao
Yu, Zhiyong
Jiang, Wenzong
Liu, Weifeng
Tao, Dapeng
Liu, Baodi
Deep Location Soft-Embedding-Based Network With Regional Scoring for
Mammogram Classification
Early detection and treatment of breast cancer can significantly reduce
patient mortality, and mammography is an effective method for early
screening. Computer-aided diagnosis (CAD) of mammography based on deep
learning can assist radiologists in making more objective and accurate
judgments. However, existing methods often depend on datasets with
manual segmentation annotations. In addition, due to the large image
sizes and small lesion proportions, many methods that do not use region
of interest (ROI) mostly rely on multi-scale and multi-feature fusion
models. These shortcomings increase the labor, money, and computational
overhead of applying the model. Therefore, a deep location
soft-embedding-based network with regional scoring (DLSEN-RS) is
proposed. DLSEN-RS is an end-to-end mammography image classification
method that contains only one feature extractor and relies on positional
embedding (PE) and aggregation pooling (AP) modules to locate lesion
areas without bounding boxes, transfer learning, or multi-stage
training. In particular, the introduced PE and AP modules exhibit
versatility across various CNN models and improve the model's tumor
localization and diagnostic accuracy for mammography images. Experiments
are conducted on published INbreast and CBIS-DDSM datasets, and compared
to previous state-of-the-art mammographic image classification methods,
DLSEN-RS performed satisfactorily.
AU Bi, Ning
Zakeri, Arezoo
Xia, Yan
Cheng, Nina
Taylor, Zeike A
Frangi, Alejandro F
Gooya, Ali
SegMorph: Concurrent Motion Estimation and Segmentation for Cardiac MRI
Sequences.
We propose a novel recurrent variational network, SegMorph, to perform
concurrent segmentation and motion estimation on cardiac cine magnetic
resonance image (CMR) sequences. Our model establishes a recurrent
latent space that captures spatiotemporal features from cine-MRI
sequences for multitask inference and synthesis. The proposed model
follows a recurrent variational auto-encoder framework and adopts a
learnt prior from the temporal inputs. We utilise a multi-branch decoder
to handle bi-ventricular segmentation and motion estimation
simultaneously. In addition to the spatiotemporal features from the
latent space, motion estimation enriches the supervision of sequential
segmentation tasks by providing pseudo-ground truth. On the other hand,
the segmentation branch helps with motion estimation by predicting
deformation vector fields (DVFs) based on anatomical information.
Experimental results demonstrate that the proposed method performs
better than state-of-the-art approaches qualitatively and quantitatively
for both segmentation and motion estimation tasks. We achieved an 81%
average Dice Similarity Coefficient (DSC) and a less than 3.5 mm average
Hausdorff distance on segmentation. Meanwhile, we achieved a motion
estimation Dice Similarity Coefficient of over 79%, with approximately
0.14% of pixels displaying a negative Jacobian determinant in the
estimated DVFs.
AU Bui, Doanh C
Song, Boram
Kim, Kyungeun
Kwak, Jin Tae
Spatially-constrained and -unconstrained bi-graph interaction network
for multi-organ pathology image classification.
In computational pathology, graphs have been shown to be promising for
pathology image analysis. There exist various graph structures that can
discover differing features of pathology images. However, the
combination and interaction between differing graph structures have not
been fully studied and utilized for pathology image analysis. In this
study, we propose a parallel, bi-graph neural network, designated as
SCUBa-Net, equipped with both graph convolutional networks and
Transformers, that processes a pathology image as two distinct graphs,
including a spatially-constrained graph and a spatially-unconstrained
graph. For efficient and effective graph learning, we introduce two
inter-graph interaction blocks and an intra-graph interaction block. The
inter-graph interaction blocks learn the node-to-node interactions
within each graph. The intra-graph interaction block learns the
graph-to-graph interactions at both global- and local-levels with the
help of the virtual nodes that collect and summarize the information
from the entire graphs. SCUBa-Net is systematically evaluated on four
multi-organ datasets, including colorectal, prostate, gastric, and
bladder cancers. The experimental results demonstrate the effectiveness
of SCUBa-Net in comparison to the state-of-the-art convolutional neural
networks, Transformer, and graph neural networks.
AU Deshpande, Rucha
Ozbey, Muzaffer
Li, Hua
Anastasio, Mark A
Brooks, Frank J
Assessing the capacity of a denoising diffusion probabilistic model to
reproduce spatial context.
Diffusion models have emerged as a popular family of deep generative
models (DGMs). In the literature, it has been claimed that one class of
diffusion models, denoising diffusion probabilistic models
(DDPMs), demonstrates superior image synthesis performance as compared to
generative adversarial networks (GANs). To date, these claims have been
evaluated using either ensemble-based methods designed for natural
images, or conventional measures of image quality such as structural
similarity. However, there remains an important need to understand the
extent to which DDPMs can reliably learn medical imaging domain-relevant
information, which is referred to as 'spatial context' in this work. To
address this, a systematic assessment of the ability of DDPMs to learn
spatial context relevant to medical imaging applications is reported for
the first time. A key aspect of the studies is the use of stochastic
context models (SCMs) to produce training data. In this way, the ability
of the DDPMs to reliably reproduce spatial context can be quantitatively
assessed by use of post-hoc image analyses. Error-rates in
DDPM-generated ensembles are reported, and compared to those
corresponding to other modern DGMs. The studies reveal new and important
insights regarding the capacity of DDPMs to learn spatial context.
Notably, the results demonstrate that DDPMs hold significant capacity
for generating contextually correct images that are 'interpolated'
between training samples, which may benefit data-augmentation tasks in
ways that GANs cannot.
AU Ren, Jiahao
Li, Jian
Liu, Chang
Chen, Shili
Liang, Lin
Liu, Yang
Deep Learning With Physics-Embedded Neural Network for Full Waveform
Ultrasonic Brain Imaging
The convenience, safety, and affordability of ultrasound imaging make it
a vital non-invasive diagnostic technique for examining soft tissues.
However, significant differences in acoustic impedance between the skull
and soft tissues hinder the successful application of traditional
ultrasound for brain imaging. In this study, we propose a
physics-embedded neural network with deep learning based full waveform
inversion (PEN-FWI), which can achieve reliable quantitative imaging of
brain tissues. The network consists of two fundamental components:
forward convolutional neural network (FCNN) and inversion sub-neural
network (ISNN). The FCNN explores the nonlinear mapping relationship
between the brain model and the wavefield, replacing the tedious
wavefield calculation process based on the finite difference method. The
ISNN implements the mapping from the wavefield to the model. PEN-FWI
includes three iterative steps, each embedding the FCNN into the ISNN,
ultimately achieving tomography from wavefield to brain models.
Simulation and laboratory tests indicate that PEN-FWI can produce
high-quality imaging of the skull and soft tissues, even starting from a
homogeneous water model. PEN-FWI can achieve excellent imaging of clot
models with constant uniform distribution of velocity, random Gaussian
distribution of velocity, and irregularly shaped randomly distributed
velocity. Robust differentiation can also be achieved for brain slices
of various tissues and skulls, resulting in high-quality imaging. The
imaging time for a horizontal cross-sectional image of the brain is
only 1.13 seconds. This algorithm can effectively promote
ultrasound-based brain tomography and provide feasible solutions in
other fields.
AU Geng, Haixiao
Fan, Jingfan
Yang, Shuo
Chen, Sigeng
Xiao, Deqiang
Ai, Danni
Fu, Tianyu
Song, Hong
Yuan, Kai
Duan, Feng
Wang, Yongtian
Yang, Jian
DSC-Recon: Dual-Stage Complementary 4D Organ Reconstruction from X-ray
Image Sequence for Intraoperative Fusion.
Accurately reconstructing 4D critical organs contributes to the visual
guidance in X-ray image-guided interventional operation. Current methods
estimate intraoperative dynamic meshes by refining a static initial
organ mesh from the semantic information in the single-frame X-ray
images. However, these methods fall short of reconstructing an accurate
and smooth organ sequence due to the distinct respiratory patterns
between the initial mesh and X-ray image. To overcome this limitation,
we propose a novel dual-stage complementary 4D organ reconstruction
(DSC-Recon) model for recovering dynamic organ meshes by utilizing the
preoperative and intraoperative data with different respiratory
patterns. DSC-Recon is structured as a dual-stage framework: 1) The
first stage focuses on addressing a flexible interpolation network
applicable to multiple respiratory patterns, which could generate
dynamic shape sequences between any pair of preoperative 3D meshes
segmented from CT scans. 2) In the second stage, we present a
deformation network to take the generated dynamic shape sequence as the
initial prior and explore the discriminative features (i.e., target organ
areas and meaningful motion information) in the intraoperative X-ray
images, predicting the deformed mesh by introducing a designed feature
mapping pipeline integrated into the initialized shape refinement
process. Experiments on simulated and clinical datasets demonstrate the
superiority of our method over state-of-the-art methods in both
quantitative and qualitative aspects.
AU Lu, Yucheng
Xu, Zhixin
Choi, Moon Hyung
Kim, Jimin
Jung, Seung-Won
Cross-domain Denoising for Low-dose Multi-frame Spiral Computed
Tomography.
Computed tomography (CT) has been used worldwide as a non-invasive test
to assist in diagnosis. However, the ionizing nature of X-ray exposure
raises concerns about potential health risks such as cancer. The desire
for lower radiation doses has driven researchers to improve
reconstruction quality. Although previous studies on low-dose computed
tomography (LDCT) denoising have demonstrated the effectiveness of
learning-based methods, most were developed on simulated data.
However, the real-world scenario differs significantly from the
simulation domain, especially when using the multi-slice spiral scanner
geometry. This paper proposes a two-stage method for the commercially
available multi-slice spiral CT scanners that better exploits the
complete reconstruction pipeline for LDCT denoising across different
domains. Our approach makes good use of the high redundancy of
multi-slice projections and the volumetric reconstructions while
mitigating the over-smoothing issue in conventional cascaded frameworks
caused by aggressive denoising. The dedicated design also provides a
more explicit interpretation of the data flow. Extensive experiments on
various datasets showed that the proposed method could remove up to 70%
of noise without compromising spatial resolution, while subjective
evaluations by two experienced radiologists further supported its
superior performance against state-of-the-art methods in clinical
practice. Code is available at https://github.com/YCL92/TMD-LDCT.
AU Li, Yunxiang
Shao, Hua-Chieh
Liang, Xiao
Chen, Liyuan
Li, Ruiqi
Jiang, Steve
Wang, Jing
Zhang, You
Zero-Shot Medical Image Translation via Frequency-Guided Diffusion
Models
Recently, the diffusion model has emerged as a superior generative model
that can produce high-quality, realistic images. However, for medical
image translation, the existing diffusion models are deficient in
accurately retaining structural information since the structure details
of source domain images are lost during the forward diffusion process
and cannot be fully recovered through learned reverse diffusion, while
the integrity of anatomical structures is extremely important in medical
images. For instance, errors in image translation may distort, shift, or
even remove structures and tumors, leading to incorrect diagnosis and
inadequate treatments. Training and conditioning diffusion models using
paired source and target images with matching anatomy can help. However,
such paired data are very difficult and costly to obtain, and may also
reduce the robustness of the developed model to out-of-distribution
testing data. We propose a frequency-guided diffusion model (FGDM) that
employs frequency-domain filters to guide the diffusion model for
structure-preserving image translation. Based on its design, FGDM allows
zero-shot learning, as it can be trained solely on the data from the
target domain, and used directly for source-to-target domain translation
without any exposure to the source-domain data during training. We
evaluated it on three cone-beam CT (CBCT)-to-CT translation tasks for
different anatomical sites, and a cross-institutional MR imaging
translation task. FGDM outperformed the state-of-the-art methods
(GAN-based, VAE-based, and diffusion-based) in metrics of Fréchet
Inception Distance (FID), Peak Signal-to-Noise Ratio (PSNR), and
Structural Similarity Index Measure (SSIM), showing its significant
advantages in zero-shot medical image translation.
AU Lee, Seungeun
Lee, Seunghwan
Willbrand, Ethan H
Parker, Benjamin J
Bunge, Silvia A
Weiner, Kevin S
Lyu, Ilwoo
Leveraging Input-Level Feature Deformation with Guided-Attention for
Sulcal Labeling.
The identification of cortical sulci is key for understanding functional
and structural development of the cortex. While large, consistent sulci
(or primary/secondary sulci) receive significant attention in most
studies, the exploration of smaller and more variable sulci (or putative
tertiary sulci) remains relatively under-investigated. Despite its
importance, automatic labeling of cortical sulci is challenging due to
(1) the presence of substantial anatomical variability, (2) the
relatively small size of the regions of interest (ROIs) compared to
unlabeled regions, and (3) the scarcity of annotated labels. In this
paper, we propose a novel end-to-end learning framework using a
spherical convolutional neural network (CNN). Specifically, the proposed
method learns to effectively warp geometric features in a direction that
facilitates the labeling of sulci while mitigating the impact of
anatomical variability. Moreover, we introduce a guided-attention
mechanism that takes into account the extent of deformation induced by
the learned warping. This extracts discriminative features that
emphasize sulcal ROIs, while suppressing irrelevant information of
unlabeled regions. In the experiments, we evaluate the proposed method
on 8 sulci of the posterior medial cortex. Our method outperforms
existing methods particularly in the putative tertiary sulci. The code
is publicly available at https://github.com/Shape-Lab/DSPHARM-Net.
EI 1558-254X
DA 2024-09-28
UT MEDLINE:39325613
PM 39325613
ER
AU Chen, Xiongchao
Zhou, Bo
Guo, Xueqi
Xie, Huidong
Liu, Qiong
Duncan, James S.
Sinusas, Albert J.
Liu, Chi
DuDoCFNet: Dual-Domain Coarse-to-Fine Progressive Network for
Simultaneous Denoising, Limited-View Reconstruction, and Attenuation
Correction of Cardiac SPECT
Single-Photon Emission Computed Tomography (SPECT) is widely applied for
the diagnosis of coronary artery diseases. Low-dose (LD) SPECT aims to
minimize radiation exposure but leads to increased image noise.
Limited-view (LV) SPECT, such as the latest GE MyoSPECT ES system,
enables accelerated scanning and reduces hardware expenses but degrades
reconstruction accuracy. Additionally, Computed Tomography (CT) is
commonly used to derive attenuation maps ($\mu$-maps) for attenuation
correction (AC) of cardiac SPECT, but it will introduce additional
radiation exposure and SPECT-CT misalignments. Although various methods
have been developed to solely focus on LD denoising, LV reconstruction,
or CT-free AC in SPECT, the solution for simultaneously addressing these
tasks remains challenging and under-explored. Furthermore, it is
essential to explore the potential of fusing cross-domain and
cross-modality information across these interrelated tasks to further
enhance the accuracy of each task. Thus, we propose a Dual-Domain
Coarse-to-Fine Progressive Network (DuDoCFNet), a multi-task learning
method for simultaneous LD denoising, LV reconstruction, and CT-free
$\mu$-map generation of cardiac SPECT. Paired dual-domain networks in
DuDoCFNet are cascaded using a multi-layer fusion mechanism for
cross-domain and cross-modality feature fusion. Two-stage progressive
learning strategies are applied in both projection and image domains to
achieve coarse-to-fine estimations of SPECT projections and CT-derived
$\mu$-maps. Our experiments demonstrate DuDoCFNet's superior accuracy
in estimating projections, generating $\mu$-maps, and AC
reconstructions compared to existing single- or multi-task learning
methods, under various iterations and LD levels. The source code of this
work is available at
https://github.com/XiongchaoChen/DuDoCFNet-MultiTask.
AU Ghoul, Aya
Pan, Jiazhen
Lingg, Andreas
Kuebler, Jens
Krumm, Patrick
Hammernik, Kerstin
Rueckert, Daniel
Gatidis, Sergios
Kuestner, Thomas
Attention-Aware Non-Rigid Image Registration for Accelerated MR Imaging
Accurate motion estimation at high acceleration factors enables rapid
motion-compensated reconstruction in Magnetic Resonance Imaging (MRI)
without compromising the diagnostic image quality. In this work, we
introduce an attention-aware deep learning-based framework that can
perform non-rigid pairwise registration for fully sampled and
accelerated MRI. We extract local visual representations to build
similarity maps between the registered image pairs at multiple
resolution levels and additionally leverage long-range contextual
information using a transformer-based module to alleviate ambiguities in
the presence of artifacts caused by undersampling. We combine local and
global dependencies to perform simultaneous coarse and fine motion
estimation. The proposed method was evaluated on in-house acquired fully
sampled and accelerated data of 101 patients and 62 healthy subjects
undergoing cardiac and thoracic MRI. The impact of motion estimation
accuracy on the downstream task of motion-compensated reconstruction was
analyzed. We demonstrate that our model derives reliable and consistent
motion fields across different sampling trajectories (Cartesian and
radial) and acceleration factors of up to 16x for cardiac motion and 30x
for respiratory motion and achieves superior image quality in
motion-compensated reconstruction qualitatively and quantitatively
compared to conventional and recent deep learning-based approaches.
AU Ma, Wenao
Chen, Cheng
Gong, Yuqi
Chan, Nga Yan
Jiang, Meirui
Mak, Calvin Hoi-Kwan
Abrigo, Jill M.
Dou, Qi
Causal Effect Estimation on Imaging and Clinical Data for Treatment
Decision Support of Aneurysmal Subarachnoid Hemorrhage
Aneurysmal subarachnoid hemorrhage is a medical emergency of the brain that
has high mortality and poor prognosis. Causal effect estimation of
treatment strategies on patient outcomes is crucial for aneurysmal
subarachnoid hemorrhage treatment decision-making. However, most
existing studies on treatment decision-making support of this disease
are unable to simultaneously compare the potential outcomes of different
treatments for a patient. Furthermore, these studies fail to
harmoniously integrate the imaging data with non-imaging clinical data,
both of which are useful in clinical scenarios. In this paper, we
estimate the causal effect of various treatments on patients with
aneurysmal subarachnoid hemorrhage by integrating plain CT with
non-imaging clinical data, which is represented using structured tabular
data. Specifically, we first propose a novel scheme that uses
a multi-modality confounder distillation architecture to predict the
treatment outcome and treatment assignment simultaneously. With these
distilled confounder features, we design an imaging and non-imaging
interaction representation learning strategy to use the complementary
information extracted from different modalities to balance the feature
distribution of different treatment groups. We have conducted extensive
experiments using a clinical dataset of 656 subarachnoid hemorrhage
cases, which was collected from the Hospital Authority Data
Collaboration Laboratory in Hong Kong. Our method shows consistent
improvements on the evaluation metrics of treatment effect estimation,
achieving state-of-the-art results over strong competitors. Code is
released at https://github.com/med-air/TOP-aSAH.
AU Liu, Huabing
Huang, Jiawei
Jia, Dengqiang
Wang, Qian
Xu, Jun
Shen, Dinggang
Transferring Adult-like Phase Images for Robust Multi-view Isointense
Infant Brain Segmentation.
Accurate tissue segmentation of infant brain in magnetic resonance (MR)
images is crucial for charting early brain development and identifying
biomarkers. Due to ongoing myelination and maturation, in the isointense
phase (6-9 months of age), the gray and white matters of infant brain
exhibit similar intensity levels in MR images, posing significant
challenges for tissue segmentation. Meanwhile, in the adult-like phase
around 12 months of age, the MR images show high tissue contrast and can
be easily segmented. In this paper, we propose to effectively exploit
adult-like phase images to achieve robust multi-view isointense infant
brain segmentation. Specifically, on the one hand, we transfer adult-like
phase images to the isointense view, which have similar tissue contrast
as the isointense phase images, and use the transferred images to train
an isointense-view segmentation network. On the other hand, we transfer
isointense phase images to the adult-like view, which have enhanced
tissue contrast, for training a segmentation network in the adult-like
view. The segmentation networks of different views form a multi-path
architecture that performs multi-view learning to further boost the
segmentation performance. Since anatomy-preserving style transfer is key
to the downstream segmentation task, we develop a Disentangled
Cycle-consistent Adversarial Network (DCAN) with strong regularization
terms to accurately transfer realistic tissue contrast between
isointense and adult-like phase images while still maintaining their
structural consistency. Experiments on both NDAR and iSeg-2019 datasets
demonstrate significantly superior performance of our method over the
state-of-the-art methods.
AU Chen, Tao
Wang, Chenhui
Chen, Zhihao
Lei, Yiming
Shan, Hongming
HiDiff: Hybrid Diffusion Framework for Medical Image Segmentation.
Medical image segmentation has been significantly advanced with the
rapid development of deep learning (DL) techniques. Existing DL-based
segmentation models are typically discriminative; i.e., they aim to
learn a mapping from the input image to segmentation masks. However,
these discriminative methods neglect the underlying data distribution
and intrinsic class characteristics, and thus suffer from an unstable
feature space. In this work, we propose to complement discriminative
segmentation methods with the knowledge of underlying data distribution
from generative models. To that end, we propose a novel hybrid diffusion
framework for medical image segmentation, termed HiDiff, which can
synergize the strengths of existing discriminative segmentation models
and new generative diffusion models. HiDiff comprises two key
components: a discriminative segmentor and a diffusion refiner. First, we
use any conventionally trained segmentation model as the discriminative
segmentor, which provides a segmentation mask prior for the diffusion
refiner. Second, we propose a novel binary Bernoulli diffusion model
(BBDM) as the diffusion refiner, which can effectively, efficiently, and
interactively refine the segmentation mask by modeling the underlying
data distribution. Third, we train the segmentor and BBDM in an
alternate-collaborative manner so that they boost each other. Extensive
experimental results on abdominal organ, brain tumor, polyp, and retinal
vessel segmentation datasets, covering four widely-used modalities,
demonstrate the superior performance of HiDiff over existing medical
segmentation algorithms, including the state-of-the-art transformer- and
diffusion-based ones. In addition, HiDiff excels at segmenting small
objects and generalizing to new datasets. Source code is
available at https://github.com/takimailto/HiDiff.
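The binary Bernoulli diffusion idea named in the abstract can be illustrated with a minimal sketch of a forward (noising) step on a binary mask. This is not taken from the HiDiff code. The function names `prob_one` and `noise_mask`, the schedule value `alpha_bar_t`, and the mixing rule (keep the clean pixel with weight `alpha_bar_t`, otherwise draw a fair coin) are assumptions chosen for illustration, and the authors' BBDM may be parameterized differently.

```python
import random


def prob_one(y0, alpha_bar_t):
    """P(y_t = 1 | y_0) for one binary pixel.

    Assumed rule: with weight alpha_bar_t the clean value y0 is kept,
    and with weight (1 - alpha_bar_t) it is replaced by a fair coin.
    """
    return alpha_bar_t * y0 + (1.0 - alpha_bar_t) * 0.5


def noise_mask(mask, alpha_bar_t, rng):
    """Sample a noisy binary mask y_t from a clean mask y_0 (rows of 0/1)."""
    return [
        [1 if rng.random() < prob_one(pixel, alpha_bar_t) else 0 for pixel in row]
        for row in mask
    ]


# Hypothetical 2x3 segmentation mask, used only for illustration.
clean_mask = [[0, 1, 1], [0, 0, 1]]
rng = random.Random(0)
slightly_noisy = noise_mask(clean_mask, alpha_bar_t=0.95, rng=rng)
pure_noise = noise_mask(clean_mask, alpha_bar_t=0.0, rng=rng)
```

With `alpha_bar_t` close to 1 the sampled mask stays close to the clean one, and at 0 every pixel is an independent fair coin. A refiner of the kind the abstract describes would learn to reverse such a process, conditioned on the image and on the segmentor's mask prior.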
AU Li, Zekun
Benabdallah, Nadia
Laforest, Richard
Wahl, Richard L
Thorek, Daniel L J
Jha, Abhinav K
Joint regional uptake quantification of thorium-227 and radium-223 using
a multiple-energy-window projection-domain quantitative SPECT method.
Thorium-227 (227Th)-based alpha-particle radiopharmaceutical therapies
(alpha-RPTs) are currently being investigated in several clinical and
pre-clinical studies. After administration, 227Th decays to 223Ra,
another alpha-particle-emitting isotope, which redistributes within the
patient. Reliable dose quantification of both 227Th and 223Ra is
clinically important, and SPECT may perform this quantification as these
isotopes also emit X- and gamma-ray photons. However, reliable
quantification is challenging for several reasons: the
orders-of-magnitude lower activity compared to conventional SPECT,
resulting in a very low number of detected counts, the presence of
multiple photopeaks, substantial overlap in the emission spectra of
these isotopes, and the image-degrading effects in SPECT. To address
these issues, we propose a multiple-energy-window projection-domain
quantification (MEW-PDQ) method that jointly estimates the regional
activity uptake of both 227Th and 223Ra directly using the SPECT
projection data from multiple energy windows. We evaluated the method
with realistic simulation studies conducted with anthropomorphic digital
phantoms, including a virtual imaging trial, in the context of imaging
patients with bone metastases of prostate cancer who were treated with
227Th-based alpha-RPTs. The proposed method yielded reliable (accurate
and precise) regional uptake estimates of both isotopes and outperformed
state-of-the-art methods across different lesion sizes and contrasts, as
well as in the virtual imaging trial. This reliable performance was also
observed with moderate levels of intra-regional heterogeneous uptake as
well as when there were moderate inaccuracies in the definitions of the
support of various regions. Additionally, we demonstrated the
effectiveness of using multiple energy windows and showed that the
variance of the uptake estimated with the proposed method approached the
theoretical limit defined by the Cramer-Rao lower bound. These results provide
strong evidence in support of this method for reliable uptake
quantification in 227Th-based alpha-RPTs.
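The Cramer-Rao lower bound referred to above can be made concrete with a toy calculation. The sketch below assumes a deliberately simplified model that is not from the paper: a single region, two isotopes, and Poisson-distributed counts in each energy window with mean `h_th * lam_th + h_ra * lam_ra`. The sensitivity values in `H` and the uptake values in `lam` are hypothetical, and the function name is an illustrative choice.

```python
def poisson_crb_two_isotopes(H, lam):
    """Cramer-Rao lower bound (2x2 covariance bound) for two uptake values.

    H   : list of rows [h_th, h_ra], one row per energy window (assumed known)
    lam : [lam_th, lam_ra], the true uptakes of the two isotopes
    Counts in window i are assumed Poisson with mean h_th*lam_th + h_ra*lam_ra.
    """
    f11 = f12 = f22 = 0.0
    for h_th, h_ra in H:
        mean = h_th * lam[0] + h_ra * lam[1]
        # Fisher information for Poisson data: sum_i h_i h_i^T / mean_i
        f11 += h_th * h_th / mean
        f12 += h_th * h_ra / mean
        f22 += h_ra * h_ra / mean
    det = f11 * f22 - f12 * f12
    # Closed-form inverse of the 2x2 Fisher information matrix
    return [[f22 / det, -f12 / det], [-f12 / det, f11 / det]]


# Hypothetical sensitivities for three energy windows with spectral overlap.
H = [[0.8, 0.1], [0.3, 0.6], [0.1, 0.4]]
crb = poisson_crb_two_isotopes(H, lam=[50.0, 20.0])
var_bound_th, var_bound_ra = crb[0][0], crb[1][1]
```

The diagonal entries bound the variance of any unbiased estimator of each uptake. In this toy setting a single energy window gives a singular Fisher matrix, so the two isotopes cannot be separated, which is one way to see why using several windows helps.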
AU Li, Shiyu
Qiao, Pengchong
Wang, Lin
Ning, Munan
Yuan, Li
Zheng, Yefeng
Chen, Jie
An Organ-aware Diagnosis Framework for Radiology Report Generation.
Radiology report generation (RRG) is crucial for saving radiologists'
valuable time in drafting reports, thereby increasing their work
efficiency. Compared to typical methods that directly transfer image
captioning technologies to RRG, our approach incorporates organ-wise
priors into the report generation. Specifically, in this paper, we
propose Organ-aware Diagnosis (OaD) to generate diagnostic reports
containing descriptions of each physiological organ. During training, we
first develop a task distillation (TD) module to extract organ-level
descriptions from reports. We then introduce an organ-aware report
generation module that, for one thing, provides a specific description
for each organ, and for another, simulates clinical situations to
provide short descriptions for normal cases. Furthermore, we design an
auto-balance mask loss to ensure balanced training for normal/abnormal
descriptions and various organs simultaneously. Being intuitively
reasonable and practically simple, our OaD outperforms SOTA alternatives
by large margins on commonly used IU-Xray and MIMIC-CXR datasets, as
evidenced by a 3.4% BLEU-1 improvement on MIMIC-CXR and 2.0% BLEU-2
improvement on IU-Xray.
AU He, Hailong
Paetzold, Johannes C.
Boerner, Nils
Riedel, Erik
Gerl, Stefan
Schneider, Simon
Fisher, Chiara
Ezhov, Ivan
Shit, Suprosanna
Li, Hongwei
Ruckert, Daniel
Aguirre, Juan
Biedermann, Tilo
Darsow, Ulf
Menze, Bjoern
Ntziachristos, Vasilis
Machine Learning Analysis of Human Skin by Optoacoustic Mesoscopy for
Automated Extraction of Psoriasis and Aging Biomarkers
Ultra-wideband raster-scan optoacoustic mesoscopy (RSOM) is a novel
modality that has demonstrated unprecedented ability to visualize
epidermal and dermal structures in-vivo. However, an automatic and
quantitative analysis of three-dimensional RSOM datasets remains
unexplored. In this work we present our framework: Deep Learning RSOM
Analysis Pipeline (DeepRAP), to analyze and quantify morphological skin
features recorded by RSOM and extract imaging biomarkers for disease
characterization. DeepRAP uses a multi-network segmentation strategy
based on convolutional neural networks with transfer learning. This
strategy enabled the automatic recognition of skin layers and subsequent
segmentation of dermal microvasculature with an accuracy equivalent to
human assessment. DeepRAP was validated against manual segmentation on
25 psoriasis patients under treatment and our biomarker extraction was
shown to characterize disease severity and progression well with a
strong correlation to physician evaluation and histology. In a unique
validation experiment, we applied DeepRAP to a time series of
occlusion-induced hyperemia from 10 healthy volunteers. We observed how
the biomarkers decrease and recover during the occlusion and release
process, demonstrating accurate performance and reproducibility of
DeepRAP. Furthermore, we analyzed a cohort of 75 volunteers and defined
a relationship between aging and microvascular features in-vivo. More
precisely, this study revealed that fine microvascular features in the
dermal layer have the strongest correlation to age. The ability of our
newly developed framework to enable the rapid study of human skin
morphology and microvasculature in-vivo promises to replace biopsy
studies, increasing the translational potential of RSOM.
AU Wang, Hongyi
Luo, Luyang
Wang, Fang
Tong, Ruofeng
Chen, Yen-Wei
Hu, Hongjie
Lin, Lanfen
Chen, Hao
Rethinking Multiple Instance Learning for Whole Slide Image
Classification: A Bag-Level Classifier is a Good Instance-Level Teacher.
Multiple Instance Learning (MIL) has demonstrated promise in Whole Slide
Image (WSI) classification. However, a major challenge persists due to
the high computational cost associated with processing these gigapixel
images. Existing methods generally adopt a two-stage approach,
comprising a non-learnable feature embedding stage and a classifier
training stage. Although using a fixed feature embedder pre-trained on
other domains can greatly reduce memory consumption, such a scheme
also results in a disparity between the two stages, leading to
suboptimal classification accuracy. To address this issue, we propose
that a bag-level classifier can be a good instance-level teacher. Based
on this idea, we design Iteratively Coupled Multiple Instance Learning
(ICMIL) to couple the embedder and the bag classifier at a low cost.
ICMIL initially fixes the patch embedder to train the bag classifier,
followed by fixing the bag classifier to fine-tune the patch embedder.
The refined embedder can then generate better representations in return,
leading to a more accurate classifier for the next iteration. To realize
more flexible and more effective embedder fine-tuning, we also introduce
a teacher-student framework to efficiently distill the category
knowledge in the bag classifier to help the instance-level embedder
fine-tuning. Intensive experiments were conducted on four distinct
datasets to validate the effectiveness of ICMIL. The experimental
results consistently demonstrated that our method significantly improves
the performance of existing MIL backbones, achieving state-of-the-art
results. The code and the organized datasets can be accessed at:
https://github.com/Dootmaan/ICMIL/tree/confidence-based.
AU Gras, V.
Boulant, N.
Luong, M.
Morel, L.
Le Touz, N.
Adam, J. -P.
Joly, J. -C.
A Mathematical Analysis of Clustering-Free Local SAR Compression
Algorithms for MRI Safety in Parallel Transmission
Parallel transmission (pTX) is a versatile solution to enable UHF MRI of
the human body, where radiofrequency (RF) field inhomogeneity appears
very challenging. Today, state-of-the-art monitoring of the local SAR in
pTX consists of evaluating the RF power deposition on specific SAR
matrices called Virtual Observation Points (VOPs). It essentially relies
on accurate electromagnetic simulations able to return the local SAR
distribution inside the body in response to any applied pTX RF waveform.
In order to reduce the number of SAR matrices to a value compatible with
real-time SAR monitoring (<< 10^3), a VOP set is obtained by
partitioning the SAR model into clusters and associating a so-called
dominant SAR matrix to every cluster. More recently, a clustering-free
compression method was proposed, allowing for a significant reduction in
the number of SAR matrices. The concept and derivation however assumed
static RF shims and their extension to dynamic pTX is not
straightforward, thereby casting doubt on the strict validity of the
compression approach for these more complicated RF waveforms. In this
work, we provide the mathematical framework to tackle this problem and
find a rigorous justification of this criterion in the light of convex
optimization theory. Our analysis led us to a variant of the
clustering-free compression approach exploiting convex optimization.
This new compression algorithm offers computational gains for large SAR
models and for high-channel count pTX RF coils.
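Evaluating RF power deposition on a set of SAR matrices can be sketched as a quadratic form in the complex RF excitation vector, followed by a maximum over the set. The sketch below is a generic illustration, not the compression algorithm analyzed in the paper. It assumes each SAR matrix is Hermitian positive semidefinite, and the matrices, the excitation vector, and the function names are hypothetical.

```python
def local_sar(Q, x):
    """Quadratic form x^H Q x for a Hermitian matrix Q (list of rows)
    and a complex excitation vector x (one entry per transmit channel)."""
    n = len(x)
    total = sum(
        x[i].conjugate() * Q[i][j] * x[j] for i in range(n) for j in range(n)
    )
    # For Hermitian Q the result is real up to rounding error.
    return total.real


def peak_sar_over_vops(vops, x):
    """Largest quadratic-form value over a set of SAR matrices (e.g. VOPs)."""
    return max(local_sar(Q, x) for Q in vops)


# Hypothetical two-channel example.
vops = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[4.0, 0.0], [0.0, 0.0]],
]
x = [1.0 + 0.0j, 0.0 + 1.0j]
peak = peak_sar_over_vops(vops, x)
```

Because the cost of this check grows with the number of matrices, reducing the set while still bounding the full SAR model from above is what makes real-time monitoring feasible.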
AU Ranjbaran, Seyed Mohsen
Aghamiry, Hossein S.
Gholami, Ali
Operto, Stephane
Avanaki, Kamran
Quantitative Photoacoustic Tomography Using Iteratively Refined
Wavefield Reconstruction Inversion: A Simulation Study
The ultimate goal of photoacoustic tomography is to accurately map the
absorption coefficient throughout the imaged tissue. Most studies assume
either that acoustic properties of biological tissues, such as speed of
sound (SOS) and acoustic attenuation, are homogeneous or that fluence is
uniform throughout the entire tissue. These assumptions reduce the
accuracy of estimations of derived absorption coefficients (DeACs). Our
quantitative photoacoustic tomography (qPAT) method estimates DeACs
using iteratively refined wavefield reconstruction inversion (IR-WRI)
which incorporates the alternating direction method of multipliers to
solve the cycle skipping challenge associated with full wave inversion
algorithms. Our method compensates for SOS inhomogeneity, fluence decay,
and acoustic attenuation. We evaluate the performance of our method on a
neonatal head digital phantom.
AU Xing, Paul
Poree, Jonathan
Rauby, Brice
Malescot, Antoine
Martineau, Eric
Perrot, Vincent
Rungta, Ravi L.
Provost, Jean
Phase Aberration Correction for In Vivo Ultrasound Localization
Microscopy Using a Spatiotemporal Complex-Valued Neural Network
Ultrasound Localization Microscopy (ULM) can map microvessels at a
resolution of a few micrometers (µm). Transcranial ULM remains
challenging in the presence of aberrations caused by the skull, which
lead to localization errors. Herein, we propose a deep learning approach
based on recently introduced complex-valued convolutional neural
networks (CV-CNNs) to retrieve the aberration function, which can then
be used to form enhanced images using standard delay-and-sum
beamforming. CV-CNNs were selected as they can apply time delays through
multiplication with in-phase quadrature input data. Predicting the
aberration function rather than corrected images also confers enhanced
explainability to the network. In addition, 3D spatiotemporal
convolutions were used for the network to leverage entire microbubble
tracks. For training and validation, we used an anatomically and
hemodynamically realistic mouse brain microvascular network model to
simulate the flow of microbubbles in the presence of aberration. The
proposed CV-CNN performance was compared to the coherence-based method
by using microbubble tracks. We then confirmed the capability of the
proposed network to generalize to transcranial in vivo data in the mouse
brain (n=3). Vascular reconstructions using a locally predicted
aberration function included additional and sharper vessels. The CV-CNN
was more robust than the coherence-based method and could perform
aberration correction in a 6-month-old mouse. After correction, we
measured a resolution of 15.6 µm for younger mice, representing an
improvement of 25.8%, while the resolution was improved by 13.9% for the
6-month-old mouse. This work leads to different applications for
complex-valued convolutions in biomedical imaging and strategies to
perform transcranial ULM.
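The remark that time delays can be applied through multiplication with in-phase/quadrature (IQ) data rests on a standard narrowband approximation: delaying a signal with center frequency f0 by tau rotates its IQ sample by the phase -2*pi*f0*tau. The sketch below shows only that relation. It is not the network from the paper, the function name and the 5 MHz center frequency are hypothetical, and the sign convention depends on how the demodulation is defined.

```python
import cmath
import math


def delay_iq_sample(iq, f0_hz, tau_s):
    """Apply a small time delay tau_s to a narrowband IQ sample.

    Under the narrowband assumption the envelope is unchanged and the
    delay reduces to a complex phase rotation of -2*pi*f0*tau.
    """
    return iq * cmath.exp(-2j * math.pi * f0_hz * tau_s)


f0 = 5e6                      # hypothetical center frequency (5 MHz)
quarter_period = 1.0 / (4.0 * f0)
delayed = delay_iq_sample(1.0 + 0.0j, f0, quarter_period)
# A quarter-period delay rotates the sample by -90 degrees.
```

Since a complex-valued convolution kernel multiplies its input by complex weights, it can represent such phase rotations directly, which matches the motivation the abstract gives for choosing CV-CNNs.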
AU Liu, Xuan
Xie, Yaoqin
Diao, Songhui
Tan, Shan
Liang, Xiaokun
Unsupervised CT Metal Artifact Reduction by Plugging Diffusion Priors in
Dual Domains.
During the process of computed tomography (CT), metallic implants often
cause disruptive artifacts in the reconstructed images, impeding
accurate diagnosis. Many supervised deep learning-based approaches have
been proposed for metal artifact reduction (MAR). However, these methods
heavily rely on training with paired simulated data, which are
challenging to acquire. This limitation can lead to decreased
performance when applying these methods in clinical practice. Existing
unsupervised MAR methods, whether based on learning or not, typically
work within a single domain, either in the image domain or the sinogram
domain. In this paper, we propose an unsupervised MAR method based on
the diffusion model, a generative model with a high capacity to
represent data distributions. Specifically, we first train a diffusion
model using CT images without metal artifacts. Subsequently, we
iteratively introduce the diffusion priors in both the sinogram domain
and image domain to restore the degraded portions caused by metal
artifacts. In addition, we design temporally dynamic weight masks for the
image-domain fusion. The dual-domain processing enables our approach to
outperform existing unsupervised MAR methods, including another MAR
method based on a diffusion model. The effectiveness has been
qualitatively and quantitatively validated on synthetic datasets.
Moreover, our method demonstrates superior visual results among both
supervised and unsupervised methods on clinical datasets. Code is
available at github.com/DeepXuan/DuDoDp-MAR.
AU Beuret, Samuel
Thiran, Jean-Philippe
Windowed Radon Transform and Tensor Rank-1 Decomposition for Adaptive
Beamforming in Ultrafast Ultrasound
Ultrafast ultrasound has recently emerged as an alternative to
traditional focused ultrasound. By virtue of the low number of
insonifications it requires, ultrafast ultrasound enables the imaging of
the human body at potentially very high frame rates. However,
unaccounted-for speed-of-sound variations in the insonified medium often
result in phase aberrations in the reconstructed images. The diagnostic
capability of ultrafast ultrasound is thus ultimately impeded.
Therefore, there is a strong need for adaptive beamforming methods that
are resilient to speed-of-sound aberrations. Several such techniques
have been proposed recently, but they often lack parallelizability or the
ability to directly correct both transmit and receive phase aberrations.
In this article, we introduce an adaptive beamforming method designed to
address these shortcomings. To do so, we compute the windowed Radon
transform of several complex radio-frequency images reconstructed using
delay-and-sum. Then, we apply weighted tensor rank-1 decompositions to
the obtained local sinograms, and the results are used to
reconstruct a corrected image. We demonstrate using simulated and
in-vitro data that our method is able to successfully recover
aberration-free images and that it outperforms both coherent compounding
and the recently introduced SVD beamformer. Finally, we validate the
proposed beamforming technique on in-vivo data, resulting in a
significant improvement of image quality compared to the two reference
methods.
AU Nguyen, Huy Hoang
Blaschko, Matthew B.
Saarakkala, Simo
Tiulpin, Aleksei
Clinically-Inspired Multi-Agent Transformers for Disease Trajectory
Forecasting From Multimodal Data
Deep neural networks are often applied to medical images to automate the
problem of medical diagnosis. However, a more clinically relevant
question that practitioners usually face is how to predict the future
trajectory of a disease. Current methods for prognosis or disease
trajectory forecasting often require domain knowledge and are
complicated to apply. In this paper, we formulate the prognosis
prediction problem as a one-to-many prediction problem. Inspired by a
clinical decision-making process with two agents, a radiologist and a
general practitioner, we predict prognosis with two transformer-based
components that share information with each other. The first transformer
in this framework aims to analyze the imaging data, and the second one
leverages its internal states as inputs, also fusing them with auxiliary
clinical data. The temporal nature of the problem is modeled within the
transformer states, allowing us to treat the forecasting problem as a
multi-task classification, for which we propose a novel loss. We show
the effectiveness of our approach in predicting the development of
structural knee osteoarthritis changes and forecasting Alzheimer's
disease clinical status directly from raw multi-modal data. The proposed
method outperforms multiple state-of-the-art baselines with respect to
performance and calibration, both of which are needed for real-world
applications. An open-source implementation of our method is made
publicly available at https://github.com/Oulu-IMEDS/CLIMATv2.
AU Wu, Lingyun
Gao, Xiang
Hu, Zhiqiang
Zhang, Shaoting
Pattern-Aware Transformer: Hierarchical Pattern Propagation in
Sequential Medical Images
This paper investigates how to effectively mine contextual information
among sequential images and jointly model them in medical imaging tasks.
Different from state-of-the-art methods that model sequential
correlations via point-wise token encoding, this paper develops a novel
hierarchical pattern-aware tokenization strategy. It handles distinct
visual patterns independently and hierarchically, which not only ensures
the full flexibility of attention aggregation under different pattern
representations but also preserves both local and global information
simultaneously. Based on this strategy, we propose a Pattern-Aware
Transformer (PATrans) featuring a global-local dual-path pattern-aware
cross-attention mechanism to achieve hierarchical pattern matching and
propagation among sequential images. Furthermore, PATrans is
plug-and-play and can be seamlessly integrated into various backbone
networks for diverse downstream sequence modeling tasks. We demonstrate
its general application paradigm across four domains and five benchmarks
in video object detection and 3D volumetric semantic segmentation tasks,
respectively. Impressively, PATrans sets a new state of the art across all
these benchmarks, i.e., CVC-Video (92.3% detection F1), ASU-Mayo (99.1%
localization F1), Lung Tumor (78.59% DSC), Nasopharynx Tumor (75.50%
DSC), and Kidney Tumor (87.53% DSC). Codes and models are available at
https://github.com/GGaoxiang/PATrans.
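The DSC figures quoted above refer to the Dice similarity coefficient, 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a reference mask B. A minimal sketch for flattened binary masks follows. The function name and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not details from the paper.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two flattened binary masks.

    pred, target : equal-length sequences of 0/1 (or False/True) labels.
    Returns 1.0 when both masks are empty (assumed convention).
    """
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Hypothetical prediction and ground truth over four voxels.
score = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
```

Here one voxel overlaps while the masks contain two and one foreground voxels respectively, so the score is 2/3. Reported percentages such as 87.53% DSC are this quantity, typically averaged over test volumes and multiplied by 100.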
C1 SenseTime Res, Shanghai 200233, Peoples R China
C1 Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
C3 SenseTime Res
C3 Shanghai Artificial Intelligence Lab
SN 0278-0062
EI 1558-254X
DA 2024-03-13
UT WOS:001158081600002
PM 37594875
ER
AU Cui, Zhuo-Xu
Cao, Chentao
Wang, Yue
Jia, Sen
Cheng, Jing
Liu, Xin
Zheng, Hairong
Liang, Dong
Zhu, Yanjie
SPIRiT-Diffusion: Self-Consistency Driven Diffusion Model for
Accelerated MRI.
Diffusion models have emerged as a leading methodology for image
generation and have proven successful in the realm of magnetic resonance
imaging (MRI) reconstruction. However, existing reconstruction methods
based on diffusion models are primarily formulated in the image domain,
making the reconstruction quality susceptible to inaccuracies in coil
sensitivity maps (CSMs). k-space interpolation methods can effectively
address this issue but conventional diffusion models are not readily
applicable in k-space interpolation. To overcome this challenge, we
introduce a novel approach called SPIRiT-Diffusion, which is a diffusion
model for k-space interpolation inspired by the iterative
self-consistent SPIRiT method. Specifically, we utilize the iterative
solver of the self-consistent term (i.e., k-space physical prior) in
SPIRiT to formulate a novel stochastic differential equation (SDE)
governing the diffusion process. Subsequently, k-space data can be
interpolated by executing the diffusion process. This innovative
approach highlights the optimization model's role in designing the SDE
in diffusion models, enabling the diffusion process to align closely
with the physics inherent in the optimization model, a concept referred
to as model-driven diffusion. We evaluated the proposed SPIRiT-Diffusion
method using a 3D joint intracranial and carotid vessel wall imaging
dataset. The results convincingly demonstrate its superiority over
image-domain reconstruction methods, achieving high reconstruction
quality even at a substantial acceleration rate of 10. Our code is
available at https://github.com/zhyjSIAT/SPIRiT-Diffusion.
EI 1558-254X
DA 2024-10-05
UT MEDLINE:39361455
PM 39361455
ER
AU Gong, Shizhan
Long, Yonghao
Chen, Kai
Liu, Jiaqi
Xiao, Yuliang
Cheng, Alexis
Wang, Zerui
Dou, Qi
Self-Supervised Cyclic Diffeomorphic Mapping for Soft Tissue Deformation
Recovery in Robotic Surgery Scenes.
The ability to recover tissue deformation from visual features is
fundamental for many robotic surgery applications. This has been a
long-standing research topic in computer vision but remains
unsolved due to the complex dynamics of soft tissues being manipulated
by surgical instruments. The ambiguous pixel correspondence caused by
homogeneous texture makes achieving dense and accurate tissue tracking
even more challenging. In this paper, we propose a novel self-supervised
framework to recover tissue deformations from stereo surgical videos.
Our approach integrates semantics, cross-frame motion flow, and
long-range temporal dependencies to enable the recovered deformations to
represent actual tissue dynamics. Moreover, we incorporate diffeomorphic
mapping to regularize the warping field to be physically realistic. To
comprehensively evaluate our method, we collected stereo surgical video
clips containing three types of tissue manipulation (i.e., pushing,
dissection and retraction) from two different types of surgeries (i.e.,
hemicolectomy and mesorectal excision). Our method achieves
impressive results in capturing deformation in 3D mesh, and generalizes
well across manipulations and surgeries. It also outperforms current
state-of-the-art methods on non-rigid registration and optical flow
estimation. To the best of our knowledge, this is the first work on
self-supervised learning for dense tissue deformation modeling from
stereo surgical videos. Our code will be released.
AU Rajagopal, Abhejit
Westphalen, Antonio C.
Velarde, Nathan
Simko, Jeffry P.
Nguyen, Hao
Hope, Thomas A.
Larson, Peder E. Z.
Magudia, Kirti
Mixed Supervision of Histopathology Improves Prostate Cancer
Classification From MRI
Non-invasive prostate cancer classification from MRI has the potential
to revolutionize patient care by providing early detection of clinically
significant disease, but has thus far shown limited positive predictive
value. To address this, we present an image-based deep learning method to
predict clinically significant prostate cancer from screening MRI in
patients that subsequently underwent biopsy with results ranging from
benign pathology to the highest grade tumors. Specifically, we
demonstrate that mixed supervision via diverse histopathological ground
truth improves classification performance despite the cost of reduced
concordance with image-based segmentation. Where prior approaches have
utilized pathology results as ground truth derived from targeted
biopsies and whole-mount prostatectomy to strongly supervise the
localization of clinically significant cancer, our approach also
utilizes weak supervision signals extracted from nontargeted systematic
biopsies with regional localization to improve overall performance. Our
key innovation is performing regression by distribution rather than
simply by value, enabling use of additional pathology findings
traditionally ignored by deep learning strategies. We evaluated our
model on a dataset of 973 (testing n=198) multi-parametric prostate MRI
exams collected at UCSF from 2016 to 2019, followed by MRI/ultrasound fusion
(targeted) biopsy and systematic (nontargeted) biopsy of the prostate
gland, demonstrating that deep networks trained with mixed supervision
of histopathology can feasibly exceed the performance of the Prostate
Imaging-Reporting and Data System (PI-RADS) clinical standard for
prostate MRI interpretation (71.6% vs 66.7% balanced accuracy and 0.724
vs 0.716 AUC).
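The balanced accuracy quoted above is conventionally the mean of sensitivity and specificity. The following is a minimal sketch of that standard definition for binary labels; the function name and the toy labels are illustrative assumptions and are not taken from the paper or its data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive).

    Assumes both classes occur in y_true; otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return 0.5 * (sensitivity + specificity)


# Toy labels invented for illustration only (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
score = balanced_accuracy(y_true, y_pred)
```

Unlike plain accuracy, this metric weights both classes equally, which matters when clinically significant cancers are a minority of the exams.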
AU Urban, Theresa
Noichl, Wolfgang
Engel, Klaus Juergen
Koehler, Thomas
Pfeiffer, Franz
Correction for X-Ray Scatter and Detector Crosstalk in Dark-Field
Radiography
Dark-field radiography, a new X-ray imaging method, has recently been
applied to human chest imaging for the first time. It employs
conventional X-ray devices in combination with a Talbot-Lau
interferometer with a large field of view, providing both attenuation
and dark-field radiographs. It is well known that sample scatter creates
artifacts in both modalities. Here, we demonstrate that X-ray
scatter generated by the interferometer as well as detector crosstalk
also create artifacts in the dark-field radiographs, in addition to the
expected loss of spatial resolution. We propose deconvolution-based
correction methods for the induced artifacts. The kernel for detector
crosstalk is measured and fitted to a model, while the kernel for
scatter from the analyzer grating is calculated by a Monte-Carlo
simulation. To correct for scatter from the sample, we adapt an
algorithm used for scatter correction in conventional radiography. We
validate the obtained corrections with a water phantom. Finally, we show
the impact of detector crosstalk, scatter from the analyzer grating and
scatter from the sample and their successful correction on dark-field
images of a human thorax.
AU Wang, Qi
Wen, Zhijie
Shi, Jun
Wang, Qian
Shen, Dinggang
Ying, Shihui
Spatial and Modal Optimal Transport for Fast Cross-Modal MRI
Reconstruction.
Multi-modal magnetic resonance imaging (MRI) plays a crucial role in
comprehensive disease diagnosis in clinical medicine. However, acquiring
certain modalities, such as T2-weighted images (T2WIs), is
time-consuming and prone to motion artifacts, which negatively
impacts subsequent multi-modal image analysis. To address this issue, we
propose an end-to-end deep learning framework that utilizes T1-weighted
images (T1WIs) as an auxiliary modality to expedite T2WI acquisition.
While image pre-processing is capable of mitigating misalignment,
improper parameter selection leads to adverse pre-processing effects,
requiring iterative experimentation and adjustment. To overcome this
shortcoming, we employ Optimal Transport (OT) to synthesize T2WIs by
aligning T1WIs and performing cross-modal synthesis, effectively
mitigating spatial misalignment effects. Furthermore, we adopt an
alternating iteration framework between the reconstruction task and the
cross-modal synthesis task to optimize the final results. Then, we prove
that the reconstructed T2WIs and the synthetic T2WIs become closer on
the T2 image manifold as iterations increase, and further illustrate
that the improved reconstruction result enhances the synthesis process,
whereas the enhanced synthesis result improves the reconstruction
process. Finally, experimental results from FastMRI and internal
datasets confirm the effectiveness of our method, demonstrating
significant improvements in image reconstruction quality even at low
sampling rates.
AU Zhu, Hui
Zeng, Yi
Cai, Xiran
Passive Acoustic Mapping for Convex Arrays With the Helical Wave
Spectrum Method
Passive acoustic mapping (PAM) has emerged as a valuable imaging
modality for monitoring the cavitation activity in focused ultrasound
therapies. When it comes to imaging in the human abdomen, convex arrays
are preferred due to their large acoustic window. However, existing PAM
methods for convex arrays rely on the computationally expensive
delay-and-sum (DAS) operation, which limits the image reconstruction speed
when the field-of-view (FOV) is large. In this work, we propose an
efficient and frequency-selective PAM method for convex arrays. This
method is based on projecting the helical wave spectrum (HWS) between
cylindrical surfaces in the imaging field. Both the in silico and in
vitro experiments showed that the HWS method achieves image
quality and acoustic cavitation source localization accuracy comparable to
the DAS-based methods. Compared to the frequency-domain and time-domain
DAS methods, the time complexity of the HWS method is reduced by one
order and two orders of magnitude, respectively. A parallel
implementation of the HWS method realized millisecond-level image
reconstruction speed. We also show that the HWS method is inherently
capable of mapping microbubble (MB) cavitation activity in different
states, i.e., no cavitation, stable cavitation, or inertial cavitation.
After compensating for the lens effects of the convex array, we further
combined PAM formed by the HWS method and B-mode imaging as a real-time
dual-mode imaging approach to map the anatomical location where MBs
cavitate in a liver phantom experiment. This method may find use in
applications where convex arrays are required for cavitation activity
monitoring in real time.
C1 ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai 201210, Peoples
R China
C1 Chinese Acad Sci, Shanghai Adv Res Inst, Shanghai 201210, Peoples R
China
C1 Univ Chinese Acad Sci, Beijing 100049, Peoples R China
C1 ShanghaiTech Univ, Shanghai Engn Res Ctr Intelligent Vis & Imaging,
Shanghai 201210, Peoples R China
SN 0278-0062
EI 1558-254X
DA 2024-05-23
UT WOS:001214547800026
PM 38198274
ER
AU Feng, Ruimin
Wu, Qing
Feng, Jie
She, Huajun
Liu, Chunlei
Zhang, Yuyao
Wei, Hongjiang
IMJENSE: Scan-Specific Implicit Representation for Joint Coil
Sensitivity and Image Estimation in Parallel MRI
Parallel imaging is a commonly used technique to accelerate magnetic
resonance imaging (MRI) data acquisition. Mathematically, parallel MRI
reconstruction can be formulated as an inverse problem relating the
sparsely sampled k-space measurements to the desired MRI image. Despite
the success of many existing reconstruction algorithms, it remains a
challenge to reliably reconstruct a high-quality image from highly
reduced k-space measurements. Recently, implicit neural representation
has emerged as a powerful paradigm to exploit the internal information
and the physics of partially acquired data to generate the desired
object. In this study, we introduced IMJENSE, a scan-specific implicit
neural representation-based method for improving parallel MRI
reconstruction. Specifically, the underlying MRI image and coil
sensitivities were modeled as continuous functions of spatial
coordinates, parameterized by neural networks and polynomials,
respectively. The weights in the networks and coefficients in the
polynomials were simultaneously learned directly from sparsely acquired
k-space measurements, without fully sampled ground truth data for
training. Benefiting from the powerful continuous representation and
joint estimation of the MRI image and coil sensitivities, IMJENSE
outperforms conventional image or k-space domain reconstruction
algorithms. With extremely limited calibration data, IMJENSE is more
stable than supervised calibrationless and calibration-based
deep-learning methods. Results show that IMJENSE robustly reconstructs
the images acquired at 5x and 6x accelerations with only 4 or 8
calibration lines in 2D Cartesian acquisitions, corresponding to 22.0%
and 19.5% undersampling rates. The high-quality results and scan
specificity give the proposed method the potential to further
accelerate the data acquisition of parallel MRI.
AU Li, Heng
Lin, Ziqin
Qiu, Zhongxi
Li, Zinan
Niu, Ke
Guo, Na
Fu, Huazhu
Hu, Yan
Liu, Jiang
Enhancing and Adapting in the Clinic: Source-Free Unsupervised Domain
Adaptation for Medical Image Enhancement
Medical imaging provides many valuable clues involving anatomical
structure and pathological characteristics. However, image degradation
is a common issue in clinical practice, which can adversely impact the
observation and diagnosis by physicians and algorithms. Although
many enhancement models have been developed, these models require
thorough pre-training before deployment and fail to take advantage of
the potential value of inference data after deployment. In this paper,
we propose an algorithm for source-free unsupervised domain adaptive
medical image enhancement (SAME), which adapts and optimizes enhancement
models using test data in the inference phase. A structure-preserving
enhancement network is first constructed to learn a robust source model
from synthesized training data. Then a teacher-student model is
initialized with the source model and conducts source-free unsupervised
domain adaptation (SFUDA) by knowledge distillation with the test data.
Additionally, a pseudo-label picker is developed to boost the knowledge
distillation of enhancement tasks. Experiments were implemented on ten
datasets from three medical image modalities to validate the advantage
of the proposed algorithm, and setting analysis and ablation studies
were also carried out to interpret the effectiveness of SAME. The
remarkable enhancement performance and benefits for downstream tasks
demonstrate the potential and generalizability of SAME. The code is
available at
https://github.com/liamheng/Annotation-free-Medical-Image-Enhancement.
AU Murali, Aditya
Alapatt, Deepak
Mascagni, Pietro
Vardazaryan, Armine
Garcia, Alain
Okamoto, Nariaki
Mutter, Didier
Padoy, Nicolas
Latent Graph Representations for Critical View of Safety Assessment
Assessing the critical view of safety (CVS) in laparoscopic cholecystectomy
requires accurate identification and localization of key anatomical
structures, reasoning about their geometric relationships to one
another, and determining the quality of their exposure. Prior works have
approached this task by including semantic segmentation as an
intermediate step, using predicted segmentation masks to then predict
the CVS. While these methods are effective, they rely on extremely
expensive ground-truth segmentation annotations and tend to fail when
the predicted segmentation is incorrect, limiting generalization. In
this work, we propose a method for CVS prediction wherein we first
represent a surgical image using a disentangled latent scene graph, then
process this representation using a graph neural network. Our graph
representations explicitly encode semantic information - object
location, class information, geometric relations - to improve
anatomy-driven reasoning, as well as visual features to retain
differentiability and thereby provide robustness to semantic errors.
Finally, to address annotation cost, we propose to train our method
using only bounding box annotations, incorporating an auxiliary image
reconstruction objective to learn fine-grained object boundaries. We
show that our method not only outperforms several baseline methods when
trained with bounding box annotations, but also scales effectively when
trained with segmentation masks, maintaining state-of-the-art
performance.
AU Zhang, Yuanming
Li, Zheng
Han, Xiangmin
Ding, Saisai
Li, Juncheng
Wang, Jun
Ying, Shihui
Shi, Jun
Pseudo-Data Based Self-Supervised Federated Learning for Classification
of Histopathological Images
Computer-aided diagnosis (CAD) can help pathologists improve diagnostic
accuracy together with consistency and repeatability for cancers.
However, the CAD models trained with the histopathological images only
from a single center (hospital) generally suffer from the generalization
problem due to the staining inconsistencies among different centers. In
this work, we propose a pseudo-data based self-supervised federated
learning (FL) framework, named SSL-FL-BT, to improve both the diagnostic
accuracy and generalization of CAD models. Specifically, the pseudo
histopathological images are generated from each center, which contain
both inherent and specific properties corresponding to the real images
in this center, but do not include the privacy information. These pseudo
images are then shared in the central server for self-supervised
learning (SSL) to pre-train the backbone of the global model. A multi-task
SSL is then designed to effectively learn both the center-specific
information and common inherent representation according to the data
characteristics. Moreover, a novel Barlow Twins based FL (FL-BT)
algorithm is proposed to improve the local training for the CAD models
in each center by conducting model contrastive learning, which benefits
the optimization of the global model in the FL procedure. The
experimental results on four public histopathological image datasets
indicate the effectiveness of the proposed SSL-FL-BT on both diagnostic
accuracy and generalization.
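For orientation, the generic Barlow Twins objective that the FL-BT name refers to can be sketched as below. This shows only the standard redundancy-reduction loss on two batches of embeddings; how the paper applies it to model contrastive learning between local and global models is not stated in the abstract. The function name, the weight `lam`, and the toy embeddings are illustrative assumptions.

```python
def barlow_twins_loss(z_a, z_b, lam=0.005):
    """Generic Barlow Twins loss for two batches of embeddings.

    z_a, z_b: lists of n samples, each a list of d features.
    lam: off-diagonal weight (illustrative value, an assumption here).
    Assumes every feature has non-zero variance across the batch.
    """
    n = len(z_a)

    def standardize(z):
        # Return d columns, each standardized to zero mean and unit variance.
        columns = []
        for col in zip(*z):
            mean = sum(col) / n
            std = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
            columns.append([(v - mean) / std for v in col])
        return columns

    a, b = standardize(z_a), standardize(z_b)
    loss = 0.0
    for i, col_a in enumerate(a):
        for j, col_b in enumerate(b):
            # Entry (i, j) of the cross-correlation matrix between the views.
            c_ij = sum(x * y for x, y in zip(col_a, col_b)) / n
            if i == j:
                loss += (1.0 - c_ij) ** 2   # pull diagonal towards 1
            else:
                loss += lam * c_ij ** 2     # push off-diagonal towards 0
    return loss


# Toy embeddings invented for illustration: two uncorrelated features.
view_a = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
view_b = [[1.0, -1.0], [1.0, 1.0], [-1.0, -1.0], [-1.0, 1.0]]
loss_same = barlow_twins_loss(view_a, view_a)
loss_diff = barlow_twins_loss(view_a, view_b)
```

Identical views with decorrelated features give a loss of zero, while flipping the sign of one feature in the second view is penalized through the diagonal term.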
AU Xu, Kai
Lu, Shiyu
Huang, Bin
Wu, Weiwen
Liu, Qiegen
Stage-by-stage Wavelet Optimization Refinement Diffusion Model for
Sparse-View CT Reconstruction.
The diffusion model has emerged as a potential tool to tackle the challenge
of sparse-view CT reconstruction, displaying superior performance
compared to conventional methods. Nevertheless, these prevailing
diffusion models predominantly focus on the sinogram or image domains,
which can lead to instability during model training, potentially
culminating in convergence towards local minima. The wavelet
transform serves to disentangle image contents and features into
distinct frequency-component bands at varying scales, adeptly capturing
diverse directional structures. Employing the wavelet transform as a
guiding sparsity prior significantly enhances the robustness of
diffusion models. In this study, we present an innovative approach named
the Stage-by-stage Wavelet Optimization Refinement Diffusion (SWORD)
model for sparse-view CT reconstruction. Specifically, we establish a
unified mathematical model integrating low-frequency and high-frequency
generative models, achieving the solution with an optimization
procedure. Furthermore, we apply the low-frequency and high-frequency
generative models to the wavelet-decomposed components rather than the
original sinogram, ensuring the stability of model training. Our method
is rooted in established optimization theory, comprising three distinct
stages, including low-frequency generation, high-frequency refinement
and domain transform. The experimental results demonstrated that the
proposed method outperformed existing state-of-the-art methods both
quantitatively and qualitatively.
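The low-frequency/high-frequency split described above can be illustrated with a single-level 2D Haar transform. This is a minimal sketch only: the abstract does not state which wavelet SWORD uses, so the Haar choice, the function names, and the toy array are assumptions, and sub-band naming (LH versus HL) varies between libraries.

```python
def haar_1d(signal):
    """One level of the orthonormal Haar transform on an even-length list."""
    s = 2 ** -0.5
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_2d(image):
    """Split a 2D list (even height and width) into LL, LH, HL, HH sub-bands."""
    # Filter along rows: each row yields a low-pass and a high-pass half.
    row_low, row_high = [], []
    for row in image:
        approx, detail = haar_1d(row)
        row_low.append(approx)
        row_high.append(detail)

    def along_columns(block):
        # Filter along columns, then transpose back to row-major order.
        pairs = [haar_1d(list(col)) for col in zip(*block)]
        low = [list(r) for r in zip(*[p[0] for p in pairs])]
        high = [list(r) for r in zip(*[p[1] for p in pairs])]
        return low, high

    ll, lh = along_columns(row_low)    # LL: low-frequency approximation
    hl, hh = along_columns(row_high)   # LH, HL, HH: high-frequency details
    return ll, lh, hl, hh


def energy(block):
    return sum(v * v for row in block for v in row)


# Toy 4x4 array standing in for a sinogram patch (illustrative values only).
patch = [[1.0, 2.0, 3.0, 4.0],
         [5.0, 6.0, 7.0, 8.0],
         [9.0, 10.0, 11.0, 12.0],
         [13.0, 14.0, 15.0, 16.0]]
ll, lh, hl, hh = haar_2d(patch)
total_energy = sum(energy(band) for band in (ll, lh, hl, hh))
```

Because the transform is orthonormal, the four sub-bands together carry the same energy as the input, so separate models acting on the LL band and on the detail bands together cover all of the signal content.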
EI 1558-254X
DA 2024-01-20
UT MEDLINE:38236666
PM 38236666
ER
AU Acciavatti, Raymond J.
Choi, Chloe J.
Vent, Trevor L.
Barufaldi, Bruno
Cohen, Eric A.
Wileyto, E. Paul
Maidment, Andrew D. A.
Non-Isocentric Geometry for Next-Generation Tomosynthesis With
Super-Resolution
Our lab at the University of Pennsylvania (UPenn) is investigating novel
designs for digital breast tomosynthesis. We built a next-generation
tomosynthesis system with a non-isocentric geometry
(superior-to-inferior detector motion). This paper examines four metrics
of image quality affected by this design. First, aliasing was analyzed
in reconstructions prepared with smaller pixelation than the detector.
Aliasing was assessed with a theoretical model of r-factor, a metric
calculating amplitudes of alias signal relative to input signal in the
Fourier transform of the reconstruction of a sinusoidal object. Aliasing
was also assessed experimentally with a bar pattern (illustrating
spatial variations in aliasing) and a 360-degree star pattern
(illustrating directional anisotropies in aliasing). Second, the point
spread function (PSF) was modeled in the direction perpendicular to the
detector to assess out-of-plane blurring. Third, power spectra were
analyzed in an anthropomorphic phantom developed by UPenn and
manufactured by Computerized Imaging Reference Systems (CIRS), Inc.
(Norfolk, VA). Finally, calcifications were analyzed in the CIRS Model
020 BR3D Breast Imaging Phantom in terms of signal-to-noise ratio (SNR);
i.e., mean calcification signal relative to background-tissue noise.
Image quality was generally superior in the non-isocentric geometry:
Aliasing artifacts were suppressed in both theoretical and experimental
reconstructions prepared with smaller pixelation than the detector. PSF
width was also reduced at most positions. Anatomic noise was reduced.
Finally, SNR in calcification detection was improved. (A potential
trade-off of smaller-pixel reconstructions was reduced SNR; however, SNR
was still improved by the detector-motion acquisition.) In conclusion,
the non-isocentric geometry improved image quality in several ways.
AU Ban, Yutong
Eckhoff, Jennifer A.
Ward, Thomas M.
Hashimoto, Daniel A.
Meireles, Ozanan R.
Rus, Daniela
Rosman, Guy
Concept Graph Neural Networks for Surgical Video Understanding
Analysis of relations between objects and comprehension of abstract
concepts in the surgical video is important in AI-augmented surgery.
However, building models that integrate our knowledge and understanding
of surgery remains a challenging endeavor. In this paper, we propose a
novel way to integrate conceptual knowledge into temporal analysis tasks
using temporal concept graph networks. In the proposed networks, a
knowledge graph is incorporated into the temporal video analysis of
surgical notions, learning the meaning of concepts and relations as they
apply to the data. We demonstrate results in surgical video data for
tasks such as verification of the critical view of safety, estimation of
the Parkland grading scale as well as recognizing
instrument-action-tissue triplets. The results show that our method
improves the recognition and detection of complex benchmarks and
enables other analytic applications of interest.
AU Yang, Yanwu
Ye, Chenfei
Guo, Xutao
Wu, Tao
Xiang, Yang
Ma, Ting
Mapping Multi-Modal Brain Connectome for Brain Disorder Diagnosis via
Cross-Modal Mutual Learning
Recently, the study of the multi-modal brain connectome has grown
tremendously and facilitated the diagnosis of brain disorders. In
this paradigm, functional and structural networks, e.g., functional and
structural connectivity derived from fMRI and DTI, interact in some
manner but are not necessarily linearly related. Accordingly, it
remains a great challenge to leverage complementary information for
brain connectome analysis. Recently, graph neural networks (GNNs)
have been widely applied to the fusion of multi-modal brain connectomes.
However, most existing GNN methods fail to couple inter-modal
relationships. In this regard, we propose a Cross-modal Graph Neural
Network (Cross-GNN) that captures inter-modal dependencies through
dynamic graph learning and mutual learning. Specifically, the
inter-modal representations are attentively coupled into a compositional
space for reasoning inter-modal dependencies. Additionally, we
investigate mutual learning in explicit and implicit ways: (1)
Cross-modal representations are obtained by cross-embedding explicitly
based on the inter-modal correspondence matrix. (2) We propose a
cross-modal distillation method to implicitly regularize latent
representations with cross-modal semantic contexts. We carry out
statistical analysis on the attentively learned correspondence matrices
to evaluate inter-modal relationships for associating disease
biomarkers. Our extensive experiments on three datasets demonstrate the
superiority of our proposed method for disease diagnosis with promising
prediction performance and multi-modal connectome biomarker location.
AU Yue, Zheng
Jiang, Jiayao
Hou, Wenguang
Zhou, Quan
David Spence, J
Fenster, Aaron
Qiu, Wu
Ding, Mingyue
Prior-knowledge Embedded U-Net based Fully Automatic Vessel Wall Volume
Measurement of the Carotid Artery in 3D Ultrasound Image.
The vessel-wall-volume (VWV) measured based on three-dimensional (3D)
carotid artery (CA) ultrasound (US) images can help to assess carotid
atherosclerosis and manage patients at risk for stroke. Manual
measurement is subjective and requires well-trained
operators, and fully automatic measurement tools are not yet available.
Therefore, we propose a fully automatic VWV measurement framework
(Auto-VWV) using a CA prior-knowledge embedded U-Net (CAP-UNet) to
measure the VWV from 3D CA US images without manual intervention. The
Auto-VWV framework is designed to improve the consistency of repeated
VWV measurements, resulting in the first fully automatic framework for
VWV measurement. CAP-UNet is developed to improve segmentation accuracy
on the whole CA and is composed of a U-Net type backbone and three
additional prior-knowledge learning modules. Specifically, a continuity
learning module is used to learn the spatial continuity of the arteries
in a sequence of image slices. A voxel evolution learning module was
designed to learn the evolution of the artery in adjacent slices, and a
topology learning module was used to learn the unique topology of the
carotid artery. In two 3D CA US datasets, CAP-UNet architecture achieved
state-of-the-art performance compared to eight competing models.
Furthermore, CAP-UNet-based Auto-VWV achieved better accuracy and
consistency than Auto-VWV based on competing models in the simulated
repeated measurement. Finally, using 10 pairs of real repeatedly scanned
samples, Auto-VWV achieved better VWV measurement reproducibility than
intra- and inter-operator manual measurements.
AU Yang, Xiaoyu
Xu, Lijian
Yu, Simon
Xia, Qing
Li, Hongsheng
Zhang, Shaoting
Segmentation and Vascular Vectorization for Coronary Artery by
Geometry-based Cascaded Neural Network.
Segmentation of the coronary artery is an important task for the
quantitative analysis of coronary computed tomography angiography (CCTA)
images and has been advanced by progress in deep learning. However,
the complex structure of the coronary artery, with its tiny and narrow
branches, makes this task very challenging. Combined with the low
resolution and poor contrast of medical images, this frequently leads to
fragmented vessels in the predicted segmentation. Therefore, a
geometry-based cascaded segmentation method is proposed for the coronary
artery, which has the following innovations: 1) Integrating geometric
deformation networks, we design a cascaded network for segmenting the
coronary artery and vectorizing results. The generated meshes of the
coronary artery are continuous and accurate for twisted and
sophisticated coronary artery structures, without fragmentations. 2)
Different from mesh annotations generated by the traditional marching
cube method from voxel-based labels, a finer vectorized mesh of the
coronary artery is reconstructed with the regularized morphology. The
novel mesh annotation benefits the geometry-based segmentation network,
avoiding bifurcation adhesion and point cloud dispersion in intricate
branches. 3) A dataset named CCA-200 is collected, consisting of 200
CCTA images with coronary artery disease. The ground truths of 200 cases
are coronary internal diameter annotations by professional radiologists.
Extensive experiments verify our method on our collected CCA-200 dataset
and the public ASOCA dataset, with a Dice of 0.778 on CCA-200 and 0.895
on ASOCA, showing superior results. In particular, our geometry-based model
generates an accurate, intact and smooth coronary artery, devoid of any
fragmentations of segmented vessels.
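For reference, the Dice score quoted above measures the overlap between a predicted mask P and a ground-truth mask G as 2|P ∩ G| / (|P| + |G|). The following is a minimal sketch in plain Python, assuming binary masks flattened to lists of 0/1. The function name `dice_score` and the toy masks are illustrative and do not come from the paper.

```python
def dice_score(pred, truth):
    """Dice coefficient between two equal-length binary masks (lists of 0/1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: 3 foreground voxels in each mask, 2 of them shared.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
print(dice_score(pred, truth))
```

In this toy case the two masks share two of their three foreground voxels each, so the score is 2·2 / (3 + 3) = 2/3.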
AU Li, Zirong
Chang, Dingyue
Zhang, Zhenxi
Luo, Fulin
Liu, Qiegen
Zhang, Jianjia
Yang, Guang
Wu, Weiwen
Dual-domain Collaborative Diffusion Sampling for Multi-Source Stationary
Computed Tomography Reconstruction.
The multi-source stationary CT, where both the detector and X-ray source
are fixed, represents a novel imaging system with high temporal
resolution that has garnered significant interest. Limited space within
the system restricts the number of X-ray sources, leading to sparse-view
CT imaging challenges. Recent diffusion models for reconstructing
sparse-view CT have generally focused separately on sinogram or image
domains. Sinogram-centric models effectively estimate missing
projections but may introduce artifacts, lacking mechanisms to ensure
image correctness. Conversely, image-domain models, while capturing
detailed image features, often struggle with complex data distribution,
leading to inaccuracies in projections. Addressing these issues, the
Dual-domain Collaborative Diffusion Sampling (DCDS) model integrates
sinogram and image domain diffusion processes for enhanced sparse-view
reconstruction. This model combines the strengths of both domains in an
optimized mathematical framework. A collaborative diffusion mechanism
underpins this model, improving sinogram recovery and image generative
capabilities. This mechanism facilitates feedback-driven image
generation from the sinogram domain and uses image domain results to
complete missing projections. Optimization of the DCDS model is further
achieved through the alternating direction iteration method, focusing on
data consistency updates. Extensive testing, including numerical
simulations, real phantoms, and clinical cardiac datasets, demonstrates
the DCDS model's effectiveness. It consistently outperforms various
state-of-the-art benchmarks, delivering exceptional reconstruction
quality and precise sinograms.
AU Chen, Wenting
Liu, Jie
Chow, Tommy W S
Yuan, Yixuan
STAR-RL: Spatial-temporal Hierarchical Reinforcement Learning for
Interpretable Pathology Image Super-Resolution.
Pathology images are essential for accurately interpreting lesion cells
in cytopathology screening, but acquiring high-resolution digital slides
requires specialized equipment and long scanning times. Although
super-resolution (SR) techniques can alleviate this problem, existing
deep learning models recover pathology images in a black-box manner,
which can lead to untruthful biological details and misdiagnosis.
Additionally, current methods allocate the same computational resources
to recover every pixel of a pathology image, leading to sub-optimal
recovery because pathology images vary widely. In this
paper, we propose the first hierarchical reinforcement learning
framework named Spatial-Temporal hierARchical Reinforcement Learning
(STAR-RL), mainly to address the aforementioned issues in pathology
image super-resolution. We reformulate the SR problem as a
Markov decision process of interpretable operations and adopt a
hierarchical recovery mechanism at the patch level to avoid sub-optimal
recovery. Specifically, a higher-level spatial manager is proposed to
pick out the most corrupted patch for the lower-level patch worker.
Moreover, a higher-level temporal manager is introduced to evaluate the
selected patch and determine whether the optimization should be stopped
early, thereby avoiding over-processing. Under the guidance
of spatial-temporal managers, the lower-level patch worker processes the
selected patch with pixel-wise interpretable actions at each time step.
Experimental results on medical images degraded by different kernels
show the effectiveness of STAR-RL. Furthermore, STAR-RL improves
tumor diagnosis by a large margin and shows
generalizability under various degradations. The source code is to be
released.
AU Chen, Zhihao
Niu, Chuang
Gao, Qi
Wang, Ge
Shan, Hongming
LIT-Former: Linking In-Plane and Through-Plane Transformers for
Simultaneous CT Image Denoising and Deblurring
This paper studies 3D low-dose computed tomography (CT) imaging.
Although various deep learning methods have been developed in this
context, they typically focus on 2D images and treat denoising of
low-dose images and deblurring for super-resolution separately. To date,
little work has been done on simultaneous in-plane denoising and
through-plane deblurring, which is important for obtaining high-quality
3D CT images with lower radiation and faster imaging speed. For this
task, a straightforward method is to directly train an end-to-end 3D
network. However, this demands much more training data and incurs high
computational cost. Here, we propose to link in-plane and through-plane
transformers for simultaneous in-plane denoising and through-plane
deblurring in a model termed LIT-Former, which can efficiently synergize
in-plane and through-plane sub-tasks for 3D CT imaging and enjoys the
advantages of both convolution and transformer networks. LIT-Former has two novel
designs: efficient multi-head self-attention modules (eMSM) and
efficient convolutional feed-forward networks (eCFN). First, eMSM
integrates in-plane 2D self-attention and through-plane 1D
self-attention to efficiently capture global interactions of 3D
self-attention, the core unit of transformer networks. Second, eCFN
integrates 2D convolution and 1D convolution to extract local
information of 3D convolution in the same fashion. As a result, the
proposed LIT-Former synergizes these two sub-tasks, significantly
reducing the computational complexity as compared to 3D counterparts and
enabling rapid convergence. Extensive experimental results on simulated
and clinical datasets demonstrate superior performance over
state-of-the-art models. The source code is made available at
https://github.com/hao1635/LIT-Former.
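The efficiency argument behind replacing a 3D convolution with a 2D in-plane convolution plus a 1D through-plane convolution can be illustrated with a back-of-the-envelope weight count. The sketch below is not the eCFN implementation. It assumes a k x k x k kernel is replaced by a 1 x k x k stage followed by a k x 1 x 1 stage, with bias terms ignored, and the function names and channel layout are hypothetical.

```python
def conv3d_weights(c_in, c_out, k):
    """Number of weights in a full k x k x k 3D convolution (bias ignored)."""
    return c_in * c_out * k ** 3


def factorized_weights(c_in, c_out, k):
    """Weights in a 1 x k x k in-plane conv plus a k x 1 x 1 through-plane conv.

    Assumption: the in-plane stage maps c_in -> c_out channels and the
    through-plane stage maps c_out -> c_out; the actual eCFN layout may differ.
    """
    in_plane = c_in * c_out * k ** 2
    through_plane = c_out * c_out * k
    return in_plane + through_plane


full = conv3d_weights(64, 64, 3)
split = factorized_weights(64, 64, 3)
print(full, split, full / split)
```

With 64 input and output channels and k = 3, the full kernel needs 27 weights per channel pair while the factorized pair needs 9 + 3 = 12, a 2.25x reduction under these assumptions.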
AU Liu, Pan
Huang, Gao
Jing, Jing
Bian, Suyan
Cheng, Liuquan
Lu, Xin Yang
Rao, Chongyou
Liu, Yu
Hua, Yun
Wang, Yongjun
He, Kunlun
An Energy Matching Vessel Segmentation Framework in 3-D Medical Images
Accurate vascular segmentation from High Resolution 3-Dimensional (HR3D)
medical scans is crucial for clinicians to visualize complex vasculature
and diagnose related vascular diseases. However, a reliable and scalable
vessel segmentation framework for HR3D scans remains a challenge. In
this work, we propose a High-resolution Energy-matching Segmentation
(HrEmS) framework that utilizes deep learning to directly process the
entire HR3D scan and segment the vasculature to the finest level. The
HrEmS framework introduces two novel components. Firstly, it uses the
real-order total variation operator to construct a new loss function
that guides the segmentation network to obtain the correct topology
structure by matching the energy of the predicted segment to the energy
of the manual label. This differs from traditional loss functions
such as the Dice loss, which match pixels between the predicted segment
and the manual label. Secondly, a curvature-based weight-correction module
is developed, which directs the network to focus on crucial and complex
structural parts of the vasculature instead of the easy parts. The
proposed HrEmS framework was tested on three in-house multi-center
datasets and three public datasets, and demonstrated improved results in
comparison with the state-of-the-art methods using both
topology-relevant and volumetric-relevant metrics. Furthermore, a
double-blind assessment by three experienced radiologists on the
critical points of the clinical diagnostic processes provided additional
evidence of the superiority of the HrEmS framework.
AU Miao, Juzheng
Zhou, Si-Ping
Zhou, Guang-Quan
Wang, Kai-Ni
Yang, Meng
Zhou, Shoujun
Chen, Yang
SC-SSL: Self-Correcting Collaborative and Contrastive Co-Training Model
for Semi-Supervised Medical Image Segmentation
Image segmentation has improved significantly with deep neural
networks, provided that a large amount of labeled training data is
available, which is laborious to obtain in medical imaging tasks. Recently, semi-supervised
learning (SSL) has shown great potential in medical image segmentation.
However, the influence of the learning target quality for unlabeled data
is usually neglected in these SSL methods. Therefore, this study
proposes a novel self-correcting co-training scheme to learn a better
target that is more similar to ground-truth labels from collaborative
network outputs. Our work has three highlights. First, we treat
learning-target generation as a learning task, improving the
learning confidence for unannotated data with a self-correcting module.
Second, we impose a structure constraint to further encourage shape
similarity between the improved learning target and the
collaborative network outputs. Finally, we propose an innovative
pixel-wise contrastive learning loss to boost the representation
capacity under the guidance of an improved learning target, thus
exploring unlabeled data more efficiently with the awareness of semantic
context. We have extensively evaluated our method against
state-of-the-art semi-supervised approaches on four publicly available
datasets, including the ACDC dataset, M&Ms dataset, Pancreas-CT dataset,
and Task_07 CT dataset. The experimental results with different
labeled-data ratios show our proposed method's superiority over other
existing methods, demonstrating its effectiveness in semi-supervised
medical image segmentation.
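The abstract does not specify the pixel-wise contrastive loss. As a generic illustration only, a common choice for contrastive objectives is the InfoNCE form, which pulls an anchor embedding toward a positive (for example, a pixel sharing the same learning-target class) and pushes it away from negatives. The sketch below assumes cosine similarity and a temperature of 0.1, and the name `info_nce` and all inputs are hypothetical.

```python
import math


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor embedding, one positive and several negatives."""

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm

    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    # Low when the anchor is much closer to the positive than to any negative.
    return -math.log(pos / (pos + neg))


aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[-1.0, 0.0]])
swapped = info_nce([1.0, 0.0], [-1.0, 0.0], [[1.0, 0.0]])
print(aligned, swapped)
```

The loss is close to zero when the positive is aligned with the anchor and grows large when the positive and negative are swapped.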
AU Liu, Jicheng
Liu, Hui
Fu, Huazhu
Ye, Yu
Chen, Kun
Lu, Yu
Mao, Jianbo
Xu, Ronald X.
Sun, Mingzhai
Edge-Guided Contrastive Adaptation Network for Arteriovenous Nicking
Classification Using Synthetic Data
Retinal arteriovenous nicking (AVN) manifests as a reduced venular
caliber at an arteriovenous crossing. AVNs are signs of many systemic
diseases, particularly cardiovascular diseases. Studies have shown that people
with AVN are twice as likely to have a stroke. However, AVN
classification faces two challenges. One is the lack of data, especially
for AVNs compared with normal arteriovenous (AV) crossings. The other is
the significant intra-class variations and minute inter-class
differences. AVNs may look different in shape, scale, pose, and color.
On the other hand, the AVN could be different from the normal AV
crossing only by slight thinning of the vein. To address these
challenges, first, we develop a data synthesis method to generate AV
crossings, including both normal crossings and AVNs. Second, to mitigate the domain
shift between the synthetic and real data, an edge-guided unsupervised
domain adaptation network is designed to guide the transfer of domain
invariant information. Third, a semantic contrastive learning branch
(SCLB) is introduced and a set of semantically related images, as a
semantic triplet, are input to the network simultaneously to guide the
network to focus on the subtle differences in venular width and to
ignore the differences in appearance. These strategies effectively
mitigate the lack of data, the domain shift between synthetic and real data,
and the combination of significant intra-class variation with minute inter-class differences. Extensive
experiments have been performed to demonstrate the outstanding
performance of the proposed method.
AU Spieker, Veronika
Eichhorn, Hannah
Hammernik, Kerstin
Rueckert, Daniel
Preibisch, Christine
Karampinos, Dimitrios C.
Schnabel, Julia A.
Deep Learning for Retrospective Motion Correction in MRI: A
Comprehensive Review
Motion represents one of the major challenges in magnetic resonance
imaging (MRI). Since the MR signal is acquired in frequency space, any
motion of the imaged object leads to complex artefacts in the
reconstructed image in addition to other MR imaging artefacts. Deep
learning has been frequently proposed for motion correction at several
stages of the reconstruction process. The wide range of MR acquisition
sequences, anatomies and pathologies of interest, and motion patterns
(rigid vs. deformable and random vs. regular) makes a comprehensive
solution unlikely. To facilitate the transfer of ideas between different
applications, this review provides a detailed overview of proposed
methods for learning-based motion correction in MRI together with their
common challenges and potentials. This review identifies differences and
synergies in underlying data usage, architectures, training and
evaluation strategies. We critically discuss general trends and outline
future directions, with the aim of enhancing interaction between different
application areas and research fields.
AU Cui, Zhuo-Xu
Liu, Congcong
Fan, Xiaohong
Cao, Chentao
Cheng, Jing
Zhu, Qingyong
Liu, Yuanyuan
Jia, Sen
Wang, Haifeng
Zhu, Yanjie
Zhou, Yihang
Zhang, Jianping
Liu, Qiegen
Liang, Dong
Physics-Informed DeepMRI: k-Space Interpolation Meets Heat Diffusion.
Recently, diffusion models have shown considerable promise for MRI
reconstruction. However, extensive experimentation has revealed that
these models are prone to generating artifacts due to the inherent
randomness involved in generating images from pure noise. To achieve
more controlled image reconstruction, we reexamine the concept of
interpolatable physical priors in k-space data, focusing specifically on
the interpolation of high-frequency (HF) k-space data from low-frequency
(LF) k-space data. Broadly, this insight drives a shift in the
generation paradigm from random noise to a more deterministic approach
grounded in the existing LF k-space data. Building on this, we first
establish a relationship between the interpolation of HF k-space data
from LF k-space data and the reverse heat diffusion process, providing a
fundamental framework for designing diffusion models that generate
missing HF data. To further improve reconstruction accuracy, we
integrate a traditional physics-informed k-space interpolation model
into our diffusion framework as a data fidelity term. Experimental
validation using publicly available datasets demonstrates that our
approach significantly surpasses traditional k-space interpolation
methods, deep learning-based k-space interpolation techniques, and
conventional diffusion models, particularly in HF regions. Finally, we
assess the generalization performance of our model across various
out-of-distribution datasets. Our code is available at
https://github.com/ZhuoxuCui/Heat-Diffusion.
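Why heat diffusion is a natural model for losing and recovering HF k-space data can be seen from the textbook solution of the heat equation in the Fourier domain. The derivation below is a standard result, not the paper's formulation. The symbols u (image), t (diffusion time) and k (k-space coordinate), and the Fourier convention \hat{u}(k) = \int u(x) e^{-2\pi i k \cdot x} dx, are assumptions made here for illustration.

```latex
\frac{\partial u}{\partial t}(x,t) = \Delta u(x,t)
\quad\Longrightarrow\quad
\frac{\partial \hat{u}}{\partial t}(k,t) = -4\pi^{2}\lVert k\rVert^{2}\,\hat{u}(k,t)
\quad\Longrightarrow\quad
\hat{u}(k,t) = e^{-4\pi^{2}\lVert k\rVert^{2} t}\,\hat{u}(k,0).
```

Forward diffusion therefore damps each k-space sample at a rate that grows with \lVert k\rVert^{2}: HF samples decay quickly while LF samples are almost unchanged. Running the process in reverse corresponds to restoring HF content from data dominated by LF content.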
AU Jiang, Yikun
Pei, Yuru
Xu, Tianmin
Yuan, Xiaoru
Zha, Hongbin
Towards Semantically-Consistent Deformable 2D-3D Registration for 3D
Craniofacial Structure Estimation from A Single-View Lateral
Cephalometric Radiograph.
Deep neural networks combined with statistical shape models have
enabled efficient deformable 2D-3D registration and recovery of 3D
anatomical structures from a single radiograph. However, the recovered
volumetric image tends to lack the volumetric fidelity of fine-grained
anatomical structures and explicit consideration of cross-dimensional
semantic correspondence. In this paper, we introduce a simple but
effective solution for semantically-consistent deformable 2D-3D
registration and detailed volumetric image recovery by inferring a
voxel-wise registration field between the cone-beam computed tomography
and a single lateral cephalometric radiograph (LC). The key idea is to
refine the initial statistical model-based registration field with
craniofacial structural details and semantic consistency from the LC.
Specifically, our framework employs a self-supervised scheme to learn a
voxel-level refiner of registration fields to provide fine-grained
craniofacial structural details and volumetric fidelity. We also present
a weakly supervised semantic consistency measure for semantic
correspondence, relieving the requirements of volumetric image
collections and annotations. Experiments showcase that our method
achieves deformable 2D-3D registration with performance gains over
state-of-the-art registration and radiograph-based volumetric
reconstruction methods. The source code is available at
https://github.com/Jyk-122/SC-DREG.
AU Zhang, Jianjia
Mao, Haiyang
Chang, Dingyue
Yu, Hengyong
Wu, Weiwen
Shen, Dinggang
Adaptive and Iterative Learning With Multi-Perspective Regularizations
for Metal Artifact Reduction
Metal artifact reduction (MAR) is important for clinical diagnosis with
CT images. The existing state-of-the-art deep learning methods usually
suppress metal artifacts in sinogram or image domains or both. However,
their performance is limited by the inherent characteristics of the two
domains, i.e., the errors introduced by local manipulations in the
sinogram domain would propagate throughout the whole image during
backprojection and lead to serious secondary artifacts, while it is
difficult to distinguish artifacts from actual image features in the
image domain. To alleviate these limitations, this study analyzes in
depth the desirable properties of the wavelet transform and proposes to
perform MAR in the wavelet domain. First, the wavelet transform yields
components that possess spatial correspondence with the image, thereby
preventing the spread of local errors and avoiding secondary artifacts.
Second, the wavelet transform makes it easier to separate artifacts
from image content, since metal artifacts are mainly high-frequency
signals. Taking advantage of these properties, this paper
decomposes an image into multiple wavelet components and introduces
multi-perspective regularizations into the proposed MAR model. To
improve the transparency and validity of the model, all the modules in
the proposed MAR model are designed to reflect their mathematical
meanings. In addition, an adaptive wavelet module is utilized to
enhance the flexibility of the model. To optimize the model, an
iterative algorithm is developed. The evaluation on both synthetic and
real clinical datasets consistently confirms the superior performance of
the proposed method over the competing methods.
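The two wavelet properties cited above, spatial correspondence and concentration of sharp features in the high-frequency components, can be checked on a toy image. The sketch below uses a one-level 2D Haar transform written in plain Python. It does not reproduce the paper's adaptive wavelet module. The function name `haar2d`, the subband labels (naming conventions differ between libraries) and the 4 x 4 test image are illustrative assumptions.

```python
def haar2d(image):
    """One-level orthonormal 2D Haar transform (even height and width assumed).

    Returns four half-size subbands: LL (local averages) and three detail
    subbands holding left/right, top/bottom and diagonal differences.
    """
    rows, cols = len(image), len(image[0])

    def zeros():
        return [[0.0] * (cols // 2) for _ in range(rows // 2)]

    ll, lh, hl, hh = zeros(), zeros(), zeros(), zeros()
    for i in range(0, rows, 2):
        for j in range(0, cols, 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            r, s = i // 2, j // 2
            ll[r][s] = (a + b + c + d) / 2  # low-frequency average
            lh[r][s] = (a - b + c - d) / 2  # left minus right columns
            hl[r][s] = (a + b - c - d) / 2  # top minus bottom rows
            hh[r][s] = (a - b - c + d) / 2  # diagonal difference
    return ll, lh, hl, hh


# Flat image with one bright pixel standing in for a local sharp artifact.
image = [[10.0] * 4 for _ in range(4)]
image[0][0] = 14.0
ll, lh, hl, hh = haar2d(image)
print(lh)
```

Only the coefficients of the 2 x 2 block containing the bright pixel change: the detail subbands are non-zero at position (0, 0) and zero elsewhere, so a local disturbance stays local in the wavelet domain.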
AU Kyung, Sunggu
Won, Jongjun
Pak, Seongyong
Kim, Sunwoo
Lee, Sangyoon
Park, Kanggil
Hong, Gil-Sun
Kim, Namkug
Generative Adversarial Network with Robust Discriminator Through
Multi-Task Learning for Low-Dose CT Denoising.
Reducing the dose of radiation in computed tomography (CT) is vital to
decreasing secondary cancer risk. However, the use of low-dose CT (LDCT)
images is accompanied by increased noise that can negatively impact
diagnoses. Although numerous deep learning algorithms have been
developed for LDCT denoising, several challenges persist, including the
visual incongruence experienced by radiologists, unsatisfactory
performances across various metrics, and insufficient exploration of the
networks' robustness in other CT domains. To address such issues, this
study proposes three novel contributions. First, we propose a generative
adversarial network (GAN) with a robust discriminator through multi-task
learning that simultaneously performs three vision tasks: restoration,
image-level, and pixel-level decisions. The more tasks that are
performed, the better the denoising performance of the generator, which
means multi-task learning enables the discriminator to provide more
meaningful feedback to the generator. Second, two regulatory mechanisms,
restoration consistency (RC) and non-difference suppression (NDS), are
introduced to improve the discriminator's representation capabilities.
These mechanisms eliminate irrelevant regions and compare the
discriminator's results from the input and restoration, thus
facilitating effective GAN training. Lastly, we incorporate residual
fast Fourier transforms with convolution (Res-FFT-Conv) blocks into the
generator to utilize both frequency and spatial representations. This
approach provides mixed receptive fields by using spatial (or local),
spectral (or global), and residual connections. Our model was evaluated
using various pixel- and feature-space metrics in two denoising tasks.
Additionally, we conducted visual scoring with radiologists. The results
indicate superior performance in both quantitative and qualitative
measures compared to state-of-the-art denoising techniques.
AU Luo, Mengting
Zhou, Nan
Wang, Tao
He, Linchao
Wang, Wang
Chen, Hu
Liao, Peixi
Zhang, Yi
Bi-Constraints Diffusion: A Conditional Diffusion Model with Degradation
Guidance for Metal Artifact Reduction.
In recent years, score-based diffusion models have emerged as effective
tools for estimating score functions from empirical data distributions,
particularly in integrating implicit priors with inverse problems like
CT reconstruction. However, score-based diffusion models are rarely
explored in challenging tasks such as metal artifact reduction (MAR). In
this paper, we introduce the BiConstraints Diffusion Model for Metal
Artifact Reduction (BCDMAR), an innovative approach that enhances
iterative reconstruction with a conditional diffusion model for MAR.
This method employs a metal artifact degradation operator in place of
the traditional metal-excluded projection operator in the data-fidelity
term, thereby preserving structural details around metal regions.
However, score-based diffusion models tend to be susceptible to grayscale
shifts and unreliable structures, making it challenging to reach an
optimal solution. To address this, we utilize a precorrected image as a
prior constraint, guiding the generation of the score-based diffusion
model. By iteratively applying the score-based diffusion model and the
data-fidelity step in each sampling iteration, BCDMAR effectively
maintains reliable tissue representation around metal regions and
produces highly consistent structures in non-metal regions. Through
extensive experiments focused on metal artifact reduction tasks, BCDMAR
demonstrates superior performance over other state-of-the-art
unsupervised and supervised methods, both quantitatively and in terms of
visual results.
AU Yan, Siyuan
Yu, Zhen
Liu, Chi
Ju, Lie
Mahapatra, Dwarikanath
Betz-Stablein, Brigid
Mar, Victoria
Janda, Monika
Soyer, Peter
Ge, Zongyuan
Prompt-driven Latent Domain Generalization for Medical Image
Classification.
Deep learning models for medical image analysis easily suffer from
distribution shifts caused by dataset artifact bias, camera variations,
differences in the imaging station, etc., leading to unreliable
diagnoses in real-world clinical settings. Domain generalization (DG)
methods, which aim to train models on multiple domains to perform well
on unseen domains, offer a promising direction to solve the problem.
However, existing DG methods assume domain labels of each image are
available and accurate, which is typically feasible for only a limited
number of medical datasets. To address these challenges, we propose a
unified DG framework for medical image classification without relying on
domain labels, called Prompt-driven Latent Domain Generalization (PLDG).
PLDG consists of unsupervised domain discovery and prompt learning. This
framework first discovers pseudo domain labels by clustering the
bias-associated style features, then leverages collaborative domain
prompts to guide a Vision Transformer to learn knowledge from discovered
diverse domains. To facilitate cross-domain knowledge learning between
different prompts, we introduce a domain prompt generator that enables
knowledge sharing between domain prompts and a shared prompt. A domain
mixup strategy is additionally employed to obtain more flexible decision
margins and to mitigate the risk of incorrect domain assignments.
Extensive experiments on three medical image classification tasks and
one debiasing task demonstrate that our method can achieve performance
comparable or even superior to that of conventional DG algorithms
without relying on domain labels. Our code is publicly available at
https://github.com/SiyuanYan1/PLDG/tree/main.
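As an illustration of the domain mixup idea mentioned in the abstract, the following minimal Python sketch interpolates two samples drawn from different pseudo domains. It assumes the standard mixup formulation with a Beta-distributed mixing coefficient; the function name `domain_mixup`, the parameter `alpha`, and the toy vectors are hypothetical and are not taken from the PLDG code.

```python
import random


def domain_mixup(x_a, x_b, y_a, y_b, alpha=0.4, rng=None):
    """Convexly combine two samples and their soft labels (generic mixup).

    x_a, x_b: feature vectors from two different (pseudo) domains.
    y_a, y_b: one-hot or soft label vectors of equal length.
    alpha:    Beta(alpha, alpha) concentration; an assumed hyperparameter.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


if __name__ == "__main__":
    x, y, lam = domain_mixup([1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0])
    print(lam, x, y)
```

Because the mixed label is a convex combination, a sample assigned to the wrong pseudo domain contributes only a fraction of its weight, which is one plausible reading of how mixing can soften incorrect domain assignments.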
AU Wang, Yuyang
Liu, Xiaomo
Li, Liang
Metal Artifacts Reducing Method Based on Diffusion Model Using Intraoral
Optical Scanning Data for Dental Cone-beam CT.
In dental cone-beam computed tomography (CBCT), metal implants can cause
metal artifacts, affecting image quality and the final medical
diagnosis. To reduce the impact of metal artifacts, our proposed metal
artifacts reduction (MAR) method takes a novel approach by integrating
CBCT data with intraoral optical scanning data, utilizing information
from these two different modalities to correct metal artifacts in the
projection domain using a guided-diffusion model. The intraoral optical
scanning data provides a more accurate generation domain for the
diffusion model. We have proposed a multi-channel generation method in
the training and generation stage of the diffusion model, considering
the physical mechanism of CBCT, to ensure the consistency of the
diffusion model generation. In this paper, we present experimental
results that convincingly demonstrate the feasibility and efficacy of
our approach, which introduces intraoral optical scanning data into the
analysis and processing of projection domain data using the diffusion
model for the first time, and modifies the diffusion model to better
adapt to the physical model of CBCT.
AU Jiang, Xiajun
Missel, Ryan
Toloubidokhti, Maryam
Gillette, Karli
Prassl, Anton J.
Plank, Gernot
Horacek, B. Milan
Sapp, John L.
Wang, Linwei
Hybrid Neural State-Space Modeling for Supervised and Unsupervised
Electrocardiographic Imaging
State-space modeling (SSM) provides a general framework for many image
reconstruction tasks. Errors in a priori physiological knowledge of the
imaging physics can lead to incorrect solutions. Modern
deep-learning approaches show great promise but lack interpretability
and rely on large amounts of labeled data. In this paper, we present a
novel hybrid SSM framework for electrocardiographic imaging (ECGI) to
leverage the advantage of state-space formulations in data-driven
learning. We first leverage the physics-based forward operator to
supervise the learning. We then introduce neural modeling of the
transition function and the associated Bayesian filtering strategy. We
applied the hybrid SSM framework to reconstruct electrical activity on
the heart surface from body-surface potentials. In unsupervised settings
of both in-silico and in-vivo data without cardiac electrical activity
as the ground truth to supervise the learning, we demonstrated improved
ECGI performance of the hybrid SSM framework trained from a small
number of ECG observations in comparison to the fixed SSM. We further
demonstrated that, when in-silico simulation data becomes available,
mixed supervised and unsupervised training of the hybrid SSM achieved
further improvements of 40.6% and 45.6%, respectively, in comparison to
traditional ECGI baselines and supervised data-driven ECGI baselines for
localizing the origin of ventricular activations in real data.
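For readers unfamiliar with Bayesian filtering in a state-space model, the sketch below shows the classical linear-Gaussian case (a scalar Kalman filter) in Python. It is only a reference point for the fixed SSM mentioned in the abstract: the hybrid framework replaces the transition function with a neural model, which is not reproduced here. All names (`kalman_filter_1d`, `a`, `h`, `q`, `r`) are illustrative assumptions.

```python
def kalman_filter_1d(observations, a, h, q, r, m0=0.0, p0=1.0):
    """Scalar Kalman filter for x_t = a*x_{t-1} + w_t, y_t = h*x_t + v_t.

    q and r are the variances of the process noise w_t and the
    measurement noise v_t; m0 and p0 are the prior mean and variance.
    Returns the filtered means and variances for every time step.
    """
    m, p = m0, p0
    means, variances = [], []
    for y in observations:
        # Prediction step: propagate the state through the transition model.
        m_pred = a * m
        p_pred = a * p * a + q
        # Update step: correct the prediction with the new observation.
        s = h * p_pred * h + r          # innovation variance
        k = p_pred * h / s              # Kalman gain
        m = m_pred + k * (y - h * m_pred)
        p = (1.0 - k * h) * p_pred
        means.append(m)
        variances.append(p)
    return means, variances


if __name__ == "__main__":
    est, var = kalman_filter_1d([1.1, 0.9, 1.05], a=1.0, h=1.0, q=0.01, r=0.1)
    print(est, var)
```

In ECGI terms, `h` plays the role of the physics-based forward operator mapping heart-surface activity to body-surface potentials, here reduced to a single scalar for readability.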
AU Zeng, Qingjie
Xie, Yutong
Lu, Zilin
Lu, Mengkang
Zhang, Jingfeng
Zhou, Yuyin
Xia, Yong
Consistency-guided Differential Decoding for Enhancing Semi-supervised
Medical Image Segmentation.
Semi-supervised learning (SSL) has been proven beneficial for mitigating
the issue of limited labeled data, especially on volumetric medical
image segmentation. Unlike previous SSL methods which focus on exploring
highly confident pseudo-labels or developing consistency regularization
schemes, our empirical findings suggest that differential decoder
features emerge naturally when two decoders strive to generate
consistent predictions. Based on the observation, we first analyze the
treasure of discrepancy in learning towards consistency, under both
pseudo-labeling and consistency regularization settings, and
subsequently propose a novel SSL method called LeFeD, which learns the
feature-level discrepancies obtained from two decoders, by feeding such
information as feedback signals to the encoder. The core design of LeFeD
is to enlarge the discrepancies by training differential decoders, and
then learn from the differential features iteratively. We evaluate LeFeD
against eight state-of-the-art (SOTA) methods on three public datasets.
Experiments show that LeFeD surpasses competitors without any bells and
whistles, such as uncertainty estimation and strong constraints, and
sets a new state of the art for semi-supervised medical image
segmentation. Code has been released at
https://github.com/maxwell0027/LeFeD.
AU Li, Jun
Su, Tongkun
Zhao, Baoliang
Lv, Faqin
Wang, Qiong
Navab, Nassir
Hu, Ying
Jiang, Zhongliang
Ultrasound Report Generation with Cross-Modality Feature Alignment via
Unsupervised Guidance.
Automatic report generation has arisen as a significant research area in
computer-aided diagnosis, aiming to alleviate the burden on clinicians
by generating reports automatically based on medical images. In this
work, we propose a novel framework for automatic ultrasound report
generation, leveraging a combination of unsupervised and supervised
learning methods to aid the report generation process. Our framework
incorporates unsupervised learning methods to extract potential
knowledge from ultrasound text reports, serving as the prior information
to guide the model in aligning visual and textual features, thereby
addressing the challenge of feature discrepancy. Additionally, we design
a global semantic comparison mechanism to enhance the performance of
generating more comprehensive and accurate medical reports. To enable
the implementation of ultrasound report generation, we constructed three
large-scale ultrasound image-text datasets from different organs for
training and validation purposes. Extensive comparisons with other
state-of-the-art approaches demonstrate its superior performance across
all three datasets. Code and dataset are available at this link.
AU Zhang, Dong
Liu, Xiujian
Wang, Anbang
Zhang, Hongwei
Yang, Guang
Zhang, Heye
Gao, Zhifan
Constraint-Aware Learning for Fractional Flow Reserve Pullback Curve
Estimation from Invasive Coronary Imaging.
Estimation of the fractional flow reserve (FFR) pullback curve from
invasive coronary imaging is important for the intraoperative guidance
of coronary intervention. Machine/deep learning has been proven
effective in FFR pullback curve estimation. However, the existing
methods suffer from inadequate incorporation of intrinsic geometry
associations and physics knowledge. In this paper, we propose a
constraint-aware learning framework to improve the estimation of the FFR
pullback curve from invasive coronary imaging. It incorporates both
geometrical and physical constraints to approximate the relationships
between the geometric structure and FFR values along the coronary artery
centerline. Our method also leverages the power of synthetic data in
model training to reduce the collection costs of clinical data.
Moreover, to bridge the domain gap between synthetic and real data
distributions when testing on real-world imaging data, we also employ a
diffusion-driven test-time data adaptation method that preserves the
knowledge learned in synthetic data. Specifically, this method learns a
diffusion model of the synthetic data distribution and then projects
real data to the synthetic data distribution at test time. Extensive
experimental studies on a synthetic dataset and a real-world dataset of
382 patients covering three imaging modalities have shown the superior
performance of our method for FFR estimation of stenotic coronary
arteries, compared with other machine/deep learning-based FFR estimation
models and a computational fluid dynamics-based model. The results also
show high agreement and correlation between the FFR predictions of
our method and the invasively measured FFR values. The plausibility of
FFR predictions along the coronary artery centerline is also validated.
AU Shao, Wei
Shi, Hang
Liu, Jianxin
Zuo, Yingli
Sun, Liang
Xia, Tiansong
Chen, Wanyuan
Wan, Peng
Sheng, Jianpeng
Zhu, Qi
Zhang, Daoqiang
Multi-Instance Multi-Task Learning for Joint Clinical Outcome and
Genomic Profile Predictions From the Histopathological Images
With the remarkable success of digital histopathology and deep
learning technology, many deep learning models based on whole-slide
pathological images (WSIs) have been designed to help pathologists
diagnose human cancers. Recently, rather than predicting categorical
variables as in cancer diagnosis, several deep learning studies have
also been proposed to estimate continuous variables such as patients'
survival or their transcriptional profile. However, most existing
studies conduct these prediction tasks separately, which overlooks the
useful intrinsic correlation among them that can boost the prediction
performance of each individual task. In addition, it is still
challenging to design WSI-based deep learning models, since a WSI is
huge in size but annotated with only a coarse label. In this study, we
propose a general multi-instance multi-task learning framework
(HistMIMT) for multi-purpose prediction from WSIs. Specifically, we
first propose a novel multi-instance learning module (TMICS) that
considers both common and task-specific information across different
tasks to generate a bag representation for each individual task. Then, a
soft-mask-based fusion module with channel attention (SFCA) is developed
to leverage useful information from related tasks to help improve
the prediction performance on the target task. We evaluate our method on
three cancer cohorts derived from The Cancer Genome Atlas (TCGA). For
each cohort, our multi-purpose prediction tasks comprise cancer
diagnosis, survival prediction, and estimation of the transcriptional
profile of the gene TP53. The experimental results demonstrate that
HistMIMT yields better outcomes on all clinical prediction tasks than
its competitors.
AU Ding, Saisai
Li, Juncheng
Wang, Jun
Ying, Shihui
Shi, Jun
Multimodal Co-attention Fusion Network with Online Data Augmentation for
Cancer Subtype Classification.
It is an essential task to accurately diagnose cancer subtypes in
computational pathology for personalized cancer treatment. Recent
studies have indicated that the combination of multimodal data, such as
whole slide images (WSIs) and multi-omics data, could achieve more
accurate diagnosis. However, robust cancer diagnosis remains challenging
due to the heterogeneity among multimodal data, as well as the
performance degradation caused by insufficient multimodal patient data.
In this work, we propose a novel multimodal co-attention fusion network
(MCFN) with online data augmentation (ODA) for cancer subtype
classification. Specifically, a multimodal mutual-guided co-attention
(MMC) module is proposed to effectively perform dense multimodal
interactions. It enables multimodal data to mutually guide and calibrate
each other during the integration process to alleviate inter- and
intra-modal heterogeneities. Subsequently, a self-normalizing network
(SNN)-Mixer is developed to allow information communication among
different omics data and alleviate the high-dimensional small-sample
size problem in multi-omics data. Most importantly, to compensate for
insufficient multimodal samples for model training, we propose an ODA
module in MCFN. The ODA module leverages the multimodal knowledge to
guide the data augmentations of WSIs and maximize the data diversity
during model training. Extensive experiments are conducted on the public
TCGA dataset. The experimental results demonstrate that the proposed
MCFN outperforms all the compared algorithms, suggesting its
effectiveness.
AU Li, Yicong
Li, Wanhua
Chen, Qi
Huang, Wei
Zou, Yuda
Xiao, Xin
Shinomiya, Kazunori
Gunn, Pat
Gupta, Nishika
Polilov, Alexey
Xu, Yongchao
Zhang, Yueyi
Xiong, Zhiwei
Pfister, Hanspeter
Wei, Donglai
Wu, Jingpeng
WASPSYN: A Challenge for Domain Adaptive Synapse Detection in Microwasp
Brain Connectomes.
The size of image volumes in connectomics studies now reaches terabyte
and often petabyte scales with a great diversity of appearance due to
different sample preparation procedures. However, manual annotation of
neuronal structures (e.g., synapses) in these huge image volumes is
time-consuming, leading to limited labeled training data often smaller
than 0.001% of the large-scale image volumes in application. Methods
that can utilize in-domain labeled data and generalize to out-of-domain
unlabeled data are urgently needed. Although many domain adaptation
approaches have been proposed to address such issues in the natural image
domain, few of them have been evaluated on connectomics data due to a
lack of domain adaptation benchmarks. Therefore, to enable the development
of domain adaptive synapse detection methods for large-scale
connectomics applications, we annotated 14 image volumes from a
biologically diverse set of Megaphragma viggianii brain regions
originating from three different whole-brain datasets and organized the
WASPSYN challenge at ISBI 2023. The annotations include coordinates of
pre-synapses and post-synapses in the 3D space, together with their
one-to-many connectivity information. This paper describes the dataset,
the tasks, the proposed baseline, the evaluation method, and the results
of the challenge. Limitations of the challenge and the impact on
neuroscience research are also discussed. The challenge is and will
continue to be available at
https://codalab.lisn.upsaclay.fr/competitions/9169. Successful
algorithms that emerge from our challenge may potentially revolutionize
real-world connectomics research and further the cause that aims to
unravel the complexity of brain structure and function.
AU Naughton, Noel
Cahoon, Stacey
Sutton, Brad
Georgiadis, John G
Accelerated, physics-inspired inference of skeletal muscle
microstructure from diffusion-weighted MRI.
Muscle health is a critical component of overall health and quality of
life. However, current measures of skeletal muscle health take limited
account of microstructural variations within muscle, which play a
crucial role in mediating muscle function. To address this, we present a
physics-inspired, machine learning-based framework for the non-invasive
estimation of microstructural organization in skeletal muscle from
diffusion-weighted MRI (dMRI) in an uncertainty-aware manner. To reduce
the computational expense associated with direct numerical simulations
of dMRI physics, a polynomial meta-model is developed that accurately
represents the input/output relationships of a high-fidelity numerical
model. This meta-model is used to develop a Gaussian process (GP) model
that provides voxel-wise estimates and confidence intervals of
microstructure organization in skeletal muscle. Given noise-free data,
the GP model accurately estimates microstructural parameters. In the
presence of noise, the diameter, intracellular diffusion coefficient,
and membrane permeability are accurately estimated with narrow
confidence intervals, while volume fraction and extracellular diffusion
coefficient are poorly estimated and exhibit wide confidence intervals.
A reduced-acquisition GP model, consisting of one-third the
diffusion-encoding measurements, is shown to predict parameters with
similar accuracy to the original model. The fiber diameter and volume
fraction estimated by the reduced GP model are validated via histology,
with both parameters accurately estimated, demonstrating the capability
of the proposed framework as a promising non-invasive tool for assessing
skeletal muscle health and function.
AU van Herten, Rudolf L. M.
Hampe, Nils
Takx, Richard A. P.
Franssen, Klaas Jan
Wang, Yining
Sucha, Dominika
Henriques, Jose P.
Leiner, Tim
Planken, R. Nils
Isgum, Ivana
Automatic Coronary Artery Plaque Quantification and CAD-RADS Prediction
Using Mesh Priors
Coronary artery disease (CAD) remains the leading cause of death
worldwide. Patients with suspected CAD undergo coronary CT angiography
(CCTA) to evaluate the risk of cardiovascular events and determine the
treatment. Clinical analysis of coronary arteries in CCTA comprises the
identification of atherosclerotic plaque, as well as the grading of any
coronary artery stenosis typically obtained through the CAD-Reporting
and Data System (CAD-RADS). This requires analysis of the coronary lumen
and plaque. While voxel-wise segmentation is a commonly used approach in
various segmentation tasks, it does not guarantee topologically
plausible shapes. To address this, in this work, we propose to directly
infer surface meshes for coronary artery lumen and plaque based on a
centerline prior and use it in the downstream task of CAD-RADS scoring.
The method is developed and evaluated using a total of 2407 CCTA scans.
Our method achieved lesion-wise volume intraclass correlation
coefficients of 0.98, 0.79, and 0.85 for calcified, non-calcified, and
total plaque volume respectively. Patient-level CAD-RADS categorization
was evaluated on a representative hold-out test set of 300 scans, for
which the achieved linearly weighted kappa (κ) was 0.75. CAD-RADS
categorization on the set of 658 scans from another hospital and scanner
led to a κ of 0.71. The results demonstrate that direct inference of
coronary artery meshes for lumen and plaque is feasible, and allows for
the automated prediction of routinely performed CAD-RADS categorization.
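The linearly weighted kappa reported above can be computed from two lists of ordinal category labels. The Python sketch below uses the standard definition, kappa_w = 1 - sum(w_ij * O_ij) / sum(w_ij * E_ij) with disagreement weights w_ij = |i - j| / (k - 1); the function name and the toy ratings are illustrative assumptions, not the authors' evaluation code.

```python
def linear_weighted_kappa(rater_a, rater_b, n_categories):
    """Cohen's kappa with linear disagreement weights for ordinal labels.

    rater_a, rater_b: equal-length lists of integer categories in
    range(n_categories), e.g. reference and predicted CAD-RADS grades.
    """
    k = n_categories
    n = len(rater_a)
    if k < 2 or n == 0 or n != len(rater_b):
        raise ValueError("need >= 2 categories and equal-length, non-empty ratings")
    # Observed confusion matrix of counts.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_obs = 0.0
    weighted_exp = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)                # linear disagreement weight
            weighted_obs += w * observed[i][j]
            weighted_exp += w * row[i] * col[j] / n  # chance-expected count
    if weighted_exp == 0.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - weighted_obs / weighted_exp


if __name__ == "__main__":
    reference = [0, 1, 2, 3, 4, 5, 2, 3]
    predicted = [0, 1, 2, 4, 4, 5, 1, 3]
    print(linear_weighted_kappa(reference, predicted, n_categories=6))
```

Linear weights penalize a prediction that is off by two categories twice as much as one that is off by one, which suits ordinal scales such as CAD-RADS.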
AU Guo, Jia
Lu, Shuai
Jia, Lize
Zhang, Weihang
Li, Huiqi
Encoder-Decoder Contrast for Unsupervised Anomaly Detection in Medical
Images
Unsupervised anomaly detection (UAD) aims to recognize anomalous images
based on the training set that contains only normal images. In medical
image analysis, UAD benefits from leveraging the easily obtained normal
(healthy) images, avoiding the costly collection and labeling of
anomalous (unhealthy) images. Most advanced UAD methods rely on frozen
encoder networks pre-trained using ImageNet for extracting feature
representations. However, the features extracted from the frozen
encoders that are borrowed from natural image domains coincide little
with the features required in the target medical image domain. Moreover,
optimizing encoders usually causes pattern collapse in UAD. In this
paper, we propose a novel UAD method, namely Encoder-Decoder Contrast
(EDC), which optimizes the entire network to reduce biases towards
the pre-trained image domain and orient the network in the target medical
domain. We start from a feature reconstruction approach that detects
anomalies from reconstruction errors. Essentially, a contrastive
learning paradigm is introduced to tackle the problem of pattern
collapse while optimizing the encoder and the reconstruction decoder
simultaneously. In addition, to prevent instability and further improve
performance, we propose to bring globality into the contrastive
objective function. Extensive experiments are conducted across four
medical image modalities including optical coherence tomography, color
fundus image, brain MRI, and skin lesion image, where our method
outperforms all current state-of-the-art UAD methods.
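The feature reconstruction idea described above (flag an image as anomalous when the decoder fails to reproduce the encoder features) can be sketched in a few lines of Python. This is a generic illustration only: the use of cosine distance as the reconstruction error and of the maximum over locations as the image-level score are assumptions, and the function names are hypothetical rather than part of EDC.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return 1.0 - dot / (norm_u * norm_v)


def anomaly_score(encoder_features, decoder_features):
    """Image-level score: the largest per-location reconstruction error.

    Both arguments are lists of feature vectors, one per spatial location.
    """
    errors = [cosine_distance(e, d)
              for e, d in zip(encoder_features, decoder_features)]
    return max(errors)


if __name__ == "__main__":
    enc = [[1.0, 0.0], [0.5, 0.5]]
    dec = [[0.9, 0.1], [0.5, 0.5]]
    print(anomaly_score(enc, dec))
```

A model trained only on normal images is expected to reconstruct normal features well, so a large score at any location suggests an anomaly there.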
AU Zhu, Jiening
Veeraraghavan, Harini
Jiang, Jue
Oh, Jung Hun
Norton, Larry
Deasy, Joseph O.
Tannenbaum, Allen
Wasserstein HOG: Local Directionality Extraction via Optimal Transport
Directionally sensitive radiomic features including the histogram of
oriented gradient (HOG) have been shown to provide objective and
quantitative measures for predicting disease outcomes in multiple
cancers. However, radiomic features are sensitive to imaging
variabilities including acquisition differences, imaging artifacts and
noise, making them impractical for use in the clinic to inform patient
care. We address the problem of extracting robust local directionality
features by mapping, via optimal transport, a given local image patch to
an iso-intense patch of its mean. We decompose the transport map into
sub-work costs, each transporting in a different direction. To test our
approach, we evaluated its ability to quantify
tumor heterogeneity from magnetic resonance imaging (MRI) scans of brain
glioblastoma multiforme, computed tomography (CT) scans of head and neck
squamous cell carcinoma as well as longitudinal CT scans in lung cancer
patients treated with immunotherapy. By considering the entropy
difference of the extracted local directionality within tumor regions,
we found that patients with higher entropy in their images had
significantly worse overall survival in all three datasets, which
indicates that tumors whose images exhibit flows in many
directions may be more malignant. This may reflect high tumor
histologic grade or disorganization. Furthermore, by comparing the
changes in entropy longitudinally using two imaging time points, we
found that a reduction in entropy from baseline CT is associated
with longer overall survival (hazard ratio = 1.95, 95% confidence
interval of 1.4-2.8, p = 1.65e-5). The proposed method provides a
robust, training-free approach to quantify the local directionality
contained in images.
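The entropy summary used in the survival analysis can be illustrated with a short Python sketch. It assumes that the directional decomposition has already produced one non-negative cost per direction bin (the optimal transport step itself is not reproduced here) and computes the Shannon entropy of the normalized costs; the function name and the eight-bin example are hypothetical.

```python
import math


def directional_entropy(direction_costs):
    """Shannon entropy (in nats) of a non-negative per-direction cost vector.

    The costs are normalized to a probability distribution over directions.
    Entropy is 0 when all transport goes in one direction and log(n) when
    it is spread evenly over n directions.
    """
    if any(c < 0 for c in direction_costs):
        raise ValueError("direction costs must be non-negative")
    total = sum(direction_costs)
    if total <= 0:
        raise ValueError("at least one direction cost must be positive")
    probs = [c / total for c in direction_costs if c > 0]
    return -sum(p * math.log(p) for p in probs)


if __name__ == "__main__":
    organized = [9.0, 0.5, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0]  # one dominant direction
    disorganized = [1.0] * 8                               # flows in all directions
    print(directional_entropy(organized), directional_entropy(disorganized))
```

Under this reading, a patch whose intensity flows in many directions yields a higher entropy than a patch with one dominant direction, matching the association with disorganization described in the abstract.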
AU Fu, Suzhong
Xu, Jing
Chang, Shilong
Yang, Luyao
Ling, Shuting
Cai, Jinghan
Chen, Jiayin
Yuan, Jiacheng
Cai, Ying
Zhang, Bei
Huang, Zicheng
Yang, Kun
Sui, Wenhai
Xue, Linyan
Zhao, Qingliang
Robust Vascular Segmentation for Raw Complex Images of Laser Speckle
Contrast Based on Weakly Supervised Learning
Laser speckle contrast imaging (LSCI) is widely used for in vivo
real-time detection and analysis of local blood flow microcirculation
due to its non-invasive nature and excellent spatial and temporal
resolution. However, vascular segmentation of LSCI images still faces
many difficulties due to numerous specific noises caused by the
complexity of blood microcirculation's structure and irregular vascular
aberrations in diseased regions. In addition, the difficulties of LSCI
image data annotation have hindered the application of deep learning
methods based on supervised learning in the field of LSCI image vascular
segmentation. To tackle these difficulties, we propose a robust weakly
supervised learning method, which selects the threshold combinations and
processing flows instead of labor-intensive annotation work to construct
the ground truth of the dataset, and design a deep neural network,
FURNet, based on UNet++ and ResNeXt. The model obtained from training
achieves high-quality vascular segmentation and captures multi-scene
vascular features on both constructed and unknown datasets with good
generalization. Furthermore, we verified the applicability of this
method in vivo on a tumor before and after embolization treatment. This
work provides a new approach for realizing LSCI vascular segmentation
and also makes a new application-level advance in the field of
artificial intelligence-assisted disease diagnosis.
AU Lobos, Rodrigo A.
Chan, Chin-Cheng
Haldar, Justin P.
New Theory and Faster Computations for Subspace-Based Sensitivity Map
Estimation in Multichannel MRI
Sensitivity map estimation is important in many multichannel MRI
applications. Subspace-based sensitivity map estimation methods like
ESPIRiT are popular and perform well, though they can be computationally
expensive and their theoretical principles can be nontrivial to
understand. In the first part of this work, we present a novel
theoretical derivation of subspace-based sensitivity map estimation
based on a linear-predictability/structured low-rank modeling
perspective. This results in an estimation approach that is equivalent
to ESPIRiT, but with distinct theory that may be more intuitive for some
readers. In the second part of this work, we propose and evaluate a set
of computational acceleration approaches (collectively known as PISCO)
that can enable substantial improvements in computation time (up to
approximately 100x in the examples we show) and memory for subspace-based
sensitivity map estimation.
AU Tang, Xinlu
Zhang, Chencheng
Guo, Rui
Yang, Xinling
Qian, Xiaohua
A Causality-Aware Graph Convolutional Network Framework for Rigidity
Assessment in Parkinsonians
Rigidity is one of the common motor disorders in Parkinson's disease
(PD), and it leads to deterioration in quality of life. The widely-used
rating-scale-based approach for rigidity assessment still depends on the
availability of experienced neurologists and is limited by rating
subjectivity. Given the recent successful applications of quantitative
susceptibility mapping (QSM) in auxiliary PD diagnosis, automated
assessment of PD rigidity can be essentially achieved through QSM
analysis. However, a major challenge is the performance instability due
to the confounding factors (e.g., noise and distribution shift) which
conceal the truly-causal features. Therefore, we propose a
causality-aware graph convolutional network (GCN) framework, where
causal feature selection is combined with causal invariance to ensure
that causality-informed model decisions are reached. Firstly, a GCN
model that integrates causal feature selection is systematically
constructed at three graph levels: node, structure, and representation.
In this model, a causal diagram is learned to extract a subgraph with
truly-causal information. Secondly, a non-causal perturbation strategy
is developed along with an invariance constraint to ensure the stability
of the assessment results under different distributions, and thus avoid
spurious correlations caused by distribution shifts. The superiority of
the proposed method is shown by extensive experiments and the clinical
value is revealed by the direct relevance of selected brain regions to
rigidity in PD. In addition, its extensibility is verified on two other
tasks: PD bradykinesia and mental state for Alzheimer's disease.
Overall, we provide a tool with clinical potential for automated and stable
assessment of PD rigidity. Our source code will be available at
https://github.com/SJTUBME-QianLab/Causality-Aware-Rigidity.
AU Wu, Yongjian
Zhou, Yang
Saiyin, Jiya
Wei, Bingzheng
Lai, Maode
Shou, Jianzhong
Xu, Yan
AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot
Nuclei Detection via Visual-Language Pre-trained Models.
Large-scale visual-language pre-trained models (VLPMs) have demonstrated
exceptional performance in downstream object detection through text
prompts for natural scenes. However, their application to zero-shot
nuclei detection on histopathology images remains relatively unexplored,
mainly due to the significant gap between the characteristics of medical
images and the web-originated text-image pairs used for pre-training.
This paper aims to investigate the potential of the object-level VLPM,
Grounded Language-Image Pre-training (GLIP), for zero-shot nuclei
detection. Specifically, we propose an innovative auto-prompting
pipeline, named AttriPrompter, comprising attribute generation,
attribute augmentation, and relevance sorting, to avoid subjective
manual prompt design. AttriPrompter utilizes VLPMs' text-to-image
alignment to create semantically rich text prompts, which are then fed
into GLIP for initial zero-shot nuclei detection. Additionally, we
propose a self-trained knowledge distillation framework, where GLIP
serves as the teacher with its initial predictions used as pseudo
labels, to address the challenges posed by high nuclei density,
including missed detections, false positives, and overlapping instances.
Our method exhibits remarkable performance in label-free nuclei
detection, out-performing all existing unsupervised methods and
demonstrating excellent generality. Notably, this work highlights the
astonishing potential of VLPMs pre-trained on natural image-text pairs
for downstream tasks in the medical field as well. Code will be released
at github.com/AttriPrompter.
EI 1558-254X
DA 2024-10-05
UT MEDLINE:39361456
PM 39361456
ER
AU You, Xin
He, Junjun
Yang, Jie
Gu, Yun
Learning with Explicit Shape Priors for Medical Image Segmentation.
Medical image segmentation is a fundamental task for medical image
analysis and surgical planning. In recent years, UNet-based networks
have prevailed in the field of medical image segmentation. However,
convolutional neural networks (CNNs) suffer from limited receptive
fields, which fail to model the long-range dependency of organs or
tumors. Besides, these models are heavily dependent on the training of
the final segmentation head. Existing methods cannot adequately address
both limitations simultaneously. Hence, in this work, we
propose a novel shape prior module (SPM), which can explicitly
introduce shape priors to promote the segmentation performance of
UNet-based models. The explicit shape priors consist of global and local
shape priors. The former with coarse shape representations provides
networks with capabilities to model global contexts. The latter with
finer shape information serves as additional guidance to relieve the
heavy dependence on the learnable prototype in the segmentation head. To
evaluate the effectiveness of SPM, we conduct experiments on three
challenging public datasets, on which our proposed model achieves
state-of-the-art performance. Furthermore, SPM can serve as a
plug-and-play structure for classic CNNs and Transformer-based
backbones, facilitating the segmentation task on different datasets.
Source codes are available at
https://github.com/AlexYouXin/Explicit-Shape-Priors.
AU Liu, Che
Cheng, Sibo
Shi, Miaojing
Shah, Anand
Bai, Wenjia
Arcucci, Rossella
IMITATE: Clinical Prior Guided Hierarchical Vision-Language
Pre-training.
In the field of medical Vision-Language Pretraining (VLP), significant
efforts have been devoted to deriving text and image features from both
clinical reports and associated medical images. However, most existing
methods may have overlooked the opportunity of leveraging the inherent
hierarchical structure of clinical reports, which are generally split
into 'findings' for descriptive content and 'impressions' for conclusive
observation. Instead of utilizing this rich, structured format, current
medical VLP approaches often simplify the report into either a unified
entity or fragmented tokens. In this work, we propose a novel clinical
prior guided VLP framework named IMITATE to learn the structure
information from medical reports with hierarchical vision-language
alignment. The framework derives multi-level visual features from the
chest X-ray (CXR) images and separately aligns these features with the
descriptive and the conclusive text encoded in the hierarchical medical
report. Furthermore, a new clinical-informed contrastive loss is
introduced for cross-modal learning, which accounts for clinical prior
knowledge in formulating sample correlations in contrastive learning.
The proposed model, IMITATE, outperforms baseline VLP methods across six
different datasets, spanning five medical imaging downstream tasks.
Comprehensive experimental results highlight the advantages of
integrating the hierarchical structure of medical reports for
vision-language alignment.
AU Bian, Wanyu
Jang, Albert
Zhang, Liping
Yang, Xiaonan
Stewart, Zachary
Liu, Fang
Diffusion Modeling with Domain-conditioned Prior Guidance for
Accelerated MRI and qMRI Reconstruction.
This study introduces a novel image reconstruction technique based on a
diffusion model that is conditioned on the native data domain. Our
method is applied to multi-coil MRI and quantitative MRI (qMRI)
reconstruction, leveraging the domain-conditioned diffusion model within
the frequency and parameter domains. The prior MRI physics are used as
embeddings in the diffusion model, enforcing data consistency to guide
the training and sampling process, characterizing MRI k-space encoding
in MRI reconstruction, and leveraging MR signal modeling for qMRI
reconstruction. Furthermore, a gradient descent optimization is
incorporated into the diffusion steps, enhancing feature learning and
improving denoising. The proposed method demonstrates significant
promise, particularly for reconstructing images at high acceleration
factors. Notably, it maintains high reconstruction accuracy for static
and quantitative MRI reconstruction across diverse anatomical
structures. Beyond its immediate applications, this method provides
potential generalization capability, making it adaptable to inverse
problems across various domains.
AU Liu, Min
Wu, Shuhan
Chen, Runze
Lin, Zhuangdian
Wang, Yaonan
Meijering, Erik
Brain Image Segmentation for Ultrascale Neuron Reconstruction via an
Adaptive Dual-Task Learning Network
Accurate morphological reconstruction of neurons in whole brain images
is critical for brain science research. However, due to the wide range
of whole brain imaging, uneven staining, and optical system
fluctuations, there are significant differences in image properties
between different regions of the ultrascale brain image, such as
dramatically varying voxel intensities and inhomogeneous distribution of
background noise, posing an enormous challenge to neuron reconstruction
from whole brain images. In this paper, we propose an adaptive dual-task
learning network (ADTL-Net) to quickly and accurately extract neuronal
structures from ultrascale brain images. Specifically, this framework
includes an External Features Classifier (EFC) and a Parameter Adaptive
Segmentation Decoder (PASD), which share the same Multi-Scale Feature
Encoder (MSFE). MSFE introduces an attention module named Channel Space
Fusion Module (CSFM) to extract structure and intensity distribution
features of neurons at different scales for addressing the problem of
anisotropy in 3D space. Then, EFC is designed to classify these feature
maps based on external features, such as foreground intensity
distributions and image smoothness, and select specific PASD parameters
to decode the feature maps of each class and obtain accurate segmentation
results. PASD contains multiple sets of parameters trained on different
representative image blocks with complex signal-to-noise distributions to
handle various images more robustly. Experimental results show that
compared with other advanced segmentation methods for neuron
reconstruction, the proposed method achieves state-of-the-art results in
the task of neuron reconstruction from ultrascale brain images, with an
improvement of about 49% in speed and 12% in F1 score.
C1 Hunan Univ, Coll Elect & Informat Engn, Changsha 410082, Peoples R China
C1 Hunan Univ, Natl Engn Lab Robot Visual Percept & Control Techn, Changsha
410082, Peoples R China
C1 Int Sci & Technol Innovat Cooperat Base Biomed Ima, Changsha 410082,
Peoples R China
C1 Hunan Univ, Res Inst, Chongqing 401120, Peoples R China
C1 Univ New South Wales, Sch Comp Sci & Engn, Sydney, NSW 2052, Australia
C3 Int Sci & Technol Innovat Cooperat Base Biomed Ima
SN 0278-0062
EI 1558-254X
DA 2024-07-22
UT WOS:001263692100011
PM 38373129
ER
AU Thandiackal, Kevin
Piccinelli, Luigi
Gupta, Rajarsi
Pati, Pushpak
Goksel, Orcun
Multi-Scale Feature Alignment for Continual Learning of Unlabeled
Domains
Methods for unsupervised domain adaptation (UDA) help to improve the
performance of deep neural networks on unseen domains without any
labeled data. Especially in medical disciplines such as histopathology,
this is crucial since large datasets with detailed annotations are
scarce. While the majority of existing UDA methods focus on the
adaptation from a labeled source to a single unlabeled target domain,
many real-world applications with a long life cycle involve more than
one target domain. Thus, the ability to sequentially adapt to multiple
target domains becomes essential. In settings where the data from
previously seen domains cannot be stored, e.g., due to data protection
regulations, the above becomes a challenging continual learning problem.
To this end, we propose to use generative feature-driven image replay in
conjunction with a dual-purpose discriminator that not only enables the
generation of images with realistic features for replay, but also
promotes feature alignment during domain adaptation. We evaluate our
approach extensively on a sequence of three histopathological datasets
for tissue-type classification, achieving state-of-the-art results. We
present detailed ablation experiments studying our proposed method
components and demonstrate a possible use-case of our continual UDA
method for an unsupervised patch-based segmentation task given
high-resolution tissue images. Our code is available at:
https://github.com/histocartography/multi-scale-feature-alignment.
AU Zhu, Qi
Li, Shengrong
Meng, Xiangshui
Xu, Qiang
Zhang, Zhiqiang
Shao, Wei
Zhang, Daoqiang
Spatio-Temporal Graph Hubness Propagation Model for Dynamic Brain
Network Classification
Dynamic brain networks have an advantage over static brain networks in
characterizing the variation patterns of functional brain connectivity,
and they have attracted increasing attention in brain disease diagnosis.
However, most existing dynamic brain network analysis methods
rely on extracting features from independent brain networks divided by
sliding windows, making it hard for them to reveal the high-order dynamic
evolution laws of functional brain networks. Additionally, they cannot
effectively extract the spatio-temporal topology features in dynamic
brain networks. In this paper, we propose to use optimal transport (OT)
theory to capture the topology evolution of the dynamic brain networks,
and develop a multi-channel spatio-temporal graph convolutional network
that collaboratively extracts the temporal and spatial features from the
evolution networks. Specifically, we first adaptively evaluate the graph
hubness of brain regions in the brain network of each time window, which
comprehensively models information transmission among multiple brain
regions. Second, the hubness propagation information across adjacent
time windows is captured by optimal transport, describing high-order
topology evolution of dynamic brain networks. Moreover, we develop a
spatio-temporal graph convolutional network with an attention mechanism to
collaboratively extract the intrinsic temporal and spatial topology
information from the above networks. Finally, a multi-layer perceptron
is adopted for classifying the dynamic brain network. Extensive
experiments on the collected epilepsy dataset and the public ADNI dataset
show that our proposed method not only outperforms several
state-of-the-art methods in brain disease diagnosis, but also reveals
the key dynamic alterations of brain connectivities between patients and
healthy controls.
AU Li, Pengcheng
Gao, Chenqiang
Lian, Chunfeng
Meng, Deyu
Spatial Prior-Guided Bi-Directional Cross-Attention Transformers for
Tooth Instance Segmentation.
Tooth instance segmentation of dental panoramic X-ray images represents
a task of significant clinical importance. Teeth demonstrate symmetry
within the upper and lower jawbones and are arranged in a specific
order. However, previous studies frequently overlook this crucial
spatial prior information, resulting in misidentifications of tooth
categories for adjacent or similarly shaped teeth. In this paper, we
propose SPGTNet, a spatial prior-guided transformer method, designed to
combine both the extracted tooth positional features from CNNs and the
long-range contextual information from vision transformers for dental
panoramic X-ray image segmentation. Initially, a center-based spatial
prior perception module is employed to identify each tooth's centroid,
thereby enhancing the spatial prior information for the CNN sequence
features. Subsequently, a bi-directional cross-attention module is
designed to facilitate the interaction between the spatial prior
information of the CNN sequence features and the long-distance
contextual features of the vision transformer sequence features.
Finally, an instance identification head is employed to derive the tooth
segmentation results. Extensive experiments on three public benchmark
datasets have demonstrated the effectiveness and superiority of our
proposed method in comparison with other state-of-the-art approaches.
The proposed method demonstrates the capability to accurately identify
and analyze tooth structures, thereby providing crucial information for
dental diagnosis, treatment planning, and research.
AU Chai, Zhizhong
Luo, Luyang
Lin, Huangjing
Heng, Pheng-Ann
Chen, Hao
Deep Omni-Supervised Learning for Rib Fracture Detection From Chest
Radiology Images
Deep learning (DL)-based rib fracture detection has shown promise of
playing an important role in preventing mortality and improving patient
outcome. Normally, developing DL-based object detection models requires
a huge amount of bounding box annotation. However, annotating medical
data is time-consuming and expertise-demanding, making it infeasible to
obtain a large amount of fine-grained annotations. This poses a
pressing need for developing label-efficient detection models to
alleviate radiologists' labeling burden. To tackle this challenge, the
literature on object detection has witnessed an increase in
weakly-supervised and semi-supervised approaches, yet still lacks a
unified framework that leverages various forms of fully-labeled,
weakly-labeled, and unlabeled data. In this paper, we present a novel
omni-supervised object detection network, ORF-Netv2, to leverage as much
available supervision as possible. Specifically, a multi-branch
omni-supervised detection head is introduced with each branch trained
with a specific type of supervision. A co-training-based dynamic label
assignment strategy is then proposed to enable flexible and robust
learning from the weakly-labeled and unlabeled data. Extensive
evaluation was conducted for the proposed framework with three rib
fracture datasets on both chest CT and X-ray. By leveraging all forms of
supervision, ORF-Netv2 achieves mAPs of 34.7, 44.7, and 19.4 on the
three datasets, respectively, surpassing the baseline detector which
uses only box annotations by mAP gains of 3.8, 4.8, and 5.0,
respectively. Furthermore, ORF-Netv2 consistently outperforms other
competitive label-efficient methods over various scenarios, showing a
promising framework for label-efficient fracture detection. The code is
available at: https://github.com/zhizhongchai/ORF-Net.
AU Lin, Weiyuan
Gao, Zhifan
Liu, Hui
Zhang, Heye
A Deformable Constraint Transport Network for Optimal Aortic
Segmentation From CT Images
Aortic segmentation from computed tomography (CT) is crucial for
facilitating aortic intervention, as it enables clinicians to visualize
aortic anatomy for diagnosis and measurement. However, aortic
segmentation faces the challenge of variable geometry in space, such as the
geometric diversity of different diseases and the geometric
transformations that occur between raw and measured images. Existing
constraint-based methods can potentially solve the challenge, but they
are hindered by two key issues: inaccurate definition of properties and
inappropriate topology of transformation in space. In this paper, we
propose a deformable constraint transport network (DCTN). The DCTN
adaptively extracts aortic features to define intra-image constrained
properties and guides topological implementation in space to constrain
inter-image geometric transformation between raw and curved planar
reformation (CPR) images. The DCTN contains a deformable attention
extractor, a geometry-aware decoder and an optimal transport guider. The
extractor generates variable patches that preserve semantic integrity
and long-range dependency in long-sequence images. The decoder enhances
the perception of geometric texture and semantic features, particularly
for low-intensity aortic coarctation and false lumen, which removes
background interference. The guider explores the geometric discrepancies
between raw and CPR images, constructs probability distributions of
discrepancies, and matches them with inter-image transformation to guide
geometric topology in space. Experimental studies on 267 aortic subjects
and four public datasets show the superiority of our DCTN over 23
methods. The results demonstrate DCTN's advantages in aortic
segmentation for different types of aortic disease, for different aortic
segments, and in the measurement of clinical indexes.
AU Liu, Jinhua
Desrosiers, Christian
Yu, Dexin
Zhou, Yuanfeng
Semi-Supervised Medical Image Segmentation Using Cross-Style Consistency
With Shape-Aware and Local Context Constraints
Despite the remarkable progress in semi-supervised medical image
segmentation methods based on deep learning, their application to
real-life clinical scenarios still faces considerable challenges. For
example, insufficient labeled data often makes it difficult for networks
to capture the complexity and variability of the anatomical regions to
be segmented. To address these problems, we design a new semi-supervised
segmentation framework that aspires to produce anatomically plausible
predictions. Our framework comprises two parallel networks:
shape-agnostic and shape-aware networks. These networks learn from each
other, enabling effective utilization of unlabeled data. Our shape-aware
network implicitly introduces shape guidance to capture fine-grained
shape information. Meanwhile, shape-agnostic networks employ
uncertainty estimation to further obtain reliable pseudo-labels for the
counterpart. We also employ a cross-style consistency strategy to
enhance the network's utilization of unlabeled data. It enriches the
dataset to prevent overfitting and further eases the coupling of the two
networks that learn from each other. Our proposed architecture also
incorporates a novel loss term that facilitates the learning of the
local context of segmentation by the network, thereby enhancing the
overall accuracy of prediction. Experiments on three different datasets
of medical images show that our method outperforms many strong
semi-supervised segmentation methods, including in shape
perception. The code is available at https://github.com/igip-liu/SLC-Net.
C1 Shandong Univ, Sch Software, Jinan 250101, Peoples R China
C1 Ecole Technol Super ETS, Software & IT Dept, Montreal, PQ H3C 1K3,
Canada
C1 Shandong Univ, Qilu Hosp, Jinan 250012, Peoples R China
SN 0278-0062
EI 1558-254X
DA 2024-07-02
UT WOS:001196733400012
PM 38032771
ER
AU Stevens, Tristan S W
Meral, Faik C
Yu, Jason
Apostolakis, Iason Z
Robert, Jean-Luc
Van Sloun, Ruud J G
Dehazing Ultrasound using Diffusion Models.
Echocardiography has been a prominent tool for the diagnosis of cardiac
disease. However, these diagnoses can be heavily impeded by poor image
quality. Acoustic clutter emerges due to multipath reflections imposed
by layers of skin, subcutaneous fat, and intercostal muscle between the
transducer and heart. As a result, haze and other noise artifacts pose a
real challenge to cardiac ultrasound imaging. In many cases, especially
with difficult-to-image patients such as patients with obesity, a
diagnosis from B-Mode ultrasound imaging is effectively rendered
unusable, forcing sonographers to resort to contrast-enhanced ultrasound
examinations or refer patients to other imaging modalities. Tissue
harmonic imaging has been a popular approach to combat haze, but in
severe cases is still heavily impacted by haze. Alternatively, denoising
algorithms are typically unable to remove highly structured and
correlated noise, such as haze. It remains a challenge to accurately
describe the statistical properties of structured haze, and develop an
inference method to subsequently remove it. Diffusion models have
emerged as powerful generative models and have shown their effectiveness
in a variety of inverse problems. In this work, we present a joint
posterior sampling framework that combines two separate diffusion models
to model the distribution of both clean ultrasound and haze in an
unsupervised manner. Furthermore, we demonstrate techniques for
effectively training diffusion models on radio-frequency ultrasound data
and highlight the advantages over image data. Experiments on both
in-vitro and in-vivo cardiac datasets show that the proposed dehazing
method effectively removes haze while preserving signals from weakly
reflected tissue.
AU Hu, Wentao
Cheng, Lianglun
Huang, Guoheng
Yuan, Xiaochen
Zhong, Guo
Pun, Chi-Man
Zhou, Jian
Cai, Muyan
Learning From Incorrectness: Active Learning With Negative Pre-Training
and Curriculum Querying for Histological Tissue Classification
Patch-level histological tissue classification is an effective
pre-processing method for histological slide analysis. However, the
classification of tissue with deep learning requires expensive
annotation costs. To alleviate the limitations of annotation budgets,
the application of active learning (AL) to histological tissue
classification is a promising solution. Nevertheless, performance is
highly imbalanced across categories in practice, and the tissues in
categories with relatively poor performance are equally important for
cancer diagnosis. In this paper,
we propose an active learning framework called ICAL, which contains
Incorrectness Negative Pre-training (INP) and Category-wise Curriculum
Querying (CCQ) to address the above problem from the perspective of
category-to-category and from the perspective of categories themselves,
respectively. In particular, INP incorporates the unique mechanism of
active learning to treat the incorrect prediction results obtained
from CCQ as complementary labels for negative pre-training, in order to
better distinguish similar categories during the training process. CCQ
adjusts the query weights based on the learning status on each category
by the model trained by INP, and utilizes uncertainty to evaluate and
compensate for query bias caused by inadequate category performance.
Experimental results on two histological tissue classification datasets
demonstrate that ICAL achieves performance approaching that of fully
supervised learning with less than 16% of the labeled data. In
comparison to the state-of-the-art active learning algorithms, ICAL
achieved better and more balanced performance in all categories and
maintained robustness with extremely low annotation budgets. The source
code will be released at https://github.com/LactorHwt/ICAL.
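The negative pre-training idea above, where an incorrect prediction is treated as a complementary label ("this patch is not class k"), can be illustrated with a generic complementary-label loss. This is a minimal sketch and not the ICAL implementation: the function name and the loss form -log(1 - p_k), borrowed from standard negative learning, are assumptions.

```python
import numpy as np

def complementary_label_loss(logits, comp_labels):
    """Mean of -log(1 - p_k), where k is the class each sample is known NOT to be.

    Hypothetical helper for illustration; not taken from the ICAL code base.
    """
    logits = np.asarray(logits, dtype=float)
    comp_labels = np.asarray(comp_labels, dtype=int)
    # Numerically stable softmax over classes.
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    # Probability the model assigns to the class that was ruled out.
    p_wrong = probs[np.arange(len(comp_labels)), comp_labels]
    # Small constant avoids log(0) when p_wrong is exactly 1.
    return float(-np.log(1.0 - p_wrong + 1e-12).mean())
```

The loss is small when the model already assigns little probability to the ruled-out class and grows as that probability approaches 1, which pushes the model away from a previously confused category.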
AU van Harten, Louis D.
Stoker, Jaap
Isgum, Ivana
Robust Deformable Image Registration Using Cycle-Consistent Implicit
Representations
Recent works in medical image registration have proposed the use of
Implicit Neural Representations, demonstrating performance that rivals
state-of-the-art learning-based methods. However, these implicit
representations need to be optimized for each new image pair, which is a
stochastic process that may fail to converge to a global minimum. To
improve robustness, we propose a deformable registration method using
pairs of cycle-consistent Implicit Neural Representations: each implicit
representation is linked to a second implicit representation that
estimates the opposite transformation, causing each network to act as a
regularizer for its paired opposite. During inference, we generate
multiple deformation estimates by numerically inverting the paired
backward transformation and evaluating the consensus of the optimized
pair. This consensus improves registration accuracy over using a single
representation and results in a robust uncertainty metric that can be
used for automatic quality control. We evaluate our method with a 4D
lung CT dataset. The proposed cycle-consistent optimization method
reduces the optimization failure rate from 2.4% to 0.0% compared to the
current state-of-the-art. The proposed inference method improves
landmark accuracy by 4.5% and the proposed uncertainty metric detects
all instances where the registration method fails to converge to a
correct solution. We verify the generalizability of these results to
other data using a centerline propagation task in abdominal 4D MRI,
where our method achieves a 46% improvement in propagation consistency
compared with single-INR registration and demonstrates a strong
correlation between the proposed uncertainty metric and registration
accuracy.
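The consensus step described above (numerically inverting the backward transformation and comparing it with the forward estimate) can be sketched in one dimension. This is a toy illustration under stated assumptions, not the authors' method: the displacement functions stand in for the optimized implicit representations, and fixed-point iteration is one standard inversion scheme that converges only when the backward displacement varies slowly.

```python
import numpy as np

def invert_displacement(u_b, x, n_iter=50):
    """Fixed-point inversion: find v such that v(x) = -u_b(x + v(x)).

    u_b is the backward displacement; the returned v is a second
    estimate of the forward displacement.
    """
    v = np.zeros_like(x)
    for _ in range(n_iter):
        v = -u_b(x + v)
    return v

def consensus(u_f, u_b, x):
    """Average the two forward estimates and report their disagreement."""
    est_forward = u_f(x)
    est_inverted = invert_displacement(u_b, x)
    mean = 0.5 * (est_forward + est_inverted)
    disagreement = np.abs(est_forward - est_inverted)  # crude uncertainty proxy
    return mean, disagreement

# Hypothetical, perfectly cycle-consistent pair: y = 1.1 x and x = y / 1.1.
x = np.linspace(-1.0, 1.0, 5)
mean, disagreement = consensus(lambda p: 0.1 * p, lambda q: -q / 11.0, x)
```

For a consistent pair the disagreement is numerically zero; a large disagreement flags points where the two representations contradict each other, which is the intuition behind using the consensus as a quality-control signal.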
AU Li, Wei
Liu, Guang-Hai
Fan, Haoyi
Li, Zuoyong
Zhang, David
Self-Supervised Multi-Scale Cropping and Simple Masked Attentive
Predicting for Lung CT-Scan Anomaly Detection
Anomaly detection has been widely explored by training an
out-of-distribution detector with only normal data for medical images.
However, detecting local and subtle irregularities without prior
knowledge of anomaly types brings challenges for lung CT-scan image
anomaly detection. In this paper, we propose a self-supervised framework
for learning representations of lung CT-scan images via both multi-scale
cropping and simple masked attentive predicting, which is capable of
constructing a powerful out-of-distribution detector. Firstly, we
propose CropMixPaste, a self-supervised augmentation task for generating
density shadow-like anomalies that encourage the model to detect local
irregularities of lung CT-scan images. Then, we propose a
self-supervised reconstruction block, named simple masked attentive
predicting block (SMAPB), to better refine local features by predicting
masked context information. Finally, the representations learned by the
self-supervised tasks are used to build an out-of-distribution detector.
The results on real lung CT-scan datasets demonstrate the effectiveness
and superiority of our proposed method compared with state-of-the-art
methods.
AU Caudoux, Manon
Demeulenaere, Oscar
Poree, Jonathan
Sauvage, Jack
Mateo, Philippe
Ghaleh, Bijan
Flesch, Martin
Ferin, Guillaume
Tanter, Mickael
Deffieux, Thomas
Papadacci, Clement
Pernot, Mathieu
Curved Toroidal Row Column Addressed Transducer for 3D Ultrafast
Ultrasound Imaging
3D imaging of the human heart at high frame rate is of major interest
for various clinical applications. Electronic complexity and cost have
prevented the dissemination of 3D ultrafast imaging into the clinic. Row
column addressed (RCA) transducers provide volumetric imaging at
ultrafast frame rate by using a low electronic channel count, but
current models are ill-suited for transthoracic cardiac imaging due to
field-of-view limitations. In this study, we proposed a mechanically
curved RCA with an aperture adapted for transthoracic cardiac imaging
($24 \times 16$ mm$^2$). The RCA has a toroidal curved surface of 96 elements
along columns (curvature radius $r_C = 4.47$ cm) and 64 elements along rows
(curvature radius $r_R = 3$ cm). We implemented delay-and-sum beamforming
with an analytical calculation of the propagation of a toroidal wave
which was validated using simulations (Field II). The imaging
performance was evaluated on a calibrated phantom. Experimental 3D
imaging was achieved up to 12 cm deep with a total angular aperture of
30 degrees for both lateral dimensions. The contrast-to-noise ratio
increased by 12 dB as the number of virtual sources rose from 2 to 128.
Then, 3D Ultrasound
Localization Microscopy (ULM) was characterized in a sub-wavelength tube
diameter. Finally, 3D ULM was demonstrated on a perfused ex-vivo swine
heart to image the coronary microcirculation.
AU Shaker, Abdelrahman
Maaz, Muhammad
Rasheed, Hanoona
Khan, Salman
Yang, Ming-Hsuan
Khan, Fahad Shahbaz
UNETR++: Delving Into Efficient and Accurate 3D Medical Image
Segmentation
Owing to the success of transformer models, recent works study their
applicability in 3D medical segmentation tasks. Within the transformer
models, the self-attention mechanism is one of the main building blocks
that strives to capture long-range dependencies, compared to the local
convolutional-based design. However, the self-attention operation has
quadratic complexity which proves to be a computational bottleneck,
especially in volumetric medical imaging, where the inputs are 3D with
numerous slices. In this paper, we propose a 3D medical image
segmentation approach, named UNETR++, that offers both high-quality
segmentation masks as well as efficiency in terms of parameters, compute
cost, and inference speed. The core of our design is the introduction of
a novel efficient paired attention (EPA) block that efficiently learns
spatial and channel-wise discriminative features using a pair of
inter-dependent branches based on spatial and channel attention. Our
spatial attention formulation is efficient and has linear complexity
with respect to the input. To enable communication between spatial and
channel-focused branches, we share the weights of query and key mapping
functions that provide a complementary benefit (paired attention), while
also reducing the complexity. Our extensive evaluations on five
benchmarks, Synapse, BTCV, ACDC, BraTS, and Decathlon-Lung, reveal the
effectiveness of our contributions in terms of both efficiency and
accuracy. On Synapse, our UNETR++ sets a new state-of-the-art with a
Dice Score of 87.2%, while significantly reducing parameters and FLOPs
by over 71%, compared to the best method in the literature. Our code and
models are available at: https://tinyurl.com/2p87x5xn.
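The complexity argument above can be made concrete with a small sketch. Full self-attention over n tokens builds an n x n score matrix, whereas projecting keys and values down to a fixed number p of tokens yields an n x p matrix, which is linear in n. The abstract does not state how the EPA block achieves linear complexity, so the projection scheme, function name, and shapes below are assumptions for illustration only; the weight sharing between spatial and channel branches is not modelled.

```python
import numpy as np

def projected_attention(x, w_q, w_k, w_v, proj):
    """Attention whose score matrix is (n, p) instead of (n, n).

    x: (n, d) tokens; w_q, w_k, w_v: (d, d) linear maps;
    proj: (p, n) projection that compresses the token axis (hypothetical).
    """
    q = x @ w_q                      # (n, d)
    k = proj @ (x @ w_k)             # (p, d)
    v = proj @ (x @ w_v)             # (p, d)
    scores = q @ k.T / np.sqrt(q.shape[1])               # (n, p)
    scores = scores - scores.max(axis=1, keepdims=True)  # stable softmax
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)
    return attn @ v, attn

rng = np.random.default_rng(0)
n, d, p = 512, 16, 8
x = rng.normal(size=(n, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
proj = rng.normal(size=(p, n)) / np.sqrt(n)
out, attn = projected_attention(x, w_q, w_k, w_v, proj)
```

Doubling n doubles the size of the score matrix here, rather than quadrupling it as in full self-attention.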
AU Gao, Jun
Lao, Qicheng
Kang, Qingbo
Liu, Paul
Du, Chenlin
Li, Kang
Zhang, Le
Boosting Your Context by Dual Similarity Checkup for In-Context Learning
Medical Image Segmentation.
The recent advent of in-context learning (ICL) capabilities in large
pre-trained models has yielded significant advancements in the
generalization of segmentation models. By supplying domain-specific
image-mask pairs, the ICL model can be effectively guided to produce
optimal segmentation outcomes, eliminating the necessity for model
fine-tuning or interactive prompting. However, existing
ICL-based segmentation models exhibit significant limitations when
applied to medical segmentation datasets with substantial diversity. To
address this issue, we propose a dual similarity checkup approach to
guarantee the effectiveness of selected in-context samples so that their
guidance can be maximally leveraged during inference. We first employ
large pre-trained vision models for extracting strong semantic
representations from input images and constructing a feature embedding
memory bank for semantic similarity checkup during inference. Assuring
the similarity in the input semantic space, we then minimize the
discrepancy in the mask appearance distribution between the support set
and the estimated mask appearance prior through similarity-weighted
sampling and augmentation. We validate our proposed dual similarity
checkup approach on eight publicly available medical segmentation
datasets, and extensive experimental results demonstrate that our
proposed method significantly improves the performance metrics of
existing ICL-based segmentation models, particularly when applied to
medical image datasets characterized by substantial diversity.
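The semantic similarity checkup and similarity-weighted sampling described above can be sketched as retrieval from a feature memory bank. This is a minimal sketch under stated assumptions: the embeddings are random stand-ins for features from a pre-trained vision model, and the softmax weighting with a `temperature` parameter is an illustrative choice, not the authors' stated procedure.

```python
import numpy as np

def similarity_weighted_sample(query, bank, k, temperature=0.1, seed=0):
    """Draw k distinct support indices, favouring high cosine similarity.

    query: (d,) embedding of the test image; bank: (m, d) memory bank.
    Returns the sampled indices and the cosine similarities to every entry.
    """
    q = query / np.linalg.norm(query)
    b = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    sims = b @ q
    # Softmax over similarities; lower temperature sharpens the preference.
    weights = np.exp((sims - sims.max()) / temperature)
    probs = weights / weights.sum()
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(bank), size=k, replace=False, p=probs)
    return idx, sims

rng = np.random.default_rng(42)
bank = rng.normal(size=(20, 8))   # hypothetical stored embeddings
query = bank[3] + 0.01 * rng.normal(size=8)
support_idx, sims = similarity_weighted_sample(query, bank, k=4)
```

Sampling rather than taking the top-k keeps some diversity in the support set while still concentrating on semantically close examples.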
AU Zhang, Yikun
Hu, Dianlin
Li, Wangyao
Zhang, Weijie
Chen, Gaoyu
Chen, Ronald C
Chen, Yang
Gao, Hao
2V-CBCT: Two-Orthogonal-Projection based CBCT Reconstruction and Dose
Calculation for Radiation Therapy using Real Projection Data.
This work demonstrates the feasibility of
two-orthogonal-projection-based CBCT (2V-CBCT) reconstruction and dose
calculation for radiation therapy (RT) using real projection data, which
is the first 2V-CBCT feasibility study with real projection data, to the
best of our knowledge. RT treatments are often delivered in multiple
fractions, for which on-board CBCT is desirable to calculate the
delivered dose per fraction for the purpose of RT delivery quality
assurance and adaptive RT. However, not all RT treatments/fractions have
CBCT acquired, but two orthogonal projections are always available. The
question to be addressed in this work is the feasibility of 2V-CBCT for
the purpose of RT dose calculation. 2V-CBCT is a severely ill-posed
inverse problem for which we propose a coarse-to-fine learning strategy.
First, a 3D deep neural network that can extract and exploit the
inter-slice and intra-slice information is adopted to predict the
initial 3D volumes. Then, a 2D deep neural network is utilized to
fine-tune the initial 3D volumes slice-by-slice. During the fine-tuning
stage, a perceptual loss based on multi-frequency features is employed
to enhance the image reconstruction. Dose calculation results from both
photon and proton RT demonstrate that 2V-CBCT provides accuracy
comparable to full-view CBCT based on real projection data.
AU Hou, Qingshan
Wang, Yaqi
Cao, Peng
Cheng, Shuai
Lan, Linqi
Yang, Jinzhu
Liu, Xiaoli
Zaiane, Osmar R.
A Collaborative Self-Supervised Domain Adaptation for Low-Quality
Medical Image Enhancement
Medical image analysis techniques have been employed in diagnosing and
screening clinical diseases. However, both poor medical image quality
and illumination style inconsistency increase uncertainty in clinical
decision-making, potentially resulting in clinician misdiagnosis. The
majority of current image enhancement methods primarily concentrate on
enhancing medical image quality by leveraging high-quality reference
images, which are challenging to collect in clinical applications. In
this study, we address image quality enhancement within a fully
self-supervised learning setting, wherein neither high-quality images
nor paired images are required. To achieve this goal, we investigate the
potential of self-supervised learning combined with domain adaptation to
enhance the quality of medical images without the guidance of
high-quality medical images. We design a Domain Adaptation
Self-supervised Quality Enhancement framework, called DASQE. More
specifically, we establish multiple domains at the patch level through a
designed rule-based quality assessment scheme and style clustering. To
achieve image quality enhancement and maintain style consistency, we
formulate the image quality enhancement as a collaborative
self-supervised domain adaptation task for disentangling the low-quality
factors, medical image content, and illumination style characteristics
by exploring intrinsic supervision in the low-quality medical images.
Finally, we perform extensive experiments on six benchmark datasets of
medical images, and the experimental results demonstrate that DASQE
attains state-of-the-art performance. Furthermore, we explore the impact
of the proposed method on various clinical tasks, such as retinal fundus
vessel/lesion segmentation, nerve fiber segmentation, polyp
segmentation, skin lesion segmentation, and disease classification. The
results demonstrate that DASQE is advantageous for diverse downstream
image analysis tasks.
AU Li, Lei
Camps, Julia
Wang, Zhinuo (Jenny)
Beetz, Marcel
Banerjee, Abhirup
Rodriguez, Blanca
Grau, Vicente
Toward Enabling Cardiac Digital Twins of Myocardial Infarction Using
Deep Computational Models for Inverse Inference
Cardiac digital twins (CDTs) have the potential to offer individualized
evaluation of cardiac function in a non-invasive manner, making them a
promising approach for personalized diagnosis and treatment planning of
myocardial infarction (MI). The inference of accurate myocardial tissue
properties is crucial in creating a reliable CDT of MI. In this work, we
investigate the feasibility of inferring myocardial tissue properties
from the electrocardiogram (ECG) within a CDT platform. The platform
integrates multi-modal data, such as cardiac MRI and ECG, to enhance the
accuracy and reliability of the inferred tissue properties. We perform a
sensitivity analysis based on computer simulations, systematically
exploring the effects of infarct location, size, degree of
transmurality, and electrical activity alteration on the simulated QRS
complex of ECG, to establish the limits of the approach. We subsequently
present a novel deep computational model, comprising a dual-branch
variational autoencoder and an inference model, to infer infarct
location and distribution from the simulated QRS. The proposed model
achieves mean Dice scores of $0.457 \pm 0.317$ and $0.302 \pm 0.273$
for the inference of left ventricle scars and border
zone, respectively. The sensitivity analysis enhances our understanding
of the complex relationship between infarct characteristics and
electrophysiological features. The in silico experimental results show
that the model can effectively capture the relationship for the inverse
inference, with promising potential for clinical application in the
future. The code is available at
https://github.com/lileitech/MI_inverse_inference.
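For readers unfamiliar with the metric reported above, the Dice score between a predicted and a reference binary mask is twice the overlap divided by the sum of the two mask sizes. The sketch below is a generic definition, not code from the linked repository; the function name and the `eps` smoothing term are assumptions.

```python
import numpy as np

def dice_score(pred, target, eps=1e-8):
    """Dice = 2 |A and B| / (|A| + |B|) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps keeps the ratio defined (and equal to 1) when both masks are empty.
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))
```

A score of 1 means perfect overlap and 0 means none, so the reported means of 0.457 and 0.302 indicate partial overlap with substantial case-to-case variation.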
AU Schmidt, Adam
Mohareri, Omid
DiMaio, Simon P.
Salcudean, Septimiu E.
Surgical Tattoos in Infrared: A Dataset for Quantifying Tissue Tracking
and Mapping
Quantifying performance of methods for tracking and mapping tissue in
endoscopic environments is essential for enabling image guidance and
automation of medical interventions and surgery. Datasets developed so
far either use rigid environments, visible markers, or require
annotators to label salient points in videos after collection. These are
respectively: not general, visible to algorithms, or costly and
error-prone. We introduce a novel labeling methodology along with a
dataset that uses said methodology, Surgical Tattoos in Infrared (STIR).
STIR has labels that are persistent but invisible to visible spectrum
algorithms. This is done by labelling tissue points with IR-fluorescent
dye, indocyanine green (ICG), and then collecting visible light video
clips. STIR comprises hundreds of stereo video clips in both in vivo and
ex vivo scenes with start and end points labelled in the IR spectrum.
With over 3,000 labelled points, STIR will help to quantify and enable
better analysis of tracking and mapping methods. After introducing STIR,
we analyze multiple different frame-based tracking methods on STIR using
both 3D and 2D endpoint error and accuracy metrics. STIR is available at
https://dx.doi.org/10.21227/w8g4-g548
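The endpoint error and accuracy metrics mentioned above can be sketched as follows: endpoint error is the Euclidean distance between each tracked point and its labelled end position, and accuracy is the fraction of points within a distance threshold. This is a generic sketch, not the STIR evaluation code; the function names and threshold values are hypothetical, and the same functions apply to 2D (pixel) or 3D (metric) coordinates.

```python
import numpy as np

def endpoint_error(pred, gt):
    """Per-point Euclidean distance between predicted and labelled end points."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return np.linalg.norm(pred - gt, axis=1)

def accuracy_at(pred, gt, thresholds):
    """Fraction of points whose endpoint error is within each threshold."""
    err = endpoint_error(pred, gt)
    return {t: float((err <= t).mean()) for t in thresholds}

# Two tracked points: one exact, one off by a 3-4-5 triangle.
pred = [[0.0, 0.0], [3.0, 4.0]]
gt = [[0.0, 0.0], [0.0, 0.0]]
errors = endpoint_error(pred, gt)            # distances 0 and 5
acc = accuracy_at(pred, gt, thresholds=(4, 8))
```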
AU Shen, Chengkang
Zhu, Hao
Zhou, You
Liu, Yu
Yi, Si
Dong, Lili
Zhao, Weipeng
Brady, David J
Cao, Xun
Ma, Zhan
Lin, Yi
Continuous 3D Myocardial Motion Tracking via Echocardiography.
Myocardial motion tracking stands as an essential clinical tool in the
prevention and detection of cardiovascular diseases (CVDs), the foremost
cause of death globally. However, current techniques suffer from
incomplete and inaccurate motion estimation of the myocardium in both
spatial and temporal dimensions, hindering the early identification of
myocardial dysfunction. To address these challenges, this paper
introduces the Neural Cardiac Motion Field (NeuralCMF). NeuralCMF
leverages implicit neural representation (INR) to model the 3D structure
and the comprehensive 6D forward/backward motion of the heart. This
method surpasses pixel-wise limitations by offering the capability to
continuously query the precise shape and motion of the myocardium at any
specific point throughout the cardiac cycle, enhancing the detailed
analysis of cardiac dynamics beyond traditional speckle tracking.
Notably, NeuralCMF operates without the need for paired datasets, and
its optimization is self-supervised through the physics knowledge priors
in both space and time dimensions, ensuring compatibility with both 2D
and 3D echocardiogram video inputs. Experimental validations across
three representative datasets support the robustness and innovative
nature of the NeuralCMF, marking significant advantages over existing
state-of-the-art methods in cardiac imaging and motion tracking. Code is
available at: https://njuvision.github.io/NeuralCMF.
AU Yang, Wenhui
Gao, Shuo
Zhang, Hao
Yu, Hong
Xu, Menglei
Chong, Puimun
Zhang, Weijie
Wang, Hong
Zhang, Wenjuan
Qian, Airong
PtbNet: Based on Local Few-Shot Classes and Small Objects to accurately
detect PTB.
Pulmonary Tuberculosis (PTB) is one of the world's most infectious
illnesses, and its early detection is critical for preventing PTB.
Digital Radiography (DR) has been the most common and effective
technique to examine PTB. However, due to the variety and weak
specificity of phenotypes on DR chest X-ray (DCR), it is difficult to
make reliable diagnoses for radiologists. Although artificial
intelligence technology has made considerable gains in assisting the
diagnosis of PTB, it lacks methods to identify the lesions of PTB with
few-shot classes and small objects. To solve these problems, geometric
data augmentation was used to increase the size of the DCRs. For this
purpose, a diffusion probability model was implemented for six few-shot
classes. Importantly, we propose a new multi-lesion detector PtbNet
based on RetinaNet, which was constructed to detect small objects of PTB
lesions. The results showed that by two data augmentations, the number
of DCRs increased by 80% from 570 to 2,859. In the pre-evaluation
experiments with the baseline, RetinaNet, the AP improved by 9.9 for six
few-shot classes. Our extensive empirical evaluation showed that the AP
of PtbNet achieved 28.2, outperforming the other 9 state-of-the-art
methods. In the ablation study, combining BiFPN+ and PSPD-Conv
increased AP by 2.1, AP$_s$ by 5.0, and AP$_m$ and AP$_l$ by an average
of 9.8. In summary, PtbNet not only improves the detection of
small-object lesions but also enhances the ability to detect different
types of PTB uniformly, which helps physicians diagnose PTB lesions
accurately. The code is available at
https://github.com/Wenhui-person/PtbNet/tree/master.
EI 1558-254X
DA 2024-06-29
UT MEDLINE:38923480
PM 38923480
ER
AU Ching-Roa, Vincent D.
Huang, Chi Z.
Giacomelli, Michael G.
Suppression of Subpixel Jitter in Resonant Scanning Systems With
Phase-locked Sampling
Resonant scanning is critical to high speed and in vivo imaging in many
applications of laser scanning microscopy. However, resonant scanning
suffers from well-known image artifacts due to scanner jitter, limiting
adoption of high-speed imaging technologies. Here, we introduce a
real-time, inexpensive and all electrical method to suppress jitter more
than an order of magnitude below the diffraction limit that can be
applied to most existing microscope systems with no software changes. By
phase-locking imaging to the resonant scanner period, we demonstrate an
86% reduction in pixel jitter, a 15% improvement in point spread
function with resonant scanning and show that this approach enables two
widely used models of resonant scanners to achieve comparable accuracy
to galvanometer scanners running two orders of magnitude slower.
Finally, we demonstrate the versatility of this method by retrofitting a
commercial two photon microscope and show that this approach enables
significant quantitative and qualitative improvements in biological
imaging.
AU Wang, Pengyu
Zhang, Huaqi
Zhu, Meilu
Jiang, Xi
Qin, Jing
Yuan, Yixuan
MGIML: Cancer Grading With Incomplete Radiology-Pathology Data via
Memory Learning and Gradient Homogenization
Taking advantage of multi-modal radiology-pathology data with
complementary clinical information for cancer grading is helpful for
doctors to improve diagnosis efficiency and accuracy. However, radiology
and pathology data have distinct acquisition difficulties and costs,
which leads to incomplete-modality data being common in applications. In
this work, we propose a Memory- and Gradient-guided Incomplete
Multi-modal Learning (MGIML) framework for cancer grading with
incomplete radiology-pathology data. Firstly, to remedy missing-modality
information, we propose a Memory-driven Hetero-modality Complement
(MH-Complete) scheme, which constructs modal-specific memory banks
constrained by a coarse-grained memory boosting (CMB) loss to record
generic radiology and pathology feature patterns, and develops a
cross-modal memory reading strategy enhanced by a fine-grained memory
consistency (FMC) loss to take missing-modality information from
well-stored memories. Secondly, as gradient conflicts exist between
missing-modality situations, we propose a Rotation-driven Gradient
Homogenization (RG-Homogenize) scheme, which estimates instance-specific
rotation matrices to smoothly change the feature-level gradient
directions, and computes confidence-guided homogenization weights to
dynamically balance gradient magnitudes. By simultaneously mitigating
gradient direction and magnitude conflicts, this scheme well avoids the
negative transfer and optimization imbalance problems. Extensive
experiments on CPTAC-UCEC and CPTAC-PDA datasets show that the proposed
MGIML framework performs favorably against state-of-the-art multi-modal
methods on missing-modality situations.
AU Huang, Wei
Zhang, Lei
Wang, Zizhou
Wang, Lituan
Exploring Inherent Consistency for Semi-supervised Anatomical Structure
Segmentation in Medical Imaging.
Due to the exorbitant expense of obtaining labeled data in the field of
medical image analysis, semi-supervised learning has emerged as a
favorable method for the segmentation of anatomical structures. Although
semi-supervised learning techniques have shown great potential in this
field, existing methods only utilize image-level spatial consistency to
impose unsupervised regularization on data in label space. Considering
that anatomical structures often possess inherent anatomical properties
that have not been focused on in previous works, this study introduces
the inherent consistency into semi-supervised anatomical structure
segmentation. First, the prediction and the ground-truth are projected
into an embedding space to obtain latent representations that
encapsulate the inherent anatomical properties of the structures. Then,
two inherent consistency constraints are designed to leverage these
inherent properties by aligning these latent representations. The
proposed method is plug-and-play and can be seamlessly integrated with
existing methods, thereby collaborating to improve segmentation
performance and enhance the anatomical plausibility of the results. To
evaluate the effectiveness of the proposed method, experiments are
conducted on three public datasets (ACDC, LA, and Pancreas). Extensive
experimental results demonstrate that the proposed method exhibits good
generalizability and outperforms several state-of-the-art methods.
AU Fu, Wenli
Hu, Huijun
Li, Xinyue
Guo, Rui
Chen, Tao
Qian, Xiaohua
A Generalizable Causal-Invariance-Driven Segmentation Model for
Peripancreatic Vessels.
Segmenting peripancreatic vessels in CT, including the superior
mesenteric artery (SMA), the coeliac artery (CA), and the partial portal
venous system (PPVS), is crucial for preoperative resectability analysis
in pancreatic cancer. However, the clinical applicability of vessel
segmentation methods is impeded by the low generalizability on
multi-center data, mainly attributed to the wide variations in image
appearance, namely the spurious correlation factor. Therefore, we
propose a causal-invariance-driven generalizable segmentation model for
peripancreatic vessels. It incorporates interventions at both image and
feature levels to guide the model to capture causal information by
enforcing consistency across datasets, thus enhancing the generalization
performance. Specifically, a contrast-driven image intervention
strategy is first proposed to construct image-level interventions by
generating images with various contrast-related appearances and seeking
invariant causal features. Second, a feature intervention strategy
is designed, where various patterns of feature bias across different
centers are simulated to pursue invariant prediction. The proposed model
achieved high DSC scores (79.69%, 82.62%, and 83.10%) for the three
vessels on a cross-validation set containing 134 cases. Its
generalizability was further confirmed on three independent test sets of
233 cases. Overall, the proposed method provides an accurate and
generalizable segmentation model for peripancreatic vessels and offers a
promising paradigm for increasing the generalizability of segmentation
models from a causality perspective. Our source code will be released
at https://github.com/SJTUBME-QianLab/PC_VesselSeg.
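The DSC values reported above are Dice similarity coefficients. A minimal sketch of how such a score can be computed for one pair of binary masks follows; the function name `dice_score`, the flat 0/1 mask representation, and the convention for two empty masks are illustrative assumptions and are not taken from the authors' code.

```python
def dice_score(pred, truth):
    """Dice similarity coefficient between two binary masks (flat 0/1 sequences)."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0]
    reference = [1, 0, 0, 0, 1, 1]
    print(f"DSC = {dice_score(predicted, reference):.4f}")
```

A per-vessel score such as those quoted in the abstract would typically be the mean of this quantity over all cases, although the exact averaging used by the authors is not stated here.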
AU Wang, Jinhong
Xu, Zhe
Zheng, Wenhao
Ying, Haochao
Chen, Tingting
Liu, Zuozhu
Chen, Danny Z.
Yao, Ke
Wu, Jian
A Transformer-Based Knowledge Distillation Network for Cortical Cataract
Grading
Cortical cataract, a common type of cataract, is particularly difficult
to diagnose automatically due to the complex features of the
lesions. Recently, many methods based on edge detection or deep learning
were proposed for automatic cataract grading. However, these methods
suffer a large performance drop in cortical cataract grading due to the
more complex cortical opacities and uncertain data. In this paper, we
propose a novel Transformer-based Knowledge Distillation Network, called
TKD-Net, for cortical cataract grading. To tackle the complex opacity
problem, we first devise a zone decomposition strategy to extract more
refined features and introduce special sub-scores to consider critical
factors of clinical cortical opacity assessment (location, area,
density) for comprehensive quantification. Next, we develop a
multi-modal mix-attention Transformer to efficiently fuse sub-scores and
image modality for complex feature learning. However, obtaining the
sub-score modality is a challenge in the clinic, which can in turn cause
a missing-modality problem. To simultaneously alleviate the issues
of missing modalities and uncertain data, we further design a
Transformer-based knowledge distillation method, which uses a teacher
model with perfect data to guide a student model with modality-missing
and uncertain data. We conduct extensive experiments on a dataset of
commonly-used slit-lamp images annotated by the LOCS III grading system
to demonstrate that our TKD-Net outperforms state-of-the-art methods, as
well as the effectiveness of its key components.
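The abstract does not specify the distillation objective used in TKD-Net. As a rough illustration of the general teacher-student idea only, the sketch below implements the widely used temperature-softened KL distillation loss; the function names, the temperature value, and the choice of this particular loss are assumptions rather than the authors' method.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]   # hypothetical grade logits from a teacher
    student = [1.0, 0.8, -0.2]   # hypothetical grade logits from a student
    print(distillation_loss(student, teacher))
```

In practice this term is usually combined with an ordinary supervised loss on the student's own labels.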
AU Vu, Tri
Klippel, Paul
Canning, Aidan J.
Ma, Chenshuo
Zhang, Huijuan
Kasatkina, Ludmila A.
Tang, Yuqi
Xia, Jun
Verkhusha, Vladislav V.
Vo-Dinh, Tuan
Jing, Yun
Yao, Junjie
On the Importance of Low-Frequency Signals in Functional and Molecular
Photoacoustic Computed Tomography
In photoacoustic computed tomography (PACT) with short-pulsed laser
excitation, wideband acoustic signals are generated in biological
tissues with frequencies related to the effective shapes and sizes of
the optically absorbing targets. Low-frequency photoacoustic signal
components correspond to slowly varying spatial features and are often
omitted during imaging due to the limited detection bandwidth of the
ultrasound transducer, or during image reconstruction as undesired
background that degrades image contrast. Here we demonstrate that
low-frequency photoacoustic signals, in fact, contain functional and
molecular information, and can be used to enhance structural visibility,
improve quantitative accuracy, and reduce sparse-sampling artifacts. We
provide an in-depth theoretical analysis of low-frequency signals in
PACT, and experimentally evaluate their impact on several representative
PACT applications, such as mapping temperature in photothermal
treatment, measuring blood oxygenation in a hypoxia challenge, and
detecting photoswitchable molecular probes in deep organs. Our results
strongly suggest that low-frequency signals are important for functional
and molecular PACT.
AU Xie, Jiaming
Zhang, Qing
Cui, Zhiming
Ma, Chong
Zhou, Yan
Wang, Wenping
Shen, Dinggang
Integrating Eye Tracking with Grouped Fusion Networks for Semantic
Segmentation on Mammogram Images.
Medical image segmentation has seen great progress in recent years,
largely due to the development of deep neural networks. However, unlike
in computer vision, high-quality clinical data is relatively scarce, and
the annotation process is often a burden for clinicians. As a result,
the scarcity of medical data limits the performance of existing medical
image segmentation models. In this paper, we propose a novel framework
that integrates eye tracking information from experienced radiologists
during the screening process to improve the performance of deep neural
networks with limited data. Our approach, a grouped hierarchical
network, guides the network to learn from its faults by using gaze
information as weak supervision. We demonstrate the effectiveness of our
framework on mammogram images, particularly for handling segmentation
classes with large scale differences. We evaluate the impact of gaze
information on medical image segmentation tasks and show that our method
achieves better segmentation performance compared to state-of-the-art
models. A robustness study is conducted to investigate the influence of
distraction or inaccuracies in gaze collection. We also develop a
convenient system for collecting gaze data without interrupting the
normal clinical workflow. Our work offers novel insights into the
potential benefits of integrating gaze information into medical image
segmentation tasks.
AU Gao, Bin
Yu, Aiju
Qiao, Chen
Calhoun, Vince D
Stephen, Julia M
Wilson, Tony W
Wang, Yu-Ping
An Explainable Unified Framework of Spatio-Temporal Coupling Learning
with Application to Dynamic Brain Functional Connectivity Analysis.
Time-series data such as fMRI and MEG carry a wealth of inherent
spatio-temporal coupling relationships, and modeling them via deep
learning is essential for uncovering biological mechanisms. However,
current machine learning models for mining spatio-temporal information
usually overlook this intrinsic coupling and also suffer from
poor explainability. In this paper, we present an explainable learning
framework for spatio-temporal coupling. Specifically, this framework
constructs a deep learning network based on spatio-temporal correlation,
which can well integrate the time-varying coupled relationships between
node representation and inter-node connectivity. Furthermore, it
explores spatio-temporal evolution at each time step, providing a better
explainability of the analysis results. Finally, we apply the proposed
framework to brain dynamic functional connectivity (dFC) analysis.
Experimental results demonstrate that it can effectively capture the
variations in dFC during brain development and the evolution of
spatio-temporal information at the resting state. Two distinct
developmental functional connectivity (FC) patterns are identified.
Specifically, the connectivity among regions related to emotional
regulation decreases, while the connectivity associated with cognitive
activities increases. In addition, children and young adults display
notable cyclic fluctuations in resting-state brain dFC.
AU Huang, Wenhao
Gong, Haifan
Zhang, Huan
Wang, Yu
Wan, Xiang
Li, Guanbin
Li, Haofeng
Shen, Hong
BCNet: Bronchus Classification via Structure Guided Representation
Learning.
CT-based bronchial tree analysis is a key step for the diagnosis of lung
and airway diseases. However, the topology of bronchial trees varies
across individuals, which presents a challenge to the automatic bronchus
classification. To solve this issue, we propose the Bronchus
Classification Network (BCNet), a structure-guided framework that
exploits the segment-level topological information using point clouds to
learn the voxel-level features. BCNet has two branches, a Point-Voxel
Graph Neural Network (PV-GNN) for segment classification, and a
Convolutional Neural Network (CNN) for voxel labeling. The two branches
are simultaneously trained to learn topology-aware features for their
shared backbone while it is feasible to run only the CNN branch for the
inference. Therefore, BCNet maintains the same inference efficiency as
its CNN baseline. Experimental results show that BCNet significantly
exceeds state-of-the-art methods by over 8.0% on F1-score for
bronchus classification. Furthermore, we contribute BronAtlas: an
open-access benchmark of bronchus imaging analysis with high-quality
voxel-wise annotations of both anatomical and abnormal bronchial
segments. The benchmark is available at link1.
AU Caravaca, Javier
Bobba, Kondapa Naidu
Du, Shixian
Peter, Robin
Gullberg, Grant T.
Bidkar, Anil P.
Flavell, Robert R.
Seo, Youngho
A Technique to Quantify Very Low Activities in Regions of Interest With
a Collimatorless Detector
We present a new method to measure sub-microcurie activities of
photon-emitting radionuclides in organs and lesions of small animals in
vivo. Our technique, named the collimator-less likelihood fit, combines
a very high sensitivity collimatorless detector with a Monte Carlo-based
likelihood fit in order to estimate the activities in previously
segmented regions of interest along with their uncertainties. This is
done directly from the photon projections in our collimatorless detector
and from the region of interest segmentation provided by an x-ray
computed tomography scan. We have extensively validated our approach
with Ac-225 experimentally in spherical phantoms and mouse phantoms, and
also numerically with simulations of a realistic mouse anatomy. Our
method yields statistically unbiased results with uncertainties smaller
than 20% for activities as low as ~111 Bq (3 nCi) and for
exposures under 30 minutes. We demonstrate that our method yields more
robust recovery coefficients when compared to SPECT imaging with a
commercial pre-clinical scanner, especially at very low activities. Thus,
our technique is complementary to traditional SPECT/CT imaging since it
provides more accurate and precise organ and tumor dosimetry, albeit with
more limited spatial information. Finally, our technique is especially
significant in extremely low-activity scenarios when SPECT/CT imaging is
simply not viable.
AU Cui, Hengfei
Li, Yan
Wang, Yifan
Xu, Di
Wu, Lian-Ming
Xia, Yong
Toward Accurate Cardiac MRI Segmentation With Variational
Autoencoder-Based Unsupervised Domain Adaptation
Accurate myocardial segmentation is crucial in the diagnosis and
treatment of myocardial infarction (MI), especially in Late Gadolinium
Enhancement (LGE) cardiac magnetic resonance (CMR) images, where the
infarcted myocardium exhibits a greater brightness. However,
segmentation annotations for LGE images are usually not available.
Although knowledge gained from CMR images of other modalities with ample
annotations, such as balanced-Steady State Free Precession (bSSFP), can
be transferred to the LGE images, the difference in image distribution
between the two modalities (i.e., domain shift) usually results in a
significant degradation in model performance. To alleviate this, an
end-to-end Variational autoencoder based feature Alignment Module
Combining Explicit and Implicit features (VAMCEI) is proposed. We first
re-derive the Kullback-Leibler (KL) divergence between the posterior
distributions of the two domains as a measure of the global distribution
distance. Second, we calculate the prototype contrastive loss between
the two domains, bringing closer the prototypes of the same category
across domains and pushing away the prototypes of different categories
within or across domains. Finally, a domain discriminator is added to
the output space, which indirectly aligns the feature distribution and
forces the extracted features to be more favorable for segmentation. In
addition, by combining CycleGAN and VAMCEI, we propose a more refined
multi-stage unsupervised domain adaptation (UDA) framework for
myocardial structure segmentation. We conduct extensive experiments on
the MSCMRSeg 2019, MyoPS 2020 and MM-WHS 2017 datasets. The experimental
results demonstrate that our framework outperforms
state-of-the-art methods.
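The abstract mentions a KL divergence between the posterior distributions of the two domains but does not give the derived expression. As a hedged illustration, the sketch below evaluates the standard closed-form KL divergence between two diagonal Gaussians, which is the usual posterior family in a variational autoencoder; treating the posteriors as diagonal Gaussians, and the function and parameter names, are assumptions and not the authors' re-derived formula.

```python
import math


def kl_diag_gaussians(mu_p, sigma_p, mu_q, sigma_q):
    """KL(p || q) for diagonal Gaussians p = N(mu_p, sigma_p^2), q = N(mu_q, sigma_q^2)."""
    if not (len(mu_p) == len(sigma_p) == len(mu_q) == len(sigma_q)):
        raise ValueError("all parameter vectors must have the same length")
    kl = 0.0
    for mp, sp, mq, sq in zip(mu_p, sigma_p, mu_q, sigma_q):
        if sp <= 0.0 or sq <= 0.0:
            raise ValueError("standard deviations must be positive")
        # Per-dimension closed form: log(sq/sp) + (sp^2 + (mp-mq)^2) / (2 sq^2) - 1/2
        kl += math.log(sq / sp) + (sp ** 2 + (mp - mq) ** 2) / (2.0 * sq ** 2) - 0.5
    return kl


if __name__ == "__main__":
    # Hypothetical latent statistics for a source and a target domain.
    source_mu, source_sigma = [0.0, 0.5], [1.0, 0.8]
    target_mu, target_sigma = [0.2, 0.1], [1.1, 1.0]
    print(kl_diag_gaussians(source_mu, source_sigma, target_mu, target_sigma))
```

Note that this quantity is asymmetric, so the direction in which the two domains are compared matters.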
AU Wang, Zihao
Yang, Yingyu
Chen, Yuzhou
Yuan, Tingting
Sermesant, Maxime
Delingette, Herve
Wu, Ona
Mutual Information Guided Diffusion for Zero-Shot Cross-Modality Medical
Image Translation
Cross-modality data translation has attracted great interest in medical
image computing. Deep generative models show performance improvement in
addressing related challenges. Nevertheless, as a fundamental challenge
in image translation, the problem of zero-shot learning cross-modality
image translation with fidelity remains unanswered. To bridge this gap,
we propose a novel unsupervised zero-shot learning method called Mutual
Information guided Diffusion Model, which learns to translate an unseen
source image to the target modality by leveraging the inherent
statistical consistency of Mutual Information between different
modalities. To overcome the prohibitively high-dimensional Mutual
Information calculation, we propose a differentiable local-wise mutual
information layer for conditioning the iterative denoising process. The
Local-wise-Mutual-Information-Layer captures identical cross-modality
features in the statistical domain, offering diffusion guidance without
relying on direct mappings between the source and target domains. This
advantage allows our method to adapt to changing source domains without
the need for retraining, making it highly practical when sufficient
labeled source domain data is not available. We demonstrate the superior
performance of MIDiffusion in zero-shot cross-modality translation tasks
through empirical comparisons with other generative models, including
adversarial-based and diffusion-based models. Finally, we showcase the
real-world application of MIDiffusion in 3D zero-shot learning-based
cross-modality image segmentation tasks.
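For readers unfamiliar with the quantity that guides the diffusion process, the sketch below estimates mutual information between two intensity samples from a joint histogram. It is a plain, non-differentiable illustration under stated assumptions: the function name, the bin count, the [0, 1] intensity range, and the histogram estimator itself are hypothetical choices and do not reproduce the authors' differentiable local-wise layer.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=8, lo=0.0, hi=1.0):
    """Histogram estimate of mutual information (in nats) between paired samples."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")

    def bin_index(value):
        # Map an intensity to a bin, clamping values outside [lo, hi].
        index = int((value - lo) / (hi - lo) * bins)
        return min(max(index, 0), bins - 1)

    pairs = [(bin_index(x), bin_index(y)) for x, y in zip(a, b)]
    n = len(pairs)
    joint = Counter(pairs)
    marginal_a = Counter(i for i, _ in pairs)
    marginal_b = Counter(j for _, j in pairs)

    # Sum of p(i,j) * log( p(i,j) / (p(i) p(j)) ) over occupied bins.
    return sum(
        (count / n) * math.log(count * n / (marginal_a[i] * marginal_b[j]))
        for (i, j), count in joint.items()
    )


if __name__ == "__main__":
    patch_source = [0.1, 0.1, 0.9, 0.9]
    patch_target = [0.9, 0.9, 0.1, 0.1]  # inverted contrast, same structure
    print(mutual_information(patch_source, patch_target))
```

The example pair has inverted contrast yet identical structure, which is the kind of cross-modality relationship that mutual information rewards and that a direct intensity difference would penalize. A local-wise variant would apply the same estimate to small patches rather than to whole images.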
AU Yang, Kun
Li, Qiang
Xu, Jiahong
Tang, Meng-Xing
Wang, Zhibiao
Tsui, Po-Hsiang
Zhou, Xiaowei
Frequency-Domain Robust PCA for Real-Time Monitoring of HIFU Treatment
High intensity focused ultrasound (HIFU) is a thriving non-invasive
technique for thermal ablation of tumors, but significant challenges
remain in its real-time monitoring with medical imaging. Ultrasound
imaging is one of the main imaging modalities for monitoring HIFU
surgery in organs other than the brain, mainly due to its good temporal
resolution. However, strong acoustic interference from HIFU irradiation
severely obscures the B-mode images and compromises the monitoring. To
address this problem, we proposed a frequency-domain robust principal
component analysis (FRPCA) method to separate the HIFU interference from
the contaminated B-mode images. Ex-vivo and in-vivo experiments were
conducted to validate the proposed method based on a clinical HIFU
therapy system combined with an ultrasound imaging platform. The
performance of the FRPCA method was compared with the conventional notch
filtering method. Results demonstrated that the FRPCA method can
effectively remove HIFU interference from the B-mode images, which
allowed HIFU-induced grayscale changes at the focal region to be
recovered. Compared to notch-filtered images, the FRPCA-processed images
showed an 8.9% improvement in terms of the structural similarity (SSIM)
index to the uncontaminated B-mode images. These findings demonstrate
that the FRPCA method presents an effective signal processing framework
to remove the strong HIFU acoustic interference, obtains better dynamic
visualization in monitoring the HIFU irradiation process, and offers
great potential to improve the efficacy and safety of HIFU treatment and
other focused ultrasound related applications.
AU Mahapatra, Dwarikanath
Yepes, Antonio Jimeno
Bozorgtabar, Behzad
Roy, Sudipta
Ge, Zongyuan
Reyes, Mauricio
Multi-Label Generalized Zero Shot Chest Xray Classification By Combining
Image-Text Information With Feature Disentanglement.
In fully supervised learning-based medical image classification, the
robustness of a trained model is influenced by its exposure to the range
of candidate disease classes. Generalized Zero Shot Learning (GZSL) aims
to correctly predict seen and novel unseen classes. Current GZSL
approaches have focused mostly on the single-label case. However, it is
common for chest X-rays to be labelled with multiple disease classes. We
propose a novel multi-modal multi-label GZSL approach that leverages
feature disentanglement and multi-modal information to synthesize
features of unseen classes. Disease labels are processed through a
pre-trained BioBERT model to obtain text embeddings that are used to
create a dictionary encoding similarity among different labels. We then
use disentangled features and graph aggregation to learn a second
dictionary of inter-label similarities. A subsequent clustering step
helps to identify representative vectors for each class. The multi-modal
multi-label dictionaries and the class representative vectors are used
to guide the feature synthesis step, which is the most important
component of our pipeline, for generating realistic multi-label disease
samples of seen and unseen classes. Our method is benchmarked against
multiple competing methods and we outperform all of them based on
experiments conducted on the publicly available NIH and CheXpert chest
X-ray datasets.
AU Yin, Yi
Clark, Alys R.
Collins, Sally L.
3D Single Vessel Fractional Moving Blood Volume (3D-svFMBV): Fully
Automated Tissue Perfusion Estimation Using Ultrasound
Power Doppler ultrasound (PD-US) is the ideal modality to assess tissue
perfusion as it is cheap, patient-friendly and does not require ionizing
radiation. However, meaningful inter-patient comparison is only possible if
differences in tissue-attenuation are corrected for. This can be done by
standardizing the PD-US signal to a blood vessel assumed to have 100%
vascularity. The original method to do this is called fractional moving
blood volume (FMBV). We describe a novel, fully-automated method
combining image processing, numerical modelling, and deep learning to
estimate three-dimensional single vessel fractional moving blood volume
(3D-svFMBV). We map the PD signals to a characteristic intensity profile
within a single large vessel to define the standardization value at the
high shear vessel margins. This removes the need for mathematical
correction for background signal which can introduce error. The
3D-svFMBV was first tested on synthetic images generated using the
characteristics of uterine artery and physiological ultrasound noise
levels, demonstrating prediction of standardization value close to the
theoretical ideal. Clinical utility was explored using 143
first-trimester placental ultrasound volumes. More biologically
plausible perfusion estimates were obtained, showing improved prediction
of pre-eclampsia compared with those generated with the semi-automated
original 3D-FMBV technique. The proposed 3D-svFMBV method overcomes the
limitations of the original technique to provide accurate and robust
placental perfusion estimation. This not only has the potential to
provide an early pregnancy screening tool but may also be used to assess
perfusion of different organs and tumors.
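The standardization step described above can be illustrated with a short sketch: each power Doppler value is divided by a reference value taken from a vessel assumed to have 100% vascularity, and the result is averaged over the region of interest. The function name, the capping of values above the reference at 100%, and the sample numbers are assumptions for illustration; how the 3D-svFMBV method derives the standardization value itself is not reproduced here.

```python
def fmbv_percent(pd_values, standardization_value):
    """Mean power Doppler signal as a percentage of a 100%-vascularity reference."""
    if standardization_value <= 0:
        raise ValueError("standardization value must be positive")
    if not pd_values:
        raise ValueError("region of interest is empty")
    # Assumed convention: voxels at or above the reference count as 100% blood.
    normalized = [min(v / standardization_value, 1.0) for v in pd_values]
    return 100.0 * sum(normalized) / len(normalized)


if __name__ == "__main__":
    roi_signal = [50.0, 100.0, 200.0, 0.0]  # hypothetical PD intensities
    print(fmbv_percent(roi_signal, standardization_value=100.0))
```

Because every patient's signal is expressed relative to that patient's own reference vessel, differences in tissue attenuation cancel out to first order, which is what makes inter-patient comparison meaningful.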
AU Wang, Pengyu
Zhang, Huaqi
Yuan, Yixuan
MCPL: Multi-modal Collaborative Prompt Learning for Medical
Vision-Language Model.
Multi-modal prompt learning is a high-performance and cost-effective
learning paradigm, which learns text as well as image prompts to tune
pre-trained vision-language (V-L) models like CLIP for adaptation to multiple
downstream tasks. However, recent methods typically treat text and image
prompts as independent components without considering the dependency
between prompts. Moreover, extending multi-modal prompt learning into
the medical field poses challenges due to a significant gap between
general- and medical-domain data. To this end, we propose a Multi-modal
Collaborative Prompt Learning (MCPL) pipeline to tune a frozen V-L model
for aligning medical text-image representations, thereby supporting
medical downstream tasks. We first construct the anatomy-pathology (AP)
prompt for multi-modal prompting jointly with text and image prompts.
The AP prompt introduces instance-level anatomy and pathology
information, thereby making a V-L model better comprehend medical
reports and images. Next, we propose a graph-guided prompt collaboration
module (GPCM), which explicitly establishes multi-way couplings between
the AP, text, and image prompts, enabling collaborative multi-modal
prompt producing and updating for more effective prompting. Finally, we
develop a novel prompt configuration scheme, which attaches the AP
prompt to the query and key, and the text/image prompt to the value in
self-attention layers for improving the interpretability of multi-modal
prompts. Extensive experiments on numerous medical classification and
object detection datasets show that the proposed pipeline achieves
excellent effectiveness and generalization. Compared with
state-of-the-art prompt learning methods, MCPL provides a more reliable
multi-modal prompt paradigm for reducing tuning costs of V-L models on
medical downstream tasks. Our code is available at
https://github.com/CUHK-AIM-Group/MCPL.
AU Mei, Xin
Yang, Libin
Gao, Denghong
Cai, Xiaoyan
Han, Junwei
Liu, Tianming
PhraseAug: An Augmented Medical Report Generation Model with Phrasebook.
Medical report generation is a valuable and challenging task, which
automatically generates accurate and fluent diagnostic reports for
medical images, reducing the workload of radiologists and improving
the efficiency of disease diagnosis. Fine-grained alignment of medical
images and reports facilitates the exploration of close correlations
between images and texts, which is crucial for cross-modal generation.
However, visual and linguistic biases caused by radiologists' writing
styles make cross-modal image-text alignment difficult. To alleviate
visual-linguistic bias, this paper discretizes medical reports and
introduces an intermediate modality, i.e., a phrasebook, consisting of
key noun phrases. As a discretized representation of medical reports,
the phrasebook contains both disease-related medical terms and
synonymous phrases representing different writing styles, which can
identify synonymous sentences, thereby promoting fine-grained alignment
between
images and reports. In this paper, an augmented two-stage medical report
generation model with phrasebook (PhraseAug) is developed, which
combines medical images, clinical histories and writing styles to
generate diagnostic reports. In the first stage, the phrasebook is used
to extract semantically relevant important features and predict the key
phrases contained in the report. In the second stage, medical reports
are generated according to the predicted key phrases, which contain
synonymous phrases, helping our model adapt to different writing
styles and generate diverse medical reports. Experimental results on
two public datasets, IU-Xray and MIMIC-CXR, demonstrate that our
proposed PhraseAug outperforms state-of-the-art baselines.
AU Wang, Enpeng
Liu, Yueang
Tu, Puxun
Taylor, Zeike A
Chen, Xiaojun
Video-based Soft Tissue Deformation Tracking for Laparoscopic Augmented
Reality-based Navigation in Kidney Surgery.
Minimally invasive surgery (MIS) remains technically demanding due to
the difficulty of tracking hidden critical structures within the moving
anatomy of the patient. In this study, we propose a soft tissue
deformation tracking augmented reality (AR) navigation pipeline for
laparoscopic surgery of the kidneys. The proposed navigation pipeline
addresses two main sub-problems: the initial registration and
deformation tracking. Our method utilizes preoperative MR or CT data and
binocular laparoscopes without any additional interventional hardware.
The initial registration is resolved through a probabilistic rigid
registration algorithm and elastic compensation based on dense point
cloud reconstruction. For deformation tracking, the sparse feature point
displacement vector field continuously provides temporal boundary
conditions for the biomechanical model. To enhance the accuracy of the
displacement vector field, a novel feature point selection strategy
based on deep learning is proposed. Moreover, an ex-vivo experimental
method for assessing the error of internal structures is presented. The
ex-vivo experiments indicate an external surface reprojection error of
4.07 ± 2.17 mm and a maximum mean absolute error for internal
structures of 2.98 mm. In-vivo experiments indicate mean absolute errors
of 3.28 ± 0.40 mm and 1.90 ± 0.24 mm, respectively. The combined
qualitative and quantitative findings indicate the potential of our
AR-assisted
navigation system in improving the clinical application of laparoscopic
kidney surgery.
AU Bi, Xia-An
Yang, Zicheng
Huang, Yangjun
Chen, Ke
Xing, Zhaoxu
Xu, Luyun
Wu, Zihao
Liu, Zhengliang
Li, Xiang
Liu, Tianming
CE-GAN: Community Evolutionary Generative Adversarial Network for
Alzheimer's Disease Risk Prediction.
In the studies of neurodegenerative diseases such as Alzheimer's Disease
(AD), researchers often focus on the associations among multi-omics
pathogeny based on imaging genetics data. However, current studies
overlook the communities in brain networks, leading to inaccurate models
of disease development. This paper explores the developmental patterns
of AD from the perspective of community evolution. We first establish a
mathematical model to describe functional degeneration in the brain as
the community evolution driven by entropy information propagation. Next,
we propose an interpretable Community Evolutionary Generative
Adversarial Network (CE-GAN) to predict disease risk. In the generator
of CE-GAN, community evolutionary convolutions are designed to capture
the evolutionary patterns of AD. The experiments are conducted using
functional magnetic resonance imaging (fMRI) data and single nucleotide
polymorphism (SNP) data. CE-GAN achieves 91.67% accuracy and 91.83% area
under curve (AUC) in AD risk prediction tasks, surpassing advanced
methods on the same dataset. In addition, we validated the effectiveness
of CE-GAN for pathogeny extraction. The source code of this work is
available at https://github.com/fmri123456/CE-GAN.
AU Kang, Eunsong
Heo, Da-Woon
Lee, Jiwon
Suk, Heung-Il
A Learnable Counter-Condition Analysis Framework for Functional
Connectivity-Based Neurological Disorder Diagnosis
To understand the biological characteristics of neurological disorders
with functional connectivity (FC), recent studies have widely utilized
deep learning-based models to identify the disease and conducted
post-hoc analyses via explainable models to discover disease-related
biomarkers. Most existing frameworks consist of three stages, namely,
feature selection, feature extraction for classification, and analysis,
where each stage is implemented separately. However, if the results at
any stage lack reliability, this can cause misdiagnosis and incorrect
analysis in subsequent stages. In this study, we propose a novel unified
framework that systematically integrates diagnoses (i.e., feature
selection and feature extraction) and explanations. Notably, we devised
an adaptive attention network as a feature selection approach to
identify individual-specific disease-related connections. We also
propose a functional network relational encoder that summarizes the
global topological properties of FC by learning the inter-network
relations without pre-defined edges between functional networks. Last
but not least, our framework provides novel explanatory power for
neuroscientific interpretation, also termed counter-condition analysis.
We simulated the FC that reverses the diagnostic information (i.e.,
counter-condition FC): converting a normal brain to be abnormal and vice
versa. We validated the effectiveness of our framework by using two
large resting-state functional magnetic resonance imaging (fMRI)
datasets, Autism Brain Imaging Data Exchange (ABIDE) and REST-meta-MDD,
and demonstrated that our framework outperforms other competing methods
for disease identification. Furthermore, we analyzed the disease-related
neurological patterns based on counter-condition analysis.
AU Gao, Cong
Feng, Anqi
Liu, Xingtong
Taylor, Russell H.
Armand, Mehran
Unberath, Mathias
A Fully Differentiable Framework for 2D/3D Registration and the
Projective Spatial Transformers
Image-based 2D/3D registration is a critical technique for
fluoroscopy-guided surgical interventions. Conventional intensity-based
2D/3D
registration approaches suffer from a limited capture range due to the
presence of local minima in hand-crafted image similarity functions. In
this work, we aim to extend the 2D/3D registration capture range with a
fully differentiable deep network framework that learns to approximate a
convex-shape similarity function. The network uses a novel Projective
Spatial Transformer (ProST) module that has unique differentiability
with respect to 3D pose parameters, and is trained using an innovative
double backward gradient-driven loss function. We compare the most
popular learning-based pose regression methods in the literature and use
the well-established CMAES intensity-based registration as a benchmark.
We report registration pose error, target registration error (TRE) and
success rate (SR) with a threshold of 10 mm for mean TRE. For the pelvis
anatomy, the median TRE of ProST followed by CMAES is 4.4 mm with an SR
of 65.6% in simulation, and 2.2 mm with an SR of 73.2% in real data. The
CMAES SRs without using ProST registration are 28.5% and 36.0% in
simulation and real data, respectively. Our results suggest that the
proposed ProST network learns a practical similarity function, which
vastly extends the capture range of conventional intensity-based 2D/3D
registration. We believe that the unique differentiable property of
ProST has the potential to benefit related 3D medical imaging research
applications. The source code is available at
https://github.com/gaocong13/Projective-Spatial-Transformers.
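The evaluation metrics named in this abstract (mean target registration error and a success rate with a 10 mm threshold) can be sketched as follows. This is a generic illustration, not the authors' evaluation script: the helper names, the use of 4x4 homogeneous rigid transforms, and the strict "less than" comparison against the threshold are assumptions.

```python
import numpy as np

def rigid_transform(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

def mean_tre(landmarks, T_est, T_gt):
    """Mean Euclidean distance (same unit as the landmarks, e.g. mm) between
    landmarks mapped by the estimated pose and by the ground-truth pose."""
    pts = np.c_[landmarks, np.ones(len(landmarks))]  # (N, 4) homogeneous
    est = (T_est @ pts.T).T[:, :3]
    gt = (T_gt @ pts.T).T[:, :3]
    return float(np.linalg.norm(est - gt, axis=1).mean())

def success_rate(mean_tres, threshold_mm=10.0):
    """Fraction of registrations whose mean TRE is below the threshold
    (strict inequality is an assumption of this sketch)."""
    mean_tres = np.asarray(mean_tres, dtype=float)
    return float((mean_tres < threshold_mm).mean())
```

For example, an estimated pose that differs from the ground truth by a pure translation of (3, 4, 0) mm gives a mean TRE of 5 mm for any landmark set, which would count as a success under the 10 mm threshold.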
AU Sun, Kaicong
Wang, Qian
Shen, Dinggang
Joint Cross-Attention Network With Deep Modality Prior for Fast MRI
Reconstruction
Current deep learning-based reconstruction models for accelerated
multi-coil magnetic resonance imaging (MRI) mainly focus on subsampled
k-space data of a single modality using convolutional neural networks
(CNNs). Although dual-domain information and data consistency
constraints are commonly adopted in fast MRI reconstruction, the
performance of
existing models is still limited mainly by three factors: inaccurate
estimation of coil sensitivity, inadequate utilization of structural
prior, and inductive bias of CNN. To tackle these challenges, we propose
an unrolling-based joint Cross-Attention Network, dubbed jCAN, using
deep guidance of the already acquired intra-subject data. Particularly,
to improve the performance of coil sensitivity estimation, we
simultaneously optimize the latent MR image and sensitivity map (SM).
Besides, we introduce a gating layer and a Gaussian layer into SM
estimation
to alleviate the "defocus" and "over-coupling" effects and further
ameliorate the SM estimation. To enhance the representation ability of
the proposed model, we deploy Vision Transformer (ViT) and CNN in the
image and k-space domains, respectively. Moreover, we exploit a
pre-acquired intra-subject scan as the reference modality to guide the
reconstruction of the subsampled target modality by resorting to a self-
and cross-attention scheme. Experimental results on public knee and
in-house brain datasets demonstrate that the proposed jCAN outperforms
the state-of-the-art methods by a large margin in terms of SSIM and PSNR
for different acceleration factors and sampling masks.
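The data consistency constraint mentioned in this abstract is commonly realized by re-inserting the measured k-space samples into the current image estimate. The sketch below shows that generic step for a single coil with Cartesian sampling; it is not taken from jCAN, and the function name, the single-coil simplification, and the hard replacement (no noise weighting) are assumptions.

```python
import numpy as np

def data_consistency(image_estimate, measured_kspace, mask):
    """Hard data consistency for single-coil Cartesian MRI (illustrative).

    image_estimate:  (H, W) current reconstruction
    measured_kspace: (H, W) acquired k-space, valid where mask is True
    mask:            (H, W) boolean sampling mask
    """
    k_est = np.fft.fft2(image_estimate, norm="ortho")
    # Keep measured samples where available, trust the estimate elsewhere.
    k_dc = np.where(mask, measured_kspace, k_est)
    return np.fft.ifft2(k_dc, norm="ortho")
```

In an unrolled network this step would typically alternate with a learned refinement, so that every iteration stays faithful to the acquired samples.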
AU Chi, Jianning
Sun, Zhiyi
Meng, Liuyi
Wang, Siqi
Yu, Xiaosheng
Wei, Xiaolin
Yang, Bin
Low-dose CT image super-resolution with noise suppression based on prior
degradation estimator and self-guidance mechanism.
Anatomical structures in low-dose computed tomography (LDCT) images are
usually distorted when zoomed in for observation because of the small
number of quanta. Super-resolution (SR) methods have been proposed as
post-processing approaches to enhance the quality of LDCT images without
increasing radiation damage to patients, but they suffer from incorrect
prediction of degradation information and incomplete use of internal
connections within the 3D CT volume, resulting in an imbalance between
noise removal and detail sharpening in the super-resolution results. In
this paper, we propose a novel LDCT SR network where the
degradation information self-parsed from the LDCT slice and the 3D
anatomical information captured from the LDCT volume are integrated to
guide the backbone network. The prior degradation estimator (PDE) is
proposed following the contrastive learning strategy to estimate the
degradation features in the LDCT images without paired low-normal dose
CT images. The self-guidance fusion module (SGFM) is designed to capture
anatomical features with internal 3D consistencies between the squashed
images along the coronal, sagittal, and axial views of the CT volume.
Finally, the features representing degradation and anatomical structures
are integrated to recover the CT images with higher resolutions. We
apply the proposed method to the 2016 NIH-AAPM Mayo Clinic LDCT Grand
Challenge dataset and our collected LDCT dataset to evaluate its ability
to recover LDCT images. Experimental results illustrate the superiority
of our network concerning quantitative metrics and qualitative
observations, demonstrating its potential in recovering detail-sharp and
noise-free CT images with higher resolutions from the practical LDCT
images.
EI 1558-254X
DA 2024-09-06
UT MEDLINE:39231060
PM 39231060
ER
AU Yan, Renao
Sun, Qiehe
Jin, Cheng
Liu, Yiqing
He, Yonghong
Guan, Tian
Chen, Hao
Shapley Values-enabled Progressive Pseudo Bag Augmentation for
Whole-Slide Image Classification.
In computational pathology, whole-slide image (WSI) classification
presents a formidable challenge due to its gigapixel resolution and
limited fine-grained annotations. Multiple-instance learning (MIL)
offers a weakly supervised solution, yet refining instance-level
information from bag-level labels remains challenging. While most of the
conventional MIL methods use attention scores to estimate instance
importance scores (IIS) which contribute to the prediction of the slide
labels, these often lead to skewed attention distributions and
inaccuracies in identifying crucial instances. To address these issues,
we propose a new approach inspired by cooperative game theory: employing
Shapley values to assess each instance's contribution, thereby improving
IIS estimation. The computation of the Shapley value is then accelerated
using attention while retaining the enhanced instance
identification and prioritization. We further introduce a framework for
the progressive assignment of pseudo bags based on estimated IIS,
encouraging more balanced attention distributions in MIL models. Our
extensive experiments on CAMELYON-16, BRACS, TCGA-LUNG, and TCGA-BRCA
datasets show our method's superiority over existing state-of-the-art
approaches, offering enhanced interpretability and class-wise insights.
We will release the code upon acceptance.
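The idea of scoring each instance in a bag by its Shapley value can be illustrated with a standard Monte Carlo permutation estimator. This is a generic textbook estimator, not the attention-accelerated computation the abstract refers to; `value_fn` is a hypothetical stand-in for a bag-level model score evaluated on a subset of instance indices.

```python
import random

def shapley_monte_carlo(n_instances, value_fn, n_permutations=200, seed=0):
    """Estimate per-instance Shapley values by sampling random orderings.

    value_fn takes a list of instance indices (a pseudo sub-bag) and returns
    a scalar score; it must also accept the empty list.
    """
    rng = random.Random(seed)
    contrib = [0.0] * n_instances
    order = list(range(n_instances))
    for _ in range(n_permutations):
        rng.shuffle(order)
        subset = []
        prev = value_fn(subset)
        for i in order:
            subset.append(i)
            cur = value_fn(subset)
            contrib[i] += cur - prev  # marginal contribution of instance i
            prev = cur
    return [c / n_permutations for c in contrib]
```

Each permutation costs one model evaluation per instance, which is expensive for slides with thousands of patches and suggests why an acceleration scheme such as the one described in the abstract is needed.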
AU Borazjani, Kasra
Khosravan, Naji
Ying, Leslie
Hosseinalipour, Seyyedali
Multi-Modal Federated Learning for Cancer Staging over Non-IID Datasets
with Unbalanced Modalities.
The use of machine learning (ML) for cancer staging through medical
image analysis has gained substantial interest across medical
disciplines. When accompanied by the innovative federated learning (FL)
framework, ML techniques can further overcome privacy concerns related
to patient data exposure. Given the frequent presence of diverse data
modalities within patient records, leveraging FL in a multi-modal
learning framework holds considerable promise for cancer staging.
However, existing works on multi-modal FL often presume that all
data-collecting institutions have access to all data modalities. This
oversimplified approach neglects institutions that have access to only a
portion of data modalities within the system. In this work, we introduce
a novel FL architecture designed to accommodate not only the
heterogeneity of data samples, but also the inherent
heterogeneity/non-uniformity of data modalities across institutions. We
shed light on the challenges associated with varying convergence speeds
observed across different data modalities within our FL system.
Subsequently, we propose a solution to tackle these challenges by
devising a distributed gradient blending and proximity-aware client
weighting strategy tailored for multi-modal FL. To show the superiority
of our method, we conduct experiments using The Cancer Genome Atlas
program (TCGA) datalake considering different cancer types and three
modalities of data: mRNA sequences, histopathological image data, and
clinical information. Our results further unveil the impact and severity
of class-based vs type-based heterogeneity across institutions on the
model performance, which widens the perspective to the notion of data
heterogeneity in multi-modal FL literature.
AU Feng, Yidan
Deng, Sen
Lyu, Jun
Cai, Jing
Wei, Mingqiang
Qin, Jing
Bridging MRI Cross-Modality Synthesis and Multi-Contrast
Super-Resolution by Fine-Grained Difference Learning.
In multi-modal magnetic resonance imaging (MRI), the tasks of imputing
or reconstructing the target modality share a common obstacle: the
accurate modeling of fine-grained inter-modal differences, which has
been sparingly addressed in current literature. These differences stem
from two sources: 1) spatial misalignment remaining after coarse
registration and 2) structural distinction arising from
modality-specific signal manifestations. This paper integrates the
previously separate research trajectories of cross-modality synthesis
(CMS) and multi-contrast super-resolution (MCSR) to address this
pervasive challenge within a unified framework. Connected through
generalized down-sampling ratios, this unification not only emphasizes
their common goal in reducing structural differences, but also
identifies the key task distinguishing MCSR from CMS: modeling the
structural distinctions using the limited information from the
misaligned target input. Specifically, we propose a composite network
architecture with several key components: a label correction module to
align the coordinates of multi-modal training pairs, a CMS module
serving as the base model, an SR branch to handle target inputs, and a
difference projection discriminator for structural distinction-centered
adversarial training. When training the SR branch as the generator, the
adversarial learning is enhanced with distinction-aware incremental
modulation to ensure better-controlled generation. Moreover, the SR
branch integrates deformable convolutions to address cross-modal spatial
misalignment at the feature level. Experiments conducted on three public
datasets demonstrate that our approach effectively balances structural
accuracy and realism, exhibiting overall superiority in comprehensive
evaluations for both tasks over current state-of-the-art approaches. The
code is available at https://github.com/papshare/FGDL.
EI 1558-254X
DA 2024-08-21
UT MEDLINE:39159018
PM 39159018
ER
AU Cheng, Ziming
Wang, Shidong
Xin, Tong
Zhou, Tao
Zhang, Haofeng
Shao, Ling
Few-Shot Medical Image Segmentation via Generating Multiple
Representative Descriptors
Automatic medical image segmentation has witnessed significant
development with the success of large models on massive datasets.
However, acquiring and annotating vast medical image datasets often
proves to be impractical due to the time consumption, specialized
expertise requirements, and compliance with patient privacy standards,
etc. As a result, Few-shot Medical Image Segmentation (FSMIS) has become
an increasingly compelling research direction. Conventional FSMIS
methods usually learn prototypes from support images and apply
nearest-neighbor searching to segment the query images. However, a
single prototype cannot adequately represent the distribution of each
class, which restricts performance. To address this problem, we
propose to Generate Multiple Representative Descriptors (GMRD), which
can comprehensively represent the commonality within the corresponding
class distribution. In addition, we design a Multiple Affinity Maps
based Prediction (MAMP) module to fuse the multiple affinity maps
generated by the aforementioned descriptors. Furthermore, to address
intra-class variation and enhance the representativeness of descriptors,
we introduce two novel losses. Notably, our model is structured as a
dual-path design to achieve a balance between foreground and background
differences in medical images. Extensive experiments on four publicly
available medical image datasets demonstrate that our method outperforms
the state-of-the-art methods, and the detailed analysis also verifies
the effectiveness of our designed module.
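The contrast between a single prototype and multiple descriptors per class can be illustrated with a small nearest-descriptor sketch. It is not the GMRD or MAMP implementation: the function names are hypothetical, and taking the maximum cosine affinity over descriptors is a simple stand-in for the learned fusion of affinity maps described in the abstract.

```python
import numpy as np

def cosine_affinity(features, descriptors, eps=1e-8):
    """Cosine similarity between pixel features (N, D) and descriptors (K, D)."""
    f = features / (np.linalg.norm(features, axis=-1, keepdims=True) + eps)
    d = descriptors / (np.linalg.norm(descriptors, axis=-1, keepdims=True) + eps)
    return f @ d.T  # (N, K): one affinity map per descriptor

def segment_with_descriptors(features, fg_descriptors, bg_descriptors):
    """Label a pixel as foreground if its best foreground affinity exceeds
    its best background affinity (max fusion is an assumption of this sketch)."""
    fg = cosine_affinity(features, fg_descriptors).max(axis=1)
    bg = cosine_affinity(features, bg_descriptors).max(axis=1)
    return fg > bg
```

With one descriptor per class this reduces to the conventional single-prototype nearest-neighbor rule; adding descriptors lets a class cover several distinct modes of its feature distribution.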
AU Wang, Hongyu
He, Jiang
Cui, Hengfei
Yuan, Bo
Xia, Yong
Robust Stochastic Neural Ensemble Learning With Noisy Labels for
Thoracic Disease Classification
Chest radiography is the most common radiological examination for
diagnosing thoracic diseases such as pneumonia. The tremendous number of
chest X-rays has prompted the use of data-driven deep learning models in
constructing computer-aided diagnosis systems for thoracic diseases.
However, in
realistic radiology practice, a deep learning-based model often suffers
from performance degradation when trained on data with noisy labels
possibly caused by different types of annotation biases. To this end, we
present a novel stochastic neural ensemble learning (SNEL) framework for
robust thoracic disease diagnosis using chest X-rays. The core idea of
our method is to learn from noisy labels by constructing model ensembles
and designing noise-robust loss functions. Specifically, we propose a
fast neural ensemble method that collects parameters simultaneously
across model instances and along optimization trajectories. Moreover, we
propose a loss function that both optimizes a robust measure and
characterizes a diversity measure of ensembles. We evaluated our
proposed SNEL method on three publicly available hospital-scale chest
X-ray datasets. The experimental results indicate that our method
outperforms competing methods and demonstrate the effectiveness and
robustness of our method in learning from noisy labels. Our code is
available at https://github.com/hywang01/SNEL.
AU Zhou, Quan
Yu, Bin
Xiao, Feng
Ding, Mingyue
Wang, Zhiwei
Zhang, Xuming
Robust Semi-Supervised 3D Medical Image Segmentation With Diverse
Joint-Task Learning and Decoupled Inter-Student Learning
Semi-supervised segmentation is highly significant in 3D medical image
segmentation. The typical solutions adopt a teacher-student dual-model
architecture, and they constrain the two models' decision consistency on
the same segmentation task. However, the scarcity of medical samples can
lower the diversity of tasks, reducing the effectiveness of consistency
constraint. The issue can further worsen as the weights of the models
gradually become synchronized. In this work, we propose to construct
diverse joint tasks using masked image modelling to enhance the
reliability of the consistency constraint, and we develop a novel
architecture consisting of a single teacher and multiple students to
exploit the additional knowledge decoupled from the synchronized
weights.
Specifically, the teacher and student models 'see' varied
randomly-masked versions of an input, and are trained to segment the
same targets but reconstruct different missing regions concurrently.
Such a joint task of segmentation and reconstruction lets the two
learners capture related but complementary features and derive
instructive knowledge when their consistency is constrained. Moreover,
two extra students join the original one to perform inter-student
learning. The three students share the same encoding but different
decoding designs, and learn decoupled knowledge by constraining their
mutual consistencies, preventing themselves from suboptimally converging
to the biased predictions of the dictatorial teacher. Experiments on
four medical datasets show that our approach performs better than six
mainstream semi-supervised methods. Particularly, our approach achieves
at least 0.61% and 0.36% higher Dice and Jaccard values, respectively,
than the most competitive approach on our in-house dataset. The code
will be released at https://github.com/zxmboshi/DDL.
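The Dice and Jaccard values reported in this abstract are standard overlap metrics for binary segmentation masks. A minimal sketch follows; the function name and the convention of returning 1.0 when both masks are empty are assumptions of this illustration.

```python
import numpy as np

def dice_and_jaccard(pred, target):
    """Dice coefficient and Jaccard index for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    union = total - inter
    # Convention assumed here: two empty masks count as a perfect match.
    dice = 2.0 * inter / total if total > 0 else 1.0
    jaccard = inter / union if union > 0 else 1.0
    return float(dice), float(jaccard)
```

The two metrics are monotonically related by J = D / (2 - D), so a method that ranks higher on one ranks higher on the other for a single mask, although averages over a dataset can differ.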
AU Ma, Yulan
Cui, Weigang
Liu, Jingyu
Guo, Yuzhu
Chen, Huiling
Li, Yang
A Multi-Graph Cross-Attention-Based Region-Aware Feature Fusion Network
Using Multi-Template for Brain Disorder Diagnosis
Functional connectivity (FC) networks based on resting-state functional
magnetic resonance imaging (rs-fMRI) are reliable and sensitive for
brain disorder
diagnosis. However, most existing methods are limited by using a single
template, which may be insufficient to reveal complex brain
connectivities. Furthermore, these methods usually neglect the
complementary information between static and dynamic brain networks, and
the functional divergence among different brain regions, leading to
suboptimal diagnosis performance. To address these limitations, we
propose a novel multi-graph cross-attention based region-aware feature
fusion network (MGCA-RAFFNet) using multiple templates for brain
disorder diagnosis. Specifically, we first employ multiple templates to
parcellate
the brain space into different regions of interest (ROIs). Then, a
multi-graph cross-attention network (MGCAN), including static and
dynamic graph convolutions, is developed to explore the deep features
contained in multi-template data, which can effectively analyze complex
interaction patterns of brain networks for each template, and further
adopt a dual-view cross-attention (DVCA) to acquire complementary
information. Finally, to efficiently fuse multiple static-dynamic
features, we design a region-aware feature fusion network (RAFFNet),
which is beneficial to improve the feature discrimination by considering
the underlying relations among static-dynamic features in different
brain regions. Our proposed method is evaluated on both public ADNI-2
and ABIDE-I datasets for diagnosing mild cognitive impairment (MCI) and
autism spectrum disorder (ASD). Extensive experiments demonstrate that
the proposed method outperforms the state-of-the-art methods.
AU Li, Fangda
Hu, Zhiqiang
Chen, Wen
Kak, Avinash
A Laplacian Pyramid Based Generative H&E Stain Augmentation Network
Hematoxylin and Eosin (H&E) staining is a widely used sample preparation
procedure for enhancing the saturation of tissue sections and the
contrast between nuclei and cytoplasm in histology images for medical
diagnostics. However, various factors, such as the differences in the
reagents used, result in high variability in the colors of the stains
actually recorded. This variability poses a challenge in achieving
generalization for machine-learning based computer-aided diagnostic
tools. To desensitize the learned models to stain variations, we propose
the Generative Stain Augmentation Network (G-SAN) - a GAN-based
framework that augments a collection of cell images with simulated yet
realistic stain variations. At its core, G-SAN uses a novel and highly
computationally efficient Laplacian Pyramid (LP) based generator
architecture that is capable of disentangling stain from cell
morphology. Through the task of patch classification and nucleus
segmentation, we show that using G-SAN-augmented training data provides
on average 15.7% improvement in F1 score and 7.3% improvement in
panoptic quality, respectively. Our code is available at
https://github.com/lifangda01/GSAN-Demo.
AU Chen, Haobo
Cai, Yehua
Wang, Changyan
Chen, Lin
Zhang, Bo
Han, Hong
Guo, Yuqing
Ding, Hong
Zhang, Qi
Multi-Organ Foundation Model for Universal Ultrasound Image Segmentation
with Task Prompt and Anatomical Prior.
Semantic segmentation of ultrasound (US) images with deep learning has
played a crucial role in computer-aided disease screening, diagnosis and
prognosis. However, due to the scarcity of US images and small field of
view, resulting segmentation models are tailored for a specific single
organ and may lack robustness, overlooking correlations among anatomical
structures of multiple organs. To address these challenges, we propose
the Multi-Organ FOundation (MOFO) model for universal US image
segmentation. The MOFO is optimized jointly from multiple organs across
various anatomical regions to overcome the data scarcity and explore
correlations between multiple organs. The MOFO extracts organ-invariant
representations from US images. Simultaneously, the task prompt is
employed to refine organ-specific representations for segmentation
predictions. Moreover, the anatomical prior is incorporated to enhance
the consistency of the anatomical structures. A multi-organ US database,
comprising 7039 images from 10 organs across various regions of the
human body, has been established to evaluate our model. Results
demonstrate that the MOFO outperforms single-organ methods in terms of
the Dice coefficient, 95% Hausdorff distance and average symmetric
surface distance with statistically sufficient margins. Our experiments
in multi-organ universal segmentation for US images serve as a
pioneering exploration of improving segmentation performance by
leveraging semantic and anatomical relationships within US images of
multiple organs.
AU Liu, Yuyuan
Tian, Yu
Wang, Chong
Chen, Yuanhong
Liu, Fengbei
Belagiannis, Vasileios
Carneiro, Gustavo
Translation Consistent Semi-supervised Segmentation for 3D Medical
Images.
3D medical image segmentation methods have been successful, but their
dependence on large amounts of voxel-level annotated data is a
disadvantage that needs to be addressed given the high cost to obtain
such annotation. Semi-supervised learning (SSL) solves this issue by
training models with a large unlabelled and a small labelled dataset.
The most successful SSL approaches are based on consistency learning
that minimises the distance between model responses obtained from
perturbed views of the unlabelled data. These perturbations usually keep
the spatial input context between views fairly consistent, which may
cause the model to learn segmentation patterns from the spatial input
contexts instead of the foreground objects. In this paper, we introduce
the Translation Consistent Co-training (TraCoCo) which is a consistency
learning SSL method that perturbs the input data views by varying their
spatial input context, allowing the model to learn segmentation patterns
from foreground objects. Furthermore, we propose a new Confident
Regional Cross entropy (CRC) loss, which improves training convergence
and keeps the robustness to co-training pseudo-labelling mistakes. Our
method yields state-of-the-art (SOTA) results for several 3D data
benchmarks, such as the Left Atrium (LA), Pancreas-CT (Pancreas), and
Brain Tumor Segmentation (BraTS19). Our method also attains best results
on a 2D-slice benchmark, namely the Automated Cardiac Diagnosis
Challenge (ACDC), further demonstrating its effectiveness. Our code,
training logs and checkpoints are available at
https://github.com/yyliu01/TraCoCo.
AU Huang, Wendong
Hu, Jinwu
Xiao, Junhao
Wei, Yang
Bi, Xiuli
Xiao, Bin
Prototype-Guided Graph Reasoning Network for Few-Shot Medical Image
Segmentation.
Few-shot semantic segmentation (FSS) is of tremendous potential for
data-scarce scenarios, particularly in medical segmentation tasks with
merely a few labeled data. Most of the existing FSS methods typically
distinguish query objects with the guidance of support prototypes.
However, the variances in appearance and scale between support and query
objects from the same anatomical class are often exceedingly
considerable in practical clinical scenarios, thus resulting in
undesirable query segmentation masks. To tackle the aforementioned
challenge, we propose a novel prototype-guided graph reasoning network
(PGRNet) to explicitly explore potential contextual relationships in
structured query images. Specifically, a prototype-guided graph
reasoning module is proposed to perform information interaction on the
query graph under the guidance of support prototypes to fully exploit
the structural properties of query images to overcome intra-class
variances. Moreover, instead of fixed support prototypes, a dynamic
prototype generation mechanism is devised to yield a collection of
dynamic support prototypes by mining rich contextual information from
support images to further boost the efficiency of information
interaction between support and query branches. Equipped with the
proposed two components, PGRNet can learn abundant contextual
representations for query images and is therefore more resilient to
object variations. We validate our method on three publicly available
medical segmentation datasets, namely CHAOS-T2, MS-CMRSeg, and Synapse.
Experiments indicate that the proposed PGRNet outperforms previous FSS
methods by a considerable margin and establishes a new state-of-the-art
performance.
AU Luo, Yilin
Huang, Hsuan-Kai
Sastry, Karteekeya
Hu, Peng
Tong, Xin
Kuo, Joseph
Aborahama, Yousuf
Na, Shuai
Villa, Umberto
Anastasio, Mark A
Wang, Lihong V
Full-wave Image Reconstruction in Transcranial Photoacoustic Computed
Tomography using a Finite Element Method.
Transcranial photoacoustic computed tomography presents challenges in
human brain imaging due to skull-induced acoustic aberration. Existing
full-wave image reconstruction methods rely on a unified elastic wave
equation for skull shear and longitudinal wave propagation, therefore
demanding substantial computational resources. We propose an efficient
discrete imaging model based on finite element discretization. The
elastic wave equation for solids is solely applied to the hard-tissue
skull region, while the soft-tissue or coupling-medium region that
dominates the simulation domain is modeled with the simpler acoustic
wave equation for liquids. The solid-liquid interfaces are explicitly
modeled with elastic-acoustic coupling. Furthermore, finite element
discretization allows coarser, irregular meshes to conform to object
geometry. These factors significantly reduce the linear system size by
20 times to facilitate accurate whole-brain simulations with improved
speed. We derive a matched forward-adjoint operator pair based on the
model to enable integration with various optimization algorithms. We
validate the reconstruction framework through numerical simulations and
phantom experiments.
AU Tang, Kunming
Jiang, Zhiguo
Wu, Kun
Shi, Jun
Xie, Fengying
Wang, Wei
Wu, Haibo
Zheng, Yushan
Self-Supervised Representation Distribution Learning for Reliable Data
Augmentation in Histopathology WSI Classification.
Multiple instance learning (MIL) based whole slide image (WSI)
classification is often carried out on the representations of patches
extracted from WSI with a pre-trained patch encoder. The performance of
classification relies on both patch-level representation learning and
MIL classifier training. Most MIL methods utilize a frozen model
pre-trained on ImageNet or a model trained with self-supervised learning
on histopathology image dataset to extract patch image representations
and then fix these representations in the training of the MIL
classifiers for efficiency consideration. However, the invariance of
representations cannot meet the diversity requirement for training a
robust MIL classifier, which has significantly limited the performance
of the WSI classification. In this paper, we propose a Self-Supervised
Representation Distribution Learning framework (SSRDL) for patch-level
representation learning with an online representation sampling strategy
(ORS) for both patch feature extraction and WSI-level data augmentation.
The proposed method was evaluated on three datasets under three MIL
frameworks. The experimental results have demonstrated that the proposed
method achieves the best performance in histopathology image
representation learning and data augmentation and outperforms
state-of-the-art methods under different WSI classification frameworks.
The code is available at https://github.com/lazytkm/SSRDL.
AU Amaan Valiuddin, M M
Viviers, Christiaan G A
Van Sloun, Ruud J G
De With, Peter H N
Sommen, Fons van der
Investigating and Improving Latent Density Segmentation Models for
Aleatoric Uncertainty Quantification in Medical Imaging.
Data uncertainties, such as sensor noise, occlusions or limitations in
the acquisition method can introduce irreducible ambiguities in images,
which result in varying, yet plausible, semantic hypotheses. In Machine
Learning, this ambiguity is commonly referred to as aleatoric
uncertainty. In image segmentation, latent density models can be
utilized to address this problem. The most popular approach is the
Probabilistic U-Net (PU-Net), which uses latent Normal densities to
optimize the conditional data log-likelihood Evidence Lower Bound. In
this work, we demonstrate that the PU-Net latent space is severely
sparse and heavily under-utilized. To address this, we introduce mutual
information maximization and entropy-regularized Sinkhorn Divergence in
the latent space to promote homogeneity across all latent dimensions,
effectively improving gradient-descent updates and latent space
informativeness. Our results show that by applying this on public
datasets of various clinical segmentation problems, our proposed
methodology achieves up to 11% performance gains compared with
preceding latent variable models for probabilistic segmentation on the
Hungarian-Matched Intersection over Union. The results indicate that
encouraging a homogeneous latent space significantly improves latent
density modeling for medical image segmentation.
AU Xu, Jing
Huang, Kai
Zhong, Lianzhen
Gao, Yuan
Sun, Kai
Liu, Wei
Zhou, Yanjie
Guo, Wenchao
Guo, Yuan
Zou, Yuanqiang
Duan, Yuping
Lu, Le
Wang, Yu
Chen, Xiang
Zhao, Shuang
RemixFormer++: A Multi-modal Transformer Model for Precision Skin Tumor
Differential Diagnosis with Memory-efficient Attention.
Diagnosing malignant skin tumors accurately at an early stage can be
challenging due to ambiguous and even confusing visual characteristics
displayed by various categories of skin tumors. To improve diagnosis
precision, all available clinical data from multiple sources,
particularly clinical images, dermoscopy images, and medical history,
could be considered. Aligning with clinical practice, we propose a novel
Transformer model, named RemixFormer++, that consists of a clinical
image branch, a dermoscopy image branch, and a metadata branch. Given
the unique characteristics inherent in clinical and dermoscopy images,
specialized attention strategies are adopted for each type. Clinical
images are processed through a top-down architecture, capturing both
localized lesion details and global contextual information. Conversely,
dermoscopy images undergo a bottom-up processing with two-level
hierarchical encoders, designed to pinpoint fine-grained structural and
textural features. A dedicated metadata branch seamlessly integrates
non-visual information by encoding relevant patient data. Fusing
features from three branches substantially boosts disease classification
accuracy. RemixFormer++ demonstrates exceptional performance on four
single-modality datasets (PAD-UFES-20, ISIC 2017/2018/2019). Compared
with the previous best method using a public multi-modal Derm7pt
dataset, we achieved an absolute 5.3% increase in averaged F1 and 1.2%
in accuracy for the classification of five skin tumors. Furthermore,
using a large-scale in-house dataset of 10,351 patients with the twelve
most common skin tumors, our method obtained an overall classification
accuracy of 92.6%. These promising results, on par with or better than the
performance of 191 dermatologists in a comprehensive reader study,
point to the potential clinical usability of our method.
EI 1558-254X
DA 2024-08-11
UT MEDLINE:39120989
PM 39120989
ER
AU Quan, Quan
Yao, Qingsong
Zhu, Heqin
Kevin Zhou, S
IGU-Aug: Information-guided unsupervised augmentation and pixel-wise
contrastive learning for medical image analysis.
Contrastive learning (CL) is a form of self-supervised learning and has
been widely used for various tasks. Different from widely studied
instance-level contrastive learning, pixel-wise contrastive learning
mainly helps with pixel-wise dense prediction tasks. The counter-part to
an instance in instance-level CL is a pixel, along with its neighboring
context, in pixel-wise CL. Aiming to build better feature
representation, there is a vast literature about designing instance
augmentation strategies for instance-level CL; but there is little
similar work on pixel augmentation for pixel-wise CL with a pixel
granularity. In this paper, we attempt to bridge this gap. We first
classify a pixel into three categories, namely low-, medium-, and
high-informative, based on the information quantity the pixel contains.
We then adaptively design separate augmentation strategies for each
category in terms of augmentation intensity and sampling ratio.
Extensive experiments validate that our information-guided pixel
augmentation strategy succeeds in encoding more discriminative
representations and surpassing other competitive approaches in
unsupervised local feature matching. Furthermore, our pretrained model
improves the performance of both one-shot and fully supervised models.
To the best of our knowledge, we are the first to propose a pixel
augmentation method with a pixel granularity for enhancing unsupervised
pixel-wise contrastive learning. Code is available at
https://github.com/Curli-quan/IGU-Aug.
AU Daneshmand, Parisa Ghaderi
Rabbani, Hossein
Tensor Ring Decomposition Guided Dictionary Learning for OCT Image
Denoising
Optical coherence tomography (OCT) is a non-invasive and effective tool
for the imaging of retinal tissue. However, the heavy speckle noise,
resulting from multiple scattering of the light waves, obscures
important morphological structures and impairs the clinical diagnosis of
ocular diseases. In this paper, we propose a novel and powerful model
known as tensor ring decomposition-guided dictionary learning (TRGDL)
for OCT image denoising, which can simultaneously utilize two useful
complementary priors, i.e., three-dimensional low-rank and sparsity
priors, under a unified framework. Specifically, to effectively use the
strong correlation between nearby OCT frames, we construct the OCT group
tensors by extracting cubic patches from OCT images and clustering
similar patches. Then, since each created OCT group tensor has a
low-rank structure, to exploit spatial, non-local, and its temporal
correlations in a balanced way, we enforce the TR decomposition model on
each OCT group tensor. Next, to use the beneficial three-dimensional
inter-group sparsity, we learn shared dictionaries in both spatial and
temporal dimensions from all of the stacked OCT group tensors.
Furthermore, we develop an effective algorithm to solve the resulting
optimization problem by using two efficient optimization approaches,
including proximal alternating minimization and the alternating
direction method of multipliers. Finally, extensive experiments on OCT
datasets from various imaging devices are conducted to prove the
generality and usefulness of the proposed TRGDL model. Experimental
simulation results show that the suggested TRGDL model outperforms
state-of-the-art approaches for OCT image denoising both qualitatively
and quantitatively.
AU Liu, Mengjun
Zhang, Huifeng
Liu, Mianxin
Chen, Dongdong
Zhuang, Zixu
Wang, Xin
Zhang, Lichi
Peng, Daihui
Wang, Qian
Randomizing Human Brain Function Representation for Brain Disease
Diagnosis
Resting-state fMRI (rs-fMRI) is an effective tool for quantifying
functional connectivity (FC), which plays a crucial role in exploring
various brain diseases. Due to the high dimensionality of fMRI data, FC
is typically computed based on the region of interest (ROI), whose
parcellation relies on a pre-defined atlas. However, utilizing the brain
atlas poses several challenges including 1) subjective selection bias in
choosing from various brain atlases; 2) parcellation of each subject's
brain with the same atlas yet disregarding individual specificity; 3)
lack of interaction between brain region parcellation and downstream
ROI-based FC analysis. To address these limitations, we propose a novel
randomizing strategy for generating brain function representation to
facilitate neural disease diagnosis. Specifically, we randomly sample
brain patches, thus avoiding ROI parcellations of the brain atlas. Then,
we introduce a new brain function representation framework for the
sampled patches. Each patch has its function description by referring to
anchor patches, as well as the position description. Furthermore, we
design an adaptive-selection-assisted Transformer network to optimize
and integrate the function representations of all sampled patches within
each brain for neural disease diagnosis. To validate our framework, we
conduct extensive evaluations on three datasets, and the experimental
results establish the effectiveness and generality of our proposed
method, offering a promising avenue for advancing neural disease
diagnosis beyond the confines of traditional atlas-based methods. Our
code is available at https://github.com/mjliu2020/RandomFR.
AU Pan, Jiazhen
Huang, Wenqi
Rueckert, Daniel
Kustner, Thomas
Hammernik, Kerstin
Motion-Compensated MR CINE Reconstruction With Reconstruction-Driven
Motion Estimation
In cardiac CINE, motion-compensated MR reconstruction (MCMR) is an
effective approach to address highly undersampled acquisitions by
incorporating motion information between frames. In this work, we
propose a novel perspective for addressing the MCMR problem and a more
integrated and efficient solution to the MCMR field. Contrary to
state-of-the-art (SOTA) MCMR methods which break the original problem
into two sub-optimization problems, i.e. motion estimation and
reconstruction, we formulate this problem as a single entity with one
single optimization. Our approach is unique in that the motion
estimation is directly driven by the ultimate goal, reconstruction, but
not by the canonical motion-warping loss (similarity measurement between
motion-warped images and target images). We align the objectives of
motion estimation and reconstruction, eliminating the drawbacks of
artifacts-affected motion estimation and therefore error-propagated
reconstruction. Further, we can deliver high-quality reconstruction and
realistic motion without applying any regularization/smoothness loss
terms, circumventing the non-trivial weighting factor tuning. We
evaluate our method on two datasets: 1) an in-house acquired 2D CINE
dataset for the retrospective study and 2) the public OCMR cardiac
dataset for the prospective study. The conducted experiments indicate
that the proposed MCMR framework can deliver artifact-free motion
estimation and high-quality MR images even for imaging accelerations up
to 20x, outperforming SOTA non-MCMR and MCMR methods in both qualitative
and quantitative evaluation across all experiments. The code is
available at https://github.com/JZPeterPan/MCMR-Recon-Driven-Motion.
AU Han, Kangfu
Li, Gang
Fang, Zhiwen
Yang, Feng
Multi-Template Meta-Information Regularized Network for Alzheimer's
Disease Diagnosis Using Structural MRI
Structural magnetic resonance imaging (sMRI) has been widely applied in
computer-aided Alzheimer's disease (AD) diagnosis, owing to its
capabilities in providing detailed brain morphometric patterns and
anatomical features in vivo. Although previous works have validated the
effectiveness of incorporating metadata (e.g., age, gender, and
educational years) for sMRI-based AD diagnosis, existing methods solely
paid attention to metadata-associated correlation to AD (e.g., gender
bias in AD prevalence) or confounding effects (e.g., the issue of normal
aging and metadata-related heterogeneity). Hence, it is difficult to
fully excavate the influence of metadata on AD diagnosis. To address
these issues, we constructed a novel Multi-template Meta-information
Regularized Network (MMRN) for AD diagnosis. Specifically, considering
diagnostic variation resulting from different spatial transformations
onto different brain templates, we first regarded different
transformations as data augmentation for self-supervised learning after
template selection. Since the confounding effects may arise from
excessive attention to meta-information owing to its correlation with
AD, we then designed the modules of weakly supervised meta-information
learning and mutual information minimization to learn and disentangle
meta-information from learned class-related representations, which
accounts for meta-information regularization for disease diagnosis. We
have evaluated our proposed MMRN on two public multi-center cohorts,
including the Alzheimer's Disease Neuroimaging Initiative (ADNI) with
1,950 subjects and the National Alzheimer's Coordinating Center (NACC)
with 1,163 subjects. The experimental results have shown that our
proposed method outperformed the state-of-the-art approaches in all three
tasks: AD diagnosis, mild cognitive impairment (MCI) conversion
prediction, and normal control (NC) vs. MCI vs. AD classification.
AU Liang, Quanmin
Ma, Junji
Chen, Xitian
Lin, Qixiang
Shu, Ni
Dai, Zhengjia
Lin, Ying
A Hybrid Routing Pattern in Human Brain Structural Network Revealed By
Evolutionary Computation
The human brain functional connectivity network (FCN) is constrained and
shaped by the communication processes in the structural connectivity
network (SCN). The underlying communication mechanism thus becomes a
critical issue for understanding the formation and organization of the
FCN. A number of communication models supported by different routing
strategies have been proposed, with shortest path (SP), random diffusion
(DIF), and spatial navigation (NAV) as the most typical, respectively
requiring network global knowledge, local knowledge, and both for path
seeking. Yet these models all assumed every brain region to use one
routing strategy uniformly, ignoring convergent evidence that supports
regional heterogeneity in terms of both biological substrates and
functional roles. In this regard, the current study developed a hybrid
communication model that allowed each brain region to choose a routing
strategy from SP, DIF, and NAV independently. A genetic algorithm was
designed to uncover the underlying region-wise hybrid routing strategy
(namely HYB). The HYB was found to outperform the three typical routing
strategies in predicting FCN and facilitating robust communication.
Analyses on HYB further revealed that brain regions in lower-order
functional modules tended to route signals using global knowledge,
while those in higher-order functional modules preferred DIF that
requires only local knowledge. Compared to regions that used global
knowledge for routing, regions using DIF had denser structural
connections, participated in more functional modules, but played a less
dominant role within modules. Together, our findings further evidenced
that hybrid routing underpins efficient SCN communication and locally
heterogeneous structure-function coupling.
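The region-wise routing idea described above can be illustrated with a
minimal sketch. It assumes an undirected, connected toy graph with
edge lengths and 2D node coordinates; the names route, next_hop, adj,
coords and the strategy labels are hypothetical, and the genetic
algorithm that searches for the strategy assignment is not shown.

```python
import heapq
import math
import random


def distances_to(adj, target):
    """Dijkstra distances from every node to `target` (undirected graph assumed)."""
    dist = {target: 0.0}
    heap = [(0.0, target)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def next_hop(adj, coords, strategy, node, target, dist, rng):
    """Pick the next node according to the routing strategy of `node`."""
    neighbors = list(adj[node])
    if strategy == "SP":   # global knowledge: follow a shortest path
        return min(neighbors, key=lambda v: adj[node][v] + dist[v])
    if strategy == "NAV":  # greedy step toward the target in space
        return min(neighbors, key=lambda v: math.dist(coords[v], coords[target]))
    if strategy == "DIF":  # local knowledge only: random walk step
        return rng.choice(neighbors)
    raise ValueError(f"unknown strategy: {strategy}")


def route(adj, coords, strategies, source, target, max_steps=100, seed=0):
    """Route a signal where each node applies its own strategy; None if not delivered."""
    rng = random.Random(seed)
    dist = distances_to(adj, target)
    path = [source]
    while path[-1] != target and len(path) <= max_steps:
        node = path[-1]
        path.append(next_hop(adj, coords, strategies[node], node, target, dist, rng))
    return path if path[-1] == target else None


# Toy example (illustrative data only).
adj = {
    "A": {"B": 1.0, "C": 1.0},
    "B": {"A": 1.0, "D": 1.0},
    "C": {"A": 1.0, "D": 3.0},
    "D": {"B": 1.0, "C": 3.0},
}
coords = {"A": (0, 0), "B": (1, 0), "C": (0, 1), "D": (2, 0)}
all_sp = {n: "SP" for n in adj}
hybrid = {"A": "DIF", "B": "SP", "C": "NAV", "D": "SP"}

print(route(adj, coords, all_sp, "A", "D"))  # ['A', 'B', 'D']
print(route(adj, coords, hybrid, "A", "D"))
```

In this sketch a hybrid assignment is just a dictionary from node to
strategy, so a search procedure could score candidate assignments by
the paths they produce.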
AU Yao, Qingsong
He, Zecheng
Li, Yuexiang
Lin, Yi
Ma, Kai
Zheng, Yefeng
Zhou, S. Kevin
Adversarial Medical Image With Hierarchical Feature Hiding
Deep learning-based methods for medical images can be easily compromised
by adversarial examples (AEs), posing a serious security risk in clinical
decision-making. It has been discovered that conventional adversarial
attacks such as PGD, which optimize the classification logits, are easy
to distinguish in the feature space, enabling accurate reactive
defenses. To better understand this phenomenon and reassess the
reliability of the reactive defenses for medical AEs, we thoroughly
investigate the characteristics of conventional medical AEs.
Specifically, we first theoretically prove that conventional adversarial
attacks change the outputs by continuously optimizing vulnerable
features in a fixed direction, thereby leading to outlier
representations in the feature space. Then, a stress test is conducted
to reveal the vulnerability of medical images by comparison with natural
images. Interestingly, this vulnerability is a double-edged sword, which
can be exploited to hide AEs. We then propose a simple yet effective
hierarchical feature constraint (HFC), a novel add-on to conventional
white-box attacks, which helps hide the adversarial feature in the
target feature distribution. The proposed method is evaluated on three
medical datasets, both 2D and 3D, with different modalities. The
experimental results demonstrate the superiority of HFC, i.e., it
bypasses an array of state-of-the-art medical AE detectors more
efficiently than competing adaptive attacks, which reveals the
deficiencies of medical reactive defenses and enables the development of
more robust defenses in the future.
AU Zhang, Fan
Cho, Kang Ik Kevin
Seitz-Holland, Johanna
Ning, Lipeng
Legarreta, Jon Haitz
Rathi, Yogesh
Westin, Carl-Fredrik
O'Donnell, Lauren J.
Pasternak, Ofer
DDParcel: Deep Learning Anatomical Brain Parcellation From Diffusion MRI
Parcellation of anatomically segregated cortical and subcortical brain
regions is required in diffusion MRI (dMRI) analysis for region-specific
quantification and better anatomical specificity of tractography. Most
current dMRI parcellation approaches compute the parcellation from
anatomical MRI (T1- or T2-weighted) data, using tools such as FreeSurfer
or CAT12, and then register it to the diffusion space. However, the
registration is challenging due to image distortions and low resolution
of dMRI data, often resulting in mislabeling in the derived brain
parcellation. Furthermore, these approaches are not applicable when
anatomical MRI data is unavailable. As an alternative, we developed
Deep Diffusion Parcellation (DDParcel), a deep learning method for fast
and accurate parcellation of brain anatomical regions directly from dMRI
data. The inputs to DDParcel are dMRI parameter maps and the outputs are
labels for 101 anatomical regions corresponding to the FreeSurfer
Desikan-Killiany (DK) parcellation. A multi-level fusion network
leverages complementary information in the different input maps, at
three network levels: input, intermediate layer, and output. DDParcel
learns the registration of diffusion features to anatomical MRI from the
high-quality Human Connectome Project data. Then, to predict brain
parcellation for a new subject, the DDParcel network no longer requires
anatomical MRI data but only the dMRI data. Compared with T1w-based
parcellation, DDParcel's parcellation shows higher test-retest
reproducibility and higher regional homogeneity, while requiring much
less computational time. Generalizability is demonstrated on a range of
populations and dMRI acquisition protocols. The utility of DDParcel's
parcellation is demonstrated on tractography analysis for fiber tract
identification.
AU Hashemi, Ali
Cai, Chang
Gao, Yijing
Ghosh, Sanjay
Mueller, Klaus-Robert
Nagarajan, Srikantan S.
Haufe, Stefan
Joint Learning of Full-Structure Noise in Hierarchical Bayesian
Regression Models
We consider the reconstruction of brain activity from
electroencephalography (EEG). This inverse problem can be formulated as
a linear regression with independent Gaussian scale mixture priors for
both the source and noise components. Crucial factors influencing the
accuracy of the source estimation are not only the noise level but also
its correlation structure, but existing approaches have not addressed
the estimation of noise covariance matrices with full structure. To
address this shortcoming, we develop hierarchical Bayesian (type-II
maximum likelihood) models for observations with latent variables for
source and noise, which are estimated jointly from data. As an extension
to classical sparse Bayesian learning (SBL), where across-sensor
observations are assumed to be independent and identically distributed,
we consider Gaussian noise with full covariance structure. Using the
majorization-maximization framework and Riemannian geometry, we derive
an efficient algorithm for updating the noise covariance along the
manifold of positive definite matrices. We demonstrate that our
algorithm has guaranteed and fast convergence and validate it in
simulations and with real MEG data. Our results demonstrate that the
novel framework significantly improves upon state-of-the-art techniques
in the real-world scenario where the noise is indeed non-diagonal and
full-structured. Our method has applications in many domains beyond
biomagnetic inverse problems.
AU Lian, Jie
Liu, Jingyu
Zhang, Shu
Gao, Kai
Liu, Xiaoqing
Zhang, Dingwen
Yu, Yizhou
A Structure-Aware Relation Network for Thoracic Diseases Detection and
Segmentation (vol 40, pg 2042, 2021)
C1 Deepwise Artificial Intelligence Lab, Beijing 100080, Peoples R China
C1 Peking Univ, Sch Elect Engn & Comp Sci, Beijing 100871, Peoples R China
C1 Northwestern Polytech Univ, Sch Automat, Brain & Artificial Intelligence
Lab, Xian 710072, Peoples R China
C1 Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
C3 Deepwise Artificial Intelligence Lab
SN 0278-0062
EI 1558-254X
DA 2024-05-25
UT WOS:001203303400013
ER
AU Wang, Jiacheng
Jin, Yueming
Stoyanov, Danail
Wang, Liansheng
FedDP: Dual Personalization in Federated Medical Image Segmentation
Personalized federated learning (PFL) addresses the data heterogeneity
challenge faced by general federated learning (GFL). Rather than
learning a single global model, PFL adapts a collection of models to
the unique feature distribution of each site. However, current PFL
methods rarely consider self-attention networks, which can handle data
heterogeneity through long-range dependency modeling, and they do not
use prediction inconsistencies in local models as an indicator
of site uniqueness. In this paper, we propose FedDP, a novel federated
learning scheme with dual personalization, which improves model
personalization from both feature and prediction aspects to boost image
segmentation results. We leverage long-range dependencies by designing a
local query (LQ) that decouples the query embedding layer out of each
local model, whose parameters are trained privately to better adapt to
the respective feature distribution of the site. We then propose
inconsistency-guided calibration (IGC), which exploits inter-site
prediction inconsistencies to steer where each model concentrates its
learning. By encouraging a model to penalize pixels with larger
inconsistencies, we better tailor prediction-level patterns to each
local site. Experimentally, we compare FedDP with the state-of-the-art
PFL methods on two popular medical image segmentation tasks with
different modalities, where our results consistently outperform others
on both tasks. Our code and models are available at
https://github.com/jcwang123/PFL-Seg-Trans.
AU Kadry, Karim
Olender, Max L
Schuh, Andreas
Karmakar, Abhishek
Petersen, Kersten
Schaap, Michiel
Marlevi, David
UpdePac, Adam
Mizukami, Takuya
Taylor, Charles
Edelman, Elazer R
Nezami, Farhad R
Morphology-based non-rigid registration of coronary computed tomography
and intravascular images through virtual catheter path optimization.
Coronary computed tomography angiography (CCTA) provides 3D information
on obstructive coronary artery disease, but cannot fully visualize
high-resolution features within the vessel wall. Intravascular imaging,
in contrast, can spatially resolve atherosclerosis in cross-sectional
slices, but is limited in capturing 3D relationships between slices.
Co-registering CCTA and intravascular images enables a variety of
clinical research applications but is time-consuming and user-dependent.
This is due to intravascular images suffering from non-rigid distortions
arising from irregularities in the imaging catheter path. To address
these issues, we present a morphology-based framework for the rigid and
non-rigid matching of intravascular images to CCTA images. To do this,
we find the optimal virtual catheter path that samples the coronary
artery in CCTA image space to recapitulate the coronary artery
morphology observed in the intravascular image. We validate our
framework on a multi-center cohort of 40 patients using bifurcation
landmarks as ground truth for longitudinal and rotational registration.
Our registration approach significantly outperforms other approaches for
bifurcation alignment. By providing a differentiable framework for
multi-modal vascular co-registration, our framework reduces the manual
effort required to conduct large-scale multi-modal clinical studies and
enables the development of machine learning-based co-registration
approaches.
EI 1558-254X
DA 2024-10-09
UT MEDLINE:39374277
PM 39374277
ER
AU Leconte, Alexis
Poree, Jonathan
Rauby, Brice
Wu, Alice
Ghigo, Nin
Xing, Paul
Lee, Stephen
Bourquin, Chloe
Ramos-Palacios, Gerardo
Sadikot, Abbas F
Provost, Jean
A Tracking prior to Localization workflow for Ultrasound Localization
Microscopy.
Ultrasound Localization Microscopy (ULM) has proven effective in
resolving microvascular structures and local mean velocities at
sub-diffraction-limited scales, offering high-resolution imaging
capabilities. Dynamic ULM (DULM) enables the creation of angiography or
velocity movies throughout cardiac cycles. Currently, these techniques
rely on a Localization-and-Tracking (LAT) workflow consisting of
detecting microbubbles (MB) in the frames before pairing them to
generate tracks. While conventional LAT methods perform well at low
concentrations, they suffer from longer acquisition times and degraded
localization and tracking accuracy at higher concentrations, leading to
biased angiogram reconstruction and velocity estimation. In this study,
we propose a novel approach to address these challenges by reversing the
current workflow. The proposed method, Tracking-and-Localization (TAL),
relies on first tracking the MB and then performing localization.
Through comprehensive benchmarking using both in silico and in vivo
experiments and employing various metrics to quantify ULM angiography
and velocity maps, we demonstrate that the TAL method consistently
outperforms the reference LAT workflow. Moreover, when applied to DULM,
TAL successfully extracts velocity variations along the cardiac cycle
with improved repeatability. The findings of this work highlight the
effectiveness of the TAL approach in overcoming the limitations of
conventional LAT methods, providing enhanced ULM angiography and
velocity imaging.
AU Song, Xuegang
Shu, Kaixiang
Yang, Peng
Zhao, Cheng
Zhou, Feng
Frangi, Alejandro F
Xiao, Xiaohua
Dong, Lei
Wang, Tianfu
Wang, Shuqiang
Lei, Baiying
Knowledge-aware Multisite Adaptive Graph Transformer for Brain Disorder
Diagnosis.
Brain disorder diagnosis via resting-state functional magnetic resonance
imaging (rs-fMRI) is usually limited due to the complex imaging features
and sample size. For brain disorder diagnosis, the graph convolutional
network (GCN) has achieved remarkable success by capturing interactions
between individuals and the population. However, three main
limitations remain: 1) Previous GCN approaches consider non-imaging
information in edge construction but ignore differences in the
sensitivity of features to non-imaging information. 2) Previous GCN approaches
solely focus on establishing interactions between subjects (i.e.,
individuals and the population), disregarding the essential relationship
between features. 3) Multisite data increase the sample size to help
classifier training, but the inter-site heterogeneity limits the
performance to some extent. This paper proposes a knowledge-aware
multisite adaptive graph Transformer to address the above problems.
First, we evaluate the sensitivity of features to each piece of
non-imaging information, and then construct feature-sensitive and
feature-insensitive subgraphs. Second, after fusing the above subgraphs,
we integrate a Transformer module to capture the intrinsic relationship
between features. Third, we design a domain adaptive GCN using multiple
loss function terms to relieve data heterogeneity and to produce the
final classification results. Last, the proposed framework is validated
on two brain disorder diagnostic tasks. Experimental results show that
the proposed framework can achieve state-of-the-art performance.
AU Chakravarty, Arunava
Emre, Taha
Leingang, Oliver
Riedl, Sophie
Mai, Julia
Scholl, Hendrik P. N.
Sivaprasad, Sobha
Rueckert, Daniel
Lotery, Andrew
Schmidt-Erfurth, Ursula
Bogunovic, Hrvoje
CA PINNACLE Consortium
Morph-SSL: Self-Supervision With Longitudinal Morphing for Forecasting
AMD Progression From OCT Volumes
The lack of reliable biomarkers makes predicting the conversion from
intermediate to neovascular age-related macular degeneration (iAMD,
nAMD) a challenging task. We develop a Deep Learning (DL) model to
predict the future risk of conversion of an eye from iAMD to nAMD from
its current OCT scan. Although eye clinics generate vast amounts of
longitudinal OCT scans to monitor AMD progression, only a small subset
can be manually labeled for supervised DL. To address this issue, we
propose Morph-SSL, a novel Self-supervised Learning (SSL) method for
longitudinal data. It uses pairs of unlabelled OCT scans from different
visits and involves morphing the scan from the previous visit to the
next. The Decoder predicts the transformation for morphing and ensures a
smooth feature manifold that can generate intermediate scans between
visits through linear interpolation. Next, the Morph-SSL trained
features are input to a Classifier which is trained in a supervised
manner to model the cumulative probability distribution of the time to
conversion with a sigmoidal function. Morph-SSL was trained on
unlabelled scans of 399 eyes (3570 visits). The Classifier was evaluated
with a five-fold cross-validation on 2418 scans from 343 eyes with
clinical labels of the conversion date. The Morph-SSL features achieved
an AUC of 0.779 in predicting the conversion to nAMD within the next 6
months, outperforming the same network when trained end-to-end from
scratch or pre-trained with popular SSL methods. Automated prediction of
the future risk of nAMD onset can enable timely treatment and
individualized AMD management.
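As a rough illustration of modeling the cumulative distribution of
the time to conversion with a sigmoidal function, the sketch below
uses a logistic curve. The parameter names location and scale (both
in months) are hypothetical; the abstract does not state the
parameterization the authors actually use.

```python
import math


def conversion_cdf(t_months, location, scale):
    """P(T <= t) under a logistic (sigmoidal) CDF.

    `location` is the time at which the cumulative risk reaches 0.5 and
    `scale` controls how quickly the risk rises; both are assumed to be
    predicted per eye by some classifier (not shown here).
    """
    return 1.0 / (1.0 + math.exp(-(t_months - location) / scale))


# Illustrative values only: risk of conversion within the next 6 months
# for an eye with a predicted location of 12 months and scale of 3 months.
risk_6m = conversion_cdf(6.0, location=12.0, scale=3.0)
print(round(risk_6m, 4))  # 0.1192
```

Because the curve is monotonically increasing in t, a single pair of
parameters gives consistent risk estimates for any horizon.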
AU Noelke, Jan-Hinrich
Adler, Tim J.
Schellenberg, Melanie
Dreher, Kris K.
Holzwarth, Niklas
Bender, Christoph J.
Tizabi, Minu D.
Seitel, Alexander
Maier-Hein, Lena
Photoacoustic Quantification of Tissue Oxygenation Using Conditional
Invertible Neural Networks
Intelligent systems in interventional healthcare depend on the reliable
perception of the environment. In this context, photoacoustic tomography
(PAT) has emerged as a non-invasive, functional imaging modality with
great clinical potential. Current research focuses on converting the
high-dimensional, not human-interpretable spectral data into the
underlying functional information, specifically the blood oxygenation.
One of the largely unexplored issues stalling clinical advances is the
fact that the quantification problem is ambiguous, i.e., that radically
different tissue parameter configurations could lead to almost identical
photoacoustic spectra. In the present work, we tackle this problem with
conditional Invertible Neural Networks (cINNs). Going beyond traditional
point estimates, our network is used to compute an approximation of the
conditional posterior density of tissue parameters given the
photoacoustic spectrum. To this end, an automatic mode detection
algorithm extracts the plausible solution from the sample-based
posterior. According to a comprehensive validation study based on both
synthetic and real images, our approach is well-suited for exploring
ambiguity in quantitative PAT.
AU Ye, Shuquan
Xu, Yan
Chen, Dongdong
Han, Songfang
Liao, Jing
Learning a Single Network for Robust Medical Image Segmentation With
Noisy Labels
Robust segmentation with noisy labels is an important problem in medical
imaging due to the difficulty of acquiring high-quality annotations.
Despite the enormous success of recent methods, they still require
multiple networks to construct their frameworks and focus
on limited application scenarios, which leads to inflexibility in
practical applications. They also do not explicitly consider the coarse
boundary label problem, which leads to sub-optimal results. To
overcome these challenges, we propose a novel Simultaneous Edge
Alignment and Memory-Assisted Learning (SEAMAL) framework for
noisy-label robust segmentation. It achieves single-network robust
learning, which is applicable for both 2D and 3D segmentation, in both
Set-HQ-knowable and Set-HQ-agnostic scenarios. Specifically, to achieve
single-model noise robustness, we design a Memory-assisted Selection and
Correction module (MSC) that utilizes predictive history consistency
from the Prediction Memory Bank to distinguish between reliable and
unreliable labels pixel-wise, and that updates the reliable ones at
the superpixel level. To overcome the coarse boundary label problem,
which is common in practice, and to better utilize shape-relevant
information at the boundary, we propose an Edge Detection Branch (EDB)
that explicitly learns the boundary via an edge detection layer with
only slight additional computational cost, and we improve the sharpness
and precision of the boundary with a thinning loss. Extensive
experiments verify that SEAMAL outperforms previous works significantly.
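The idea of using predictive history consistency to separate reliable
from unreliable labels can be sketched as follows. The rule used here
(a label counts as reliable when the last few stored predictions all
agree with it) is an assumption for illustration, as are the class and
method names; the abstract does not specify SEAMAL's actual criterion,
and the superpixel-level update is not shown.

```python
from collections import deque


class PredictionMemoryBank:
    """Keeps the most recent per-pixel predictions (hypothetical helper)."""

    def __init__(self, history=3):
        self.history = history
        self.bank = deque(maxlen=history)  # oldest entries drop out automatically

    def update(self, predicted_labels):
        """Store one flattened prediction map, e.g. once per training epoch."""
        self.bank.append(list(predicted_labels))

    def reliable_mask(self, noisy_labels):
        """Mark a pixel reliable when every stored prediction matches its label."""
        if len(self.bank) < self.history:
            # Not enough history yet: treat nothing as reliable.
            return [False] * len(noisy_labels)
        return [
            all(pred[i] == label for pred in self.bank)
            for i, label in enumerate(noisy_labels)
        ]


# Illustrative data: four pixels, three stored prediction maps.
memory = PredictionMemoryBank(history=3)
for prediction in ([0, 1, 1, 0], [0, 1, 0, 0], [0, 1, 1, 0]):
    memory.update(prediction)

mask = memory.reliable_mask([0, 1, 1, 1])
print(mask)  # [True, True, False, False]
```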
AU Dai, Tianjie
Zhang, Ruipeng
Hong, Feng
Yao, Jiangchao
Zhang, Ya
Wang, Yanfeng
UniChest: Conquer-and-Divide Pre-Training for Multi-Source Chest X-Ray
Classification
Vision-Language Pre-training (VLP), which utilizes multi-modal
information to improve training efficiency and effectiveness, has
achieved great success in visual recognition in natural domains and
shown promise in medical imaging diagnosis from chest X-rays (CXRs).
However, current work mainly explores single CXR datasets, which
limits the potential of this powerful paradigm on larger hybrids of
multi-source CXR datasets. We find that although blending samples from
diverse sources can improve model generalization, it remains
challenging to maintain consistently superior performance on the task
of each source because of the heterogeneity among sources. To handle
this dilemma, we design a Conquer-and-Divide pre-training framework,
termed UniChest, which aims to make full use of the collaborative
benefit of multiple sources of CXRs while reducing the negative
influence of source heterogeneity. Specifically, the "Conquer" stage in
UniChest encourages the model to sufficiently capture multi-source
common patterns, and the "Divide" stage helps squeeze personalized
patterns into different small experts (query networks). We conduct
thorough experiments on many benchmarks, e.g., ChestX-ray14, CheXpert,
Vindr-CXR, Shenzhen, Open-I and SIIM-ACR Pneumothorax, verifying the
effectiveness of UniChest over a range of baselines, and release our
code and pre-trained models at
https://github.com/Elfenreigen/UniChest.
AU Chen, Ming
Bian, Yijun
Chen, Nanguang
Qiu, Anqi
Orthogonal Mixed-Effects Modeling for High-Dimensional Longitudinal
Data: An Unsupervised Learning Approach.
The linear mixed-effects model is commonly utilized to interpret
longitudinal data, characterizing both the global longitudinal
trajectory across all observations and longitudinal trajectories within
individuals. However, characterizing these trajectories in
high-dimensional longitudinal data presents a challenge. To address
this, our study proposes a novel approach, Unsupervised Orthogonal
Mixed-Effects Trajectory Modeling (UOMETM), that leverages unsupervised
learning to generate latent representations of both global and
individual trajectories. We design an autoencoder with a latent space
where an orthogonal constraint is imposed to separate the space of
global trajectories from individual trajectories. We also devise a
cross-reconstruction loss to ensure consistency of global trajectories
and enhance the orthogonality between representation spaces. To evaluate
UOMETM, we conducted simulation experiments on images to verify that
every component functions as intended. Furthermore, we evaluated its
performance and robustness using longitudinal brain cortical thickness
from two Alzheimer's disease (AD) datasets. Comparative analyses with
state-of-the-art methods revealed UOMETM's superiority in identifying
global and individual longitudinal patterns, achieving a lower
reconstruction error, superior orthogonality, and higher accuracy in AD
classification and conversion forecasting. Remarkably, we found that the
space of global trajectories did not significantly contribute to AD
classification compared to the space of individual trajectories,
emphasizing their clear separation. Moreover, our model exhibited
satisfactory generalization and robustness across different datasets.
The study shows the outstanding performance and potential clinical use
of UOMETM in the context of longitudinal data analysis.
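One common way to express an orthogonal constraint between two latent
spaces is to penalize the inner products between their vectors. The
sketch below assumes that form; the function name and the exact
penalty are illustrative, and UOMETM's actual loss may differ.

```python
def orthogonality_penalty(global_codes, individual_codes):
    """Sum of squared inner products over all global/individual vector pairs.

    The value is zero exactly when every global vector is orthogonal to
    every individual vector, so minimizing it pushes the two latent
    spaces apart. Equivalent to the squared Frobenius norm of G @ I.T.
    """
    return sum(
        sum(g * z for g, z in zip(g_vec, z_vec)) ** 2
        for g_vec in global_codes
        for z_vec in individual_codes
    )


# Illustrative vectors only.
separated = orthogonality_penalty([[1.0, 0.0, 0.0]], [[0.0, 1.0, 0.0], [0.0, 0.0, 2.0]])
overlapping = orthogonality_penalty([[1.0, 1.0, 0.0]], [[1.0, 0.0, 0.0]])
print(separated, overlapping)  # 0.0 1.0
```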
AU Liu, Jiaxuan
Li, Haitao
Zeng, Bolun
Wang, Huixiang
Kikinis, Ron
Joskowicz, Leo
Chen, Xiaojun
An end-to-end geometry-based pipeline for automatic preoperative
surgical planning of pelvic fracture reduction and fixation.
Computer-assisted preoperative planning of pelvic fracture reduction
surgery has the potential to increase the accuracy of the surgery and to
reduce complications. However, the diversity of pelvic fractures and
the disturbance of small fracture fragments make reliable automatic
preoperative planning highly challenging. In this paper, we
present a comprehensive and automatic preoperative planning pipeline for
pelvic fracture surgery. It includes pelvic fracture labeling, reduction
planning of the fracture, and customized screw implantation. First,
automatic bone fracture labeling is performed based on the separation of
the fracture sections. Then, fracture reduction planning is performed
based on automatic extraction and pairing of the fracture surfaces.
Finally, screw implantation is planned using the adjoint fracture
surfaces. The proposed pipeline was tested on different types of pelvic
fracture in 14 clinical cases. Our method achieved a translational and
rotational accuracy of 2.56 mm and 3.31° in reduction planning. For
fixation planning, a clinical acceptance rate of 86.7% was achieved. The
results demonstrate the feasibility of the clinical application of our
method. Our method has shown accuracy and reliability for complex
multi-body bone fractures, which may provide effective clinical
preoperative guidance and may improve the accuracy of pelvic fracture
reduction surgery.
AU Karageorgos, Grigorios M
Zhang, Jiayong
Peters, Nils
Xia, Wenjun
Niu, Chuang
Paganetti, Harald
Wang, Ge
De Man, Bruno
A denoising diffusion probabilistic model for metal artifact reduction
in CT.
The presence of metal objects leads to corrupted CT projection
measurements, resulting in metal artifacts in the reconstructed CT
images. AI promises to offer improved solutions to estimate missing
sinogram data for metal artifact reduction (MAR), as previously shown
with convolutional neural networks (CNNs) and generative adversarial
networks (GANs). Recently, denoising diffusion probabilistic models
(DDPM) have shown great promise in image generation tasks, potentially
outperforming GANs. In this study, a DDPM-based approach is proposed for
inpainting of missing sinogram data for improved MAR. The proposed model
is unconditionally trained, free from information on metal objects,
which can potentially enhance its generalization capabilities across
different types of metal implants compared to conditionally trained
approaches. The performance of the proposed technique was evaluated and
compared to the state-of-the-art normalized MAR (NMAR) approach as well
as to CNN-based and GAN-based MAR approaches. The DDPM-based approach
provided significantly higher SSIM and PSNR, as compared to NMAR (SSIM:
p < 10^-26; PSNR: p < 10^-21), the CNN (SSIM: p < 10^-25; PSNR: p < 10^-9)
and the GAN (SSIM: p < 10^-6; PSNR: p < 0.05) methods. The DDPM-MAR
technique was further evaluated based on clinically relevant image
quality metrics on clinical CT images with virtually introduced metal
objects and metal artifacts, demonstrating superior quality relative to
the other three models. In general, the AI-based techniques showed
improved MAR performance compared to the non-AI-based NMAR approach. The
proposed methodology shows promise in enhancing the effectiveness of
MAR, and therefore improving the diagnostic accuracy of CT.
AU Zhou, Jie
Jie, Biao
Wang, Zhengdong
Zhang, Zhixiang
Du, Tongchun
Bian, Weixin
Yang, Yang
Jia, Jun
LCGNet: Local Sequential Feature Coupling Global Representation Learning
for Functional Connectivity Network Analysis with fMRI.
Analysis of functional connectivity networks (FCNs) derived from
resting-state functional magnetic resonance imaging (rs-fMRI) has
greatly advanced our understanding of brain diseases, including
Alzheimer's disease (AD) and attention deficit hyperactivity disorder
(ADHD). Advanced machine learning techniques, such as convolutional
neural networks (CNNs), have been used to learn high-level feature
representations of FCNs for automated brain disease classification. Even
though convolution operations in CNNs are good at extracting local
properties of FCNs, they generally fail to capture global temporal
representations of FCNs. Recently, the transformer technique has
demonstrated remarkable performance in various tasks, which is
attributed to its effective self-attention mechanism in capturing the
global temporal feature representations. However, it cannot effectively
model the local network characteristics of FCNs. To this end, in this
paper, we propose a novel network structure for Local sequential feature
Coupling Global representation learning (LCGNet) to take advantage of
convolutional operations and self-attention mechanisms for enhanced FCN
representation learning. Specifically, we first build a dynamic FCN for
each subject using an overlapped sliding window approach. We then
construct three sequential components (i.e., edge-to-vertex layer,
vertex-to-network layer, and network-to-temporality layer) with dual
backbone branches (a CNN and a transformer) that extract and couple
topological information of brain networks from local to global scales. Experimental
results on two real datasets (i.e., ADNI and ADHD-200) with rs-fMRI data
show the superiority of our LCGNet.
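The overlapped sliding window step mentioned in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the window length, the stride, or the connectivity measure, so `window`, `stride`, and the use of Pearson correlation are hypothetical choices, and `dynamic_fcn` is a made-up helper name.

```python
import numpy as np

def dynamic_fcn(ts, window, stride):
    """Build one correlation matrix per sliding window.

    ts: array of shape (T, R) with T time points and R brain regions.
    Windows overlap whenever stride < window.
    Pearson correlation is an assumed connectivity measure.
    """
    ts = np.asarray(ts, dtype=float)
    n_time = ts.shape[0]
    starts = range(0, n_time - window + 1, stride)
    # np.corrcoef expects variables in rows, hence the transpose.
    return np.stack([np.corrcoef(ts[s:s + window].T) for s in starts])

# Toy data standing in for an rs-fMRI scan: 100 time points, 5 regions.
rng = np.random.default_rng(0)
toy_ts = rng.standard_normal((100, 5))
networks = dynamic_fcn(toy_ts, window=30, stride=10)
```

Each slice `networks[k]` is a symmetric region-by-region matrix, so the stack forms a time-ordered sequence of networks for one subject.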
AU Chabouh, Georges
Denis, Louise
Bodard, Sylvain
Lager, Franck
Renault, Gilles
Chavignon, Arthur
Couture, Olivier
Whole organ volumetric sensing Ultrasound Localization Microscopy for
characterization of kidney structure.
Glomeruli are the filtration units of the kidney and their function
relies heavily on their microcirculation. Despite its obvious diagnostic
importance, an accurate estimation of blood flow in the capillary bundle
within glomeruli defies the resolution of conventional imaging
modalities. Ultrasound Localization Microscopy (ULM) has demonstrated
its ability to image in-vivo deep organs in the body. Recently, the
concept of sensing ULM or sULM was introduced to classify individual
microbubble behavior based on the expected physiological conditions at
the micrometric scale. In the kidney of both rats and humans, it
revealed glomerular structures in 2D but was severely limited by planar
projection. In this work, we aim to extend sULM to 3D in order to image
the whole organ and accurately characterize the entire kidney
structure. The extension of sULM into the 3D domain allows better
localization and more robust tracking. The 3D metrics of velocity and
pathway angular shift made a glomerular mask possible. This approach
facilitated the quantification of glomerular physiological parameters
such as an interior traveled distance of approximately 7.5 ± 0.6 microns
within the glomerulus. This study introduces a technique that
characterizes kidney physiology and can serve as a method to
facilitate pathology assessment. Furthermore, its potential for clinical
relevance could serve as a bridge between research and practical
application, leading to innovative diagnostics and improved patient
care.
AU Yang, Bao
Gong, Kuang
Liu, Huafeng
Li, Quanzheng
Zhu, Wentao
Anatomically Guided PET Image Reconstruction Using Conditional
Weakly-Supervised Multi-Task Learning Integrating Self-Attention
To address the lack of high-quality training labels in positron emission
tomography (PET) imaging, weakly-supervised reconstruction methods that
generate network-based mappings between prior images and noisy targets
have been developed. However, the learned model has an intrinsic
variance proportional to the average variance of the target image. To
suppress noise and improve the accuracy and generalizability of the
learned model, we propose a conditional weakly-supervised multi-task
learning (MTL) strategy, in which an auxiliary task is introduced
serving as an anatomical regularizer for the PET reconstruction main
task. In the proposed MTL approach, we devise a novel multi-channel
self-attention (MCSA) module that helps learn an optimal combination of
shared and task-specific features by capturing both local and global
channel-spatial dependencies. The proposed reconstruction method was
evaluated on NEMA phantom PET datasets acquired at different positions
in a PET/CT scanner and 26 clinical whole-body PET datasets. The phantom
results demonstrate that our method outperforms state-of-the-art
learning-free and weakly-supervised approaches obtaining the best
noise/contrast tradeoff with a significant noise reduction of
approximately 50.0% relative to the maximum likelihood (ML)
reconstruction. The patient study results demonstrate that our method
achieves the largest noise reductions of 67.3% and 35.5% in the liver
and lung, respectively, as well as consistently small biases in 8 tumors
with various volumes and intensities. In addition, network visualization
reveals that adding the auxiliary task introduces more anatomical
information into PET reconstruction than adding only the anatomical
loss, and the developed MCSA can abstract features and retain PET image
details.
AU Payen, Thomas
Crouzet, Sebastien
Guillen, Nicolas
Chen, Yao
Chapelon, Jean-Yves
Lafon, Cyril
Catheline, Stefan
Passive Elastography for Clinical HIFU Lesion Detection
High-intensity Focused Ultrasound (HIFU) is a promising treatment
modality for a wide range of pathologies including prostate cancer.
However, the lack of a reliable ultrasound-based monitoring technique
limits its clinical use. Ultrasound currently provides real-time HIFU
planning, but its use for monitoring is usually limited to detecting the
backscatter increase resulting from chaotic bubble appearance. HIFU has
been shown to generate stiffening in various tissues, so elastography is
an interesting lead for ablation monitoring. However, the standard
techniques usually require the generation of a controlled push which can
be problematic in deeper organs. Passive elastography offers a potential
alternative as it uses the physiological wave field to estimate the
elasticity in tissues and not an external perturbation. This technique
was adapted to process B-mode images acquired with a clinical system. It
was first shown to faithfully assess elasticity in calibrated phantoms.
The technique was then implemented on the Focal One (R) clinical system
to evaluate its capacity to detect HIFU lesions, both in vitro (CNR = 9.2
dB), showing its independence from the bubbles resulting from HIFU, and
in vivo, where the physiological wave field was successfully used to
detect and delineate lesions of different sizes in porcine liver.
Finally, the technique was performed for the very first time in four
prostate cancer patients, showing strong variation in elasticity before
and after HIFU treatment (average variation of 33.0 +/- 16.0%). Passive
elastography has shown evidence of its potential to monitor HIFU
treatment and thus help spread its use.
AU Moazami, Saeed
Ray, Deep
Pelletier, Daniel
Oberai, Assad A.
Probabilistic Brain Extraction in MR Images via Conditional Generative
Adversarial Networks
Brain extraction, or the task of segmenting the brain in MR images,
forms an essential step for many neuroimaging applications. These
include quantifying brain tissue volumes, monitoring neurological
diseases, and estimating brain atrophy. Several algorithms have been
proposed for brain extraction, including image-to-image deep learning
methods that have demonstrated significant gains in accuracy. However,
none of them account for the inherent uncertainty in brain extraction.
Motivated by this, we propose a novel, probabilistic deep learning
algorithm for brain extraction that recasts this task as a Bayesian
inference problem and utilizes a conditional generative adversarial
network (cGAN) to solve it. The input to the cGAN's generator is an MR
image of the head, and the output is a collection of likely brain images
drawn from a probability density conditioned on the input. These images
are used to generate a pixel-wise mean image, serving as the estimate
for the extracted brain, and a standard deviation image, which
quantifies the uncertainty in the prediction. We test our algorithm on
head MR images from five datasets: NFBS, CC359, LPBA, IBSR, and their
combination. Our datasets are heterogeneous regarding multiple factors,
including subjects (with and without symptoms), magnetic field
strengths, and manufacturers. Our experiments demonstrate that the
proposed approach is more accurate and robust than a widely used brain
extraction tool and at least as accurate as the other deep learning
methods. They also highlight the utility of quantifying uncertainty in
downstream applications.
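The pixel-wise mean and standard deviation images described in the abstract can be illustrated with a short sketch. It assumes the generator's samples are already available as a stacked array; the random `toy_samples` below merely stand in for cGAN outputs, and `summarize_samples` is a hypothetical helper, not the authors' code.

```python
import numpy as np

def summarize_samples(samples):
    """Reduce a stack of sampled brain images to two summary images.

    samples: array of shape (n_samples, H, W), one image per draw from
    the conditional distribution (assumed to come from the generator).
    Returns the pixel-wise mean (the brain estimate) and the pixel-wise
    standard deviation (the uncertainty map).
    """
    samples = np.asarray(samples, dtype=float)
    mean_image = samples.mean(axis=0)
    std_image = samples.std(axis=0)
    return mean_image, std_image

# Toy stand-in for 16 generator draws on a 4 x 4 image.
rng = np.random.default_rng(0)
toy_samples = rng.random((16, 4, 4))
mean_image, std_image = summarize_samples(toy_samples)
```

Pixels where the draws disagree receive a large standard deviation, which is what makes the second image usable as an uncertainty map.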
AU Wang, Yijun
Lang, Rui
Li, Rui
Zhang, Junsong
NRTR: Neuron Reconstruction With Transformer From 3D Optical Microscopy
Images
Neuron reconstruction from raw Optical Microscopy (OM) image stacks
is the basis of neuroscience. Manual annotation and semi-automatic
neuron tracing algorithms are time-consuming and inefficient. Existing
deep learning neuron reconstruction methods, although demonstrating
exemplary performance, rely heavily on complex rule-based components.
Therefore, a crucial challenge is designing an end-to-end neuron
reconstruction method that makes the overall framework simpler and model
training easier. We propose a Neuron Reconstruction Transformer (NRTR)
that, discarding the complex rule-based components, views neuron
reconstruction as a direct set-prediction problem. To the best of our
knowledge, NRTR is the first image-to-set deep learning model for
end-to-end neuron reconstruction. The overall pipeline consists of the
CNN backbone, Transformer encoder-decoder, and connectivity construction
module. NRTR generates a point set representing neuron morphological
characteristics for raw neuron images. The relationships among the
points are established through connectivity construction. The point set
is saved as a standard SWC file. In experiments using the BigNeuron and
VISoR-40 datasets, NRTR achieves excellent neuron reconstruction results
for comprehensive benchmarks and outperforms competitive baselines.
Results of extensive experiments indicate that viewing neuron
reconstruction as a set-prediction problem is effective and makes
end-to-end model training possible.
AU de Vente, Coen
Vermeer, Koenraad A.
Jaccard, Nicolas
Wang, He
Sun, Hongyi
Khader, Firas
Truhn, Daniel
Aimyshev, Temirgali
Zhanibekuly, Yerkebulan
Le, Tien-Dung
Galdran, Adrian
Ballester, Miguel Angel Gonzalez
Carneiro, Gustavo
Devika, R. G.
Sethumadhavan, Hrishikesh Panikkasseril
Puthussery, Densen
Liu, Hong
Yang, Zekang
Kondo, Satoshi
Kasai, Satoshi
Wang, Edward
Durvasula, Ashritha
Heras, Jonathan
Zapata, Miguel Angel
Araujo, Teresa
Aresta, Guilherme
Bogunovic, Hrvoje
Arikan, Mustafa
Lee, Yeong Chan
Cho, Hyun Bin
Choi, Yoon Ho
Qayyum, Abdul
Razzak, Imran
van Ginneken, Bram
Lemij, Hans G.
Sanchez, Clara I.
AIROGS: Artificial Intelligence for Robust Glaucoma Screening Challenge
The early detection of glaucoma is essential in preventing visual
impairment. Artificial intelligence (AI) can be used to analyze color
fundus photographs (CFPs) in a cost-effective manner, making glaucoma
screening more accessible. While AI models for glaucoma screening from
CFPs have shown promising results in laboratory settings, their
performance decreases significantly in real-world scenarios due to the
presence of out-of-distribution and low-quality images. To address this
issue, we propose the Artificial Intelligence for Robust Glaucoma
Screening (AIROGS) challenge. This challenge includes a large dataset of
around 113,000 images from about 60,000 patients and 500 different
screening centers, and encourages the development of algorithms that are
robust to ungradable and unexpected input data. We evaluated solutions
from 14 teams in this paper and found that the best teams performed
similarly to a set of 20 expert ophthalmologists and optometrists. The
highest-scoring team achieved an area under the receiver operating
characteristic curve of 0.99 (95% CI: 0.98-0.99) for detecting
ungradable images on-the-fly. Additionally, many of the algorithms
showed robust performance when tested on three other publicly available
datasets. These results demonstrate the feasibility of robust AI-enabled
glaucoma screening.
AU Gungor, Alper
Askin, Baris
Soydan, Damla Alptekin
Top, Can Baris
Saritas, Emine Ulku
Cukur, Tolga
DEQ-MPI: A Deep Equilibrium Reconstruction With Learned Consistency for
Magnetic Particle Imaging
Magnetic particle imaging (MPI) offers unparalleled contrast and
resolution for tracing magnetic nanoparticles. A common imaging
procedure calibrates a system matrix (SM) that is used to reconstruct
data from subsequent scans. The ill-posed reconstruction problem can be
solved by simultaneously enforcing data consistency based on the SM and
regularizing the solution based on an image prior. Traditional
hand-crafted priors cannot capture the complex attributes of MPI images,
whereas recent MPI methods based on learned priors can suffer from
extensive inference times or limited generalization performance. Here,
we introduce a novel physics-driven method for MPI reconstruction based
on a deep equilibrium model with learned data consistency (DEQ-MPI).
DEQ-MPI reconstructs images by embedding neural networks into an
iterative optimization, as inspired by unrolling methods in deep
learning. Yet, conventional unrolling methods are computationally
restricted to a few iterations, resulting in non-convergent solutions, and
they use hand-crafted consistency measures that can yield suboptimal
capture of the data distribution. DEQ-MPI instead trains an implicit
mapping to maximize the quality of a convergent solution, and it
incorporates a learned consistency measure to better account for the
data distribution. Demonstrations on simulated and experimental data
indicate that DEQ-MPI achieves superior image quality and competitive
inference time compared with state-of-the-art MPI reconstruction methods.
AU Hahne, Christopher
Chabouh, Georges
Chavignon, Arthur
Couture, Olivier
Sznitman, Raphael
RF-ULM: Ultrasound Localization Microscopy Learned From Radio-Frequency
Wavefronts
In Ultrasound Localization Microscopy (ULM), achieving high-resolution
images relies on the precise localization of contrast agent particles
across a series of beamformed frames. However, our study uncovers an
enormous potential: The process of delay-and-sum beamforming leads to an
irreversible reduction of Radio-Frequency (RF) channel data, while its
implications for localization remain largely unexplored. The rich
contextual information embedded within RF wavefronts, including their
hyperbolic shape and phase, offers great promise for guiding Deep Neural
Networks (DNNs) in challenging localization scenarios. To fully exploit
this data, we propose to directly localize scatterers in RF channel
data. Our approach involves a custom super-resolution DNN using learned
feature channel shuffling, non-maximum suppression, and a semi-global
convolutional block for reliable and accurate wavefront localization.
Additionally, we introduce a geometric point transformation that
facilitates seamless mapping to the B-mode coordinate space. To
understand the impact of beamforming on ULM, we validate the
effectiveness of our method by conducting an extensive comparison with
State-Of-The-Art (SOTA) techniques. We present the inaugural in vivo
results from a wavefront-localizing DNN, highlighting its real-world
practicality. Our findings show that RF-ULM bridges the domain shift
between synthetic and real datasets, offering a considerable advantage
in terms of precision and complexity. To enable the broader research
community to benefit from our findings.
AU Zhu, Tao
Yin, Lin
He, Jie
Wei, Zechen
Yang, Xin
Tian, Jie
Hui, Hui
Accurate Concentration Recovery for Quantitative Magnetic Particle
Imaging Reconstruction via Nonconvex Regularization
Magnetic particle imaging (MPI) uses nonlinear response signals to
noninvasively detect magnetic nanoparticles in space, and its
quantitative properties hold promise for future precise quantitative
treatments. In reconstruction, the system-matrix-based method
necessitates suitable regularization terms, such as Tikhonov or
non-negative fused lasso (NFL) regularization, to stabilize the
solution. While NFL regularization offers clearer edge information than
Tikhonov regularization, its $l_1$ penalty yields a biased estimate,
leading to an underestimation of the
reconstructed concentration and adversely affecting the quantitative
properties. In this paper, a new nonconvex regularization method
including min-max concave (MC) and total variation (TV) regularization
is proposed. This method utilizes the MC penalty to provide nearly unbiased
sparse constraints and adds the TV penalty to provide a uniform
intensity distribution of images. By combining the alternating direction
method of multipliers (ADMM) and the two-step parameter selection
method, a more accurate quantitative MPI reconstruction was realized.
The performance of the proposed method was verified on the simulation
data, the Open-MPI dataset, and measured data from a homemade MPI
scanner. The results indicate that the proposed method achieves better
image quality while maintaining the quantitative properties, thus
overcoming the drawback of intensity underestimation by the NFL method
while providing edge information. In particular, for the measured data,
the proposed method reduced the relative error in the intensity of the
reconstruction results from 28% to 8%.
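The contrast between the biased $l_1$ penalty and the nearly unbiased MC penalty can be shown with a small sketch. It uses the standard textbook form of the minimax concave penalty and its firm-thresholding proximal operator (unit step size, `gamma > 1`); the abstract does not state the exact formulation used in the paper, so the parameter names `lam` and `gamma` and all three function names are assumptions.

```python
import numpy as np

def mc_penalty(x, lam, gamma):
    """Minimax concave penalty: quadratic near zero, constant beyond gamma*lam."""
    a = np.abs(x)
    return np.where(a <= gamma * lam,
                    lam * a - a**2 / (2.0 * gamma),
                    0.5 * gamma * lam**2)

def soft_threshold(x, lam):
    """Proximal operator of the l1 penalty: shrinks every value by lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def firm_threshold(x, lam, gamma):
    """Proximal operator of the MC penalty (requires gamma > 1)."""
    a = np.abs(x)
    middle = np.sign(x) * (a - lam) / (1.0 - 1.0 / gamma)
    return np.where(a <= lam, 0.0, np.where(a <= gamma * lam, middle, x))

values = np.array([0.5, 1.5, 5.0])
l1_result = soft_threshold(values, lam=1.0)
mc_result = firm_threshold(values, lam=1.0, gamma=2.0)
```

With these inputs the large value 5.0 is shrunk to 4.0 by soft thresholding but left unchanged by firm thresholding, which illustrates why an $l_1$-type penalty tends to underestimate intensity while the MC penalty does not.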
AU Chen, Kecheng
Qin, Tiexin
Lee, Victor Ho-Fun
Yan, Hong
Li, Haoliang
Learning Robust Shape Regularization for Generalizable Medical Image
Segmentation
Generalizable medical image segmentation enables models to generalize to
unseen target domains under domain shift. Recent progress
demonstrates that the shape of the segmentation objective, with its high
consistency and robustness across domains, can serve as a reliable
regularization to aid the model for better cross-domain performance,
where existing methods typically seek a shared framework to render
segmentation maps and shape prior concurrently. However, due to the
inherent texture and style preference of modern deep neural networks,
the edge or silhouette of the extracted shape will inevitably be
undermined by those domain-specific texture and style interferences of
medical images under domain shifts. To address this limitation, we
devise a novel framework with a separation between the shape
regularization and the segmentation map. Specifically, we first
customize a novel whitening transform-based probabilistic shape
regularization extractor namely WT-PSE to suppress undesirable
domain-specific texture and style interferences, leading to more robust
and high-quality shape representations. Second, we deliver a Wasserstein
distance-guided knowledge distillation scheme to help the WT-PSE to
achieve more flexible shape extraction during the inference phase.
Finally, by incorporating domain knowledge of medical images, we propose
a novel instance-domain whitening transform method to facilitate a more
stable training process with improved performance. Experiments
demonstrate the performance of our proposed method on both multi-domain
and single-domain generalization.
AU Huang, Zixun
Zhao, Rui
Leung, Frank H. F.
Banerjee, Sunetra
Lam, Kin-Man
Zheng, Yong-Ping
Ling, Sai Ho
Landmark Localization From Medical Images With Generative Distribution
Prior
In medical image analysis, anatomical landmarks usually contain strong
prior knowledge of their structural information. In this paper, we
propose to promote medical landmark localization by modeling the
underlying landmark distribution via normalizing flows. Specifically, we
introduce the flow-based landmark distribution prior as a learnable
objective function into a regression-based landmark localization
framework. Moreover, we employ an integral operation to make the mapping
from heatmaps to coordinates differentiable to further enhance
heatmap-based localization with the learned distribution prior. Our
proposed Normalizing Flow-based Distribution Prior (NFDP) employs a
straightforward backbone and non-problem-tailored architecture (i.e.,
ResNet18), which delivers high-fidelity outputs across three X-ray-based
landmark localization datasets. Remarkably, the proposed NFDP can do the
job with minimal additional computational burden, as the normalizing
flows module is detached from the framework at inference. As compared
to existing techniques, our proposed NFDP provides a superior balance
between prediction accuracy and inference speed, making it a highly
efficient and effective approach. The source code of this paper is
available at https://github.com/jacksonhzx95/NFDP.
AU Tajbakhsh, Kiarash
Stanowska, Olga
Neels, Antonia
Perren, Aurel
Zboray, Robert
3D Virtual Histopathology by Phase-Contrast X-Ray Micro-CT for
Follicular Thyroid Neoplasms
Histological analysis is the core of follicular thyroid carcinoma (FTC)
classification. The histopathological criteria of capsular and vascular
invasion define malignancy and aggressiveness of FTC. Analysis of
multiple sections is cumbersome and as only a minute tissue fraction is
analyzed during histopathology, under-sampling remains a problem.
Application of an efficient tool for complete tissue imaging in 3D would
speed up diagnosis and increase accuracy. We show that X-ray
propagation-based imaging (XPBI) of paraffin-embedded tissue blocks is a
valuable complementary method for follicular thyroid carcinoma diagnosis
and assessment. It enables a fast, non-destructive and accurate 3D
virtual histology of the FTC resection specimen. We demonstrate that
XPBI virtual slices can reliably evaluate capsular invasions. Then we
discuss the morphological information accessible from XPBI and its
significance for vascular invasion diagnosis. We show 3D morphological
information that allows vascular invasions to be discerned. The results are
validated by comparing XPBI images with clinically accepted histology
slides revised by and under supervision of two experienced endocrine
pathologists.
AU Cui, Jiaqi
Zeng, Pinxian
Zeng, Xinyi
Xu, Yuanyuan
Wang, Peng
Zhou, Jiliu
Wang, Yan
Shen, Dinggang
Prior Knowledge-guided Triple-Domain Transformer-GAN for Direct PET
Reconstruction from Low-Count Sinograms.
To obtain high-quality positron emission tomography (PET) images while
minimizing radiation exposure, numerous methods have been dedicated to
acquiring standard-count PET (SPET) from low-count PET (LPET). However,
current methods have failed to take full advantage of the different
emphasized information from multiple domains, i.e., the sinogram, image,
and frequency domains, resulting in the loss of crucial details.
Meanwhile, they overlook the unique inner structure of the sinograms,
thereby failing to fully capture their structural characteristics and
relationships. To alleviate these problems, in this paper we propose
PK-TriDo, a prior knowledge-guided transformer-GAN that unites the triple
domains of sinogram, image, and frequency to directly reconstruct SPET
images from LPET sinograms. Our PK-TriDo consists of a Sinogram
Inner-Structure-based Denoising Transformer (SISD-Former) to denoise the
input LPET sinogram, a Frequency-adapted Image Reconstruction
Transformer (FaIR-Former) to reconstruct high-quality SPET images from
the denoised sinograms guided by the image domain prior knowledge, and
an Adversarial Network (AdvNet) to further enhance the reconstruction
quality via adversarial training. Specifically tailored for the PET
imaging mechanism, we injected a sinogram embedding module that
partitions the sinograms by rows and columns to obtain 1D sequences of
angles and distances to faithfully preserve the inner-structure of the
sinograms. Moreover, to mitigate high-frequency distortions and enhance
reconstruction details, we integrated global-local frequency parsers
(GLFPs) into FaIR-Former to calibrate the distributions and proportions
of different frequency bands, thus compelling the network to preserve
high-frequency details. Evaluations on three datasets with different
dose levels and imaging scenarios demonstrated that our PK-TriDo
outperforms the state-of-the-art methods.
AU Huang, Bangyan
Li, Tiantian
Arino-Estrada, Gerard
Dulski, Kamil
Shopa, Roman Y.
Moskal, Pawel
Stepien, Ewa
Qi, Jinyi
SPLIT: Statistical Positronium Lifetime Image Reconstruction via
Time-Thresholding
Positron emission tomography (PET) is a widely utilized medical imaging
modality that uses positron-emitting radiotracers to visualize
biochemical processes in a living body. The spatiotemporal distribution
of a radiotracer is estimated by detecting the coincidence photon pairs
generated through positron annihilations. In human tissue, about 40% of
the positrons form positroniums prior to the annihilation. The lifetime
of these positroniums is influenced by the microenvironment in the
tissue and could provide valuable information for better understanding
of disease progression and treatment response. Currently, there are few
methods available for reconstructing high-resolution lifetime images in
practical applications. This paper presents an efficient statistical
image reconstruction method for positronium lifetime imaging (PLI). We
also analyze the random triple-coincidence events in PLI and propose a
correction method for random events, which is essential for real
applications. Both simulation and experimental studies demonstrate that
the proposed method can produce lifetime images with high numerical
accuracy, low variance, and resolution comparable to that of the
activity images generated by a PET scanner with currently available
time-of-flight resolution.
AU Wu, Qian
Chen, Yufei
Liu, Wei
Yue, Xiaodong
Zhuang, Xiahai
Deep Closing: Enhancing Topological Connectivity in Medical Tubular
Segmentation.
Accurately segmenting tubular structures, such as blood vessels or
nerves, holds significant clinical implications across various medical
applications. However, existing methods often exhibit limitations in
achieving satisfactory topological performance, particularly in terms of
preserving connectivity. To address this challenge, we propose a novel
deep-learning approach, termed Deep Closing, inspired by the
well-established classic closing operation. Deep Closing first leverages
an AutoEncoder trained in the Masked Image Modeling (MIM) paradigm,
enhanced with digital topology knowledge, to effectively learn the
inherent shape prior of tubular structures and indicate potential
disconnected regions. Subsequently, a Simple Components Erosion module
is employed to generate topology-focused outcomes, refining the
preceding segmentation results and ensuring that all the generated regions are
topologically significant. To evaluate the efficacy of Deep Closing, we
conduct comprehensive experiments on 4 datasets: DRIVE, CHASE DB1, DCA1,
and CREMI. The results demonstrate that our approach yields considerable
improvements in topological performance compared with existing methods.
Furthermore, Deep Closing exhibits the ability to generalize and
transfer knowledge from external datasets, showcasing its robustness and
adaptability. The code for this paper is available at:
https://github.com/5k5000/DeepClosing.
AU Cao, Chentao
Cui, Zhuo-Xu
Wang, Yue
Liu, Shaonan
Chen, Taijin
Zheng, Hairong
Liang, Dong
Zhu, Yanjie
High-Frequency Space Diffusion Model for Accelerated MRI
Diffusion models with continuous stochastic differential equations
(SDEs) have shown superior performance in image generation. They can
serve as deep generative priors for solving the inverse problem in
magnetic resonance (MR) reconstruction. However, low-frequency regions
of k-space data are typically fully sampled in fast MR imaging, while
existing diffusion models are performed throughout the entire image or
k-space, inevitably introducing uncertainty in the reconstruction of
low-frequency regions. Additionally, existing diffusion models often
demand substantial iterations to converge, resulting in time-consuming
reconstructions. To address these challenges, we propose a novel SDE
tailored specifically for MR reconstruction with the diffusion process
in high-frequency space (referred to as HFS-SDE). This approach ensures
determinism in the fully sampled low-frequency regions and accelerates
the sampling procedure of reverse diffusion. Experiments conducted on
the publicly available fastMRI dataset demonstrate that the proposed
HFS-SDE method outperforms traditional parallel imaging methods,
supervised deep learning, and existing diffusion models in terms of
reconstruction accuracy and stability. The fast convergence properties
are also confirmed through theoretical and experimental validation.
AU Liu, Min
Han, Yubin
Wang, Jiazheng
Wang, Can
Wang, Yaonan
Meijering, Erik
LSKANet: Long Strip Kernel Attention Network for Robotic Surgical Scene
Segmentation
Surgical scene segmentation is a critical task in robot-assisted
surgery. However, the complexity of the surgical scene, which mainly
includes local feature similarity (e.g., between different anatomical
tissues), intraoperative complex artifacts, and indistinguishable
boundaries, poses significant challenges to accurate segmentation. To
tackle these problems, we propose the Long Strip Kernel Attention
network (LSKANet), including two well-designed modules named Dual-block
Large Kernel Attention module (DLKA) and Multiscale Affinity Feature
Fusion module (MAFF), which can implement precise segmentation of
surgical images. Specifically, by introducing strip convolutions with
different topologies (cascaded and parallel) in two blocks and a large
kernel design, DLKA can make full use of region- and strip-like surgical
features and extract both visual and structural information to reduce
the false segmentation caused by local feature similarity. In MAFF,
affinity matrices calculated from multiscale feature maps are applied as
feature fusion weights, which helps to address the interference of
artifacts by suppressing the activations of irrelevant regions. In addition,
a hybrid loss with a Boundary Guided Head (BGH) is proposed to help the
network segment indistinguishable boundaries effectively. We evaluate
the proposed LSKANet on three datasets with different surgical scenes.
The experimental results show that our method achieves new
state-of-the-art results on all three datasets with improvements of
2.6%, 1.4%, and 3.4% mIoU, respectively. Furthermore, our method is
compatible with different backbones and can significantly increase their
segmentation accuracy. Code is available at
https://github.com/YubinHan73/LSKANet.
AU Wang, Ke
Chen, Zicong
Zhu, Mingjia
Li, Zhetao
Weng, Jian
Gu, Tianlong
Score-based Counterfactual Generation for Interpretable Medical Image
Classification and Lesion Localization.
Deep neural networks (DNNs) have immense potential for precise clinical
decision-making in the field of biomedical imaging. However, accessing
high-quality data is crucial for ensuring the high performance of DNNs.
Obtaining medical imaging data is often challenging in terms of both
quantity and quality. To address these issues, we propose a score-based
counterfactual generation (SCG) framework to create counterfactual
images from latent space, to compensate for scarcity and imbalance of
data. In addition, some uncertainties in external physical factors may
introduce unnatural features and further affect the estimation of the
true data distribution. Therefore, we integrated a learnable FuzzyBlock
into the classifier of the proposed framework to manage these
uncertainties. The proposed SCG framework can be applied to both
classification and lesion localization tasks. The experimental results
revealed a remarkable performance boost in classification tasks,
achieving an average performance enhancement of 3-5% compared to
previous state-of-the-art (SOTA) methods in interpretable lesion
localization.
AU Jin, Liang
Gu, Shixuan
Wei, Donglai
Adhinarta, Jason Ken
Kuang, Kaiming
Zhang, Yongjie Jessica
Pfister, Hanspeter
Ni, Bingbing
Yang, Jiancheng
Li, Ming
RibSeg v2: A Large-Scale Benchmark for Rib Labeling and
Anatomical Centerline Extraction
Automatic rib labeling and anatomical centerline extraction are common
prerequisites for various clinical applications. Prior studies either
use in-house datasets that are inaccessible to communities, or focus on
rib segmentation that neglects the clinical significance of rib
labeling. To address these issues, we extend our prior dataset (RibSeg)
on the binary rib segmentation task to a comprehensive benchmark, named
RibSeg v2, with 660 CT scans (15,466 individual ribs in total) and
annotations manually inspected by experts for rib labeling and
anatomical centerline extraction. Based on RibSeg v2, we develop a
pipeline including deep learning-based methods for rib labeling, and a
skeletonization-based method for centerline extraction. To improve
computational efficiency, we propose a sparse point cloud representation
of CT scans and compare it with standard dense voxel grids. Moreover, we
design and analyze evaluation metrics to address the key challenges of
each task. Our dataset, code, and model are available online to
facilitate open research at https://github.com/M3DV/RibSeg.
AU Lei, Wenhui
Su, Qi
Jiang, Tianyu
Gu, Ran
Wang, Na
Liu, Xinglong
Wang, Guotai
Zhang, Xiaofan
Zhang, Shaoting
One-Shot Weakly-Supervised Segmentation in 3D Medical Images
Deep neural networks typically require a large number of accurate
annotations to achieve outstanding performance in medical image
segmentation. One-shot and weakly-supervised learning are promising
research directions that reduce labeling effort by learning a new class
from only one annotated image and using coarse labels instead,
respectively. In this work, we present an innovative framework for 3D
medical image segmentation with one-shot and weakly-supervised settings.
First, a propagation-reconstruction network is proposed to propagate
scribbles from one annotated volume to unlabeled 3D images based on the
assumption that anatomical patterns in different human bodies are
similar. Then a multi-level similarity denoising module is designed to
refine the scribbles based on embeddings from anatomical- to
pixel-level. After expanding the scribbles to pseudo masks, we observe
that the misclassified voxels mainly occur at the border region and propose
to extract self-support prototypes for the specific refinement. Based on
these weakly-supervised segmentation results, we further train a
segmentation model for the new class with the noisy label training
strategy. Experiments on three CT datasets and one MRI dataset show that the
proposed method obtains significant improvement over the state-of-the-art methods
and performs robustly even under severe class imbalance and low
contrast. Code is publicly available at
https://github.com/LWHYC/OneShot_WeaklySeg.
AU van Garderen, Karin A.
van der Voort, Sebastian R.
Wijnenga, Maarten M. J.
Incekara, Fatih
Alafandi, Ahmad
Kapsas, Georgios
Gahrmann, Renske
Schouten, Joost W.
Dubbink, Hendrikus J.
Vincent, Arnaud J. P. E.
van den Bent, Martin
French, Pim J.
Smits, Marion
Klein, Stefan
Evaluating the Predictive Value of Glioma Growth Models for Low-Grade
Glioma After Tumor Resection
Tumor growth models have the potential to model and predict the
spatiotemporal evolution of glioma in individual patients. Infiltration
of glioma cells is known to be faster along the white matter tracts, and
therefore structural magnetic resonance imaging (MRI) and diffusion
tensor imaging (DTI) can be used to inform the model. However, applying
and evaluating growth models in real patient data is challenging. In
this work, we propose to formulate the problem of tumor growth as a
ranking problem, as opposed to a segmentation problem, and use the
average precision (AP) as a performance metric. This enables an
evaluation of the spatial pattern that does not require a volume cut-off
value. Using the AP metric, we evaluate diffusion-proliferation models
informed by structural MRI and DTI, after tumor resection. We applied
the models to a unique longitudinal dataset of 14 patients with
low-grade glioma (LGG), who received no treatment after surgical
resection, to predict the recurrent tumor shape after tumor resection.
The diffusion models informed by structural MRI and DTI showed a small
but significant increase in predictive performance with respect to
homogeneous isotropic diffusion, and the DTI-informed model reached the
best predictive performance. We conclude there is a significant
improvement in the prediction of the recurrent tumor shape when using a
DTI-informed anisotropic diffusion model with respect to isotropic
diffusion, and that the AP is a suitable metric to evaluate these
models. All code and data used in this publication are made publicly
available.
AU Lin, Yiyang
Wang, Yifeng
Fang, Zijie
Li, Zexin
Guan, Xianchao
Jiang, Danling
Zhang, Yongbing
A Multi-Perspective Self-Supervised Generative Adversarial Network for
FS to FFPE Stain Transfer.
In clinical practice, frozen section (FS) images can be utilized to
obtain the immediate pathological results of the patients in operation
due to their fast production speed. However, compared with the
formalin-fixed and paraffin-embedded (FFPE) images, the FS images
greatly suffer from poor quality. Thus, it is of great significance to
transfer the FS image to the FFPE one, which enables pathologists to
observe high-quality images in operation. However, obtaining the paired
FS and FFPE images is quite hard, so it is difficult to obtain accurate
results using supervised methods. Apart from this, the FS to FFPE stain
transfer faces many challenges. Firstly, the number and position of
nuclei scattered throughout the image are hard to maintain during the
transfer process. Secondly, transferring the blurry FS images to the
clear FFPE ones is quite challenging. Thirdly, compared with the center
regions of each patch, the edge regions are harder to transfer. To
overcome these problems, a multi-perspective self-supervised GAN,
incorporating three auxiliary tasks, is proposed to improve the
performance of FS to FFPE stain transfer. Concretely, a nucleus
consistency constraint is designed to preserve the fidelity of
nuclei, FFPE-guided image deblurring is proposed to improve
clarity, and a multi-field-of-view consistency constraint is designed to
better generate the edge regions. Objective indicators and pathologists'
evaluations in experiments on five datasets from different
countries demonstrate the effectiveness of our method. In
addition, the validation in the downstream task of microsatellite
instability prediction has also proved the performance improvement by
transferring the FS images to FFPE ones. Our code is available at
https://github.com/linyiyang98/Self-Supervised-FS2FFPE.git.
EI 1558-254X
DA 2024-09-18
UT MEDLINE:39283778
PM 39283778
ER
AU Chen, Zhongyu
Bian, Yun
Shen, Erwei
Fan, Ligang
Zhu, Weifang
Shi, Fei
Shao, Chengwei
Chen, Xinjian
Xiang, Dehui
Moment-Consistent Contrastive CycleGAN for Cross-Domain Pancreatic Image
Segmentation.
CT and MR are currently the most common imaging techniques for
pancreatic cancer diagnosis. Accurate segmentation of the pancreas in CT
and MR images can provide significant help in the diagnosis and
treatment of pancreatic cancer. Traditional supervised segmentation
methods require a large amount of labeled CT and MR training data, which
is usually time-consuming and laborious to obtain. Meanwhile, due to domain shift,
traditional segmentation networks are difficult to deploy on
different imaging modality datasets. Cross-domain segmentation can
utilize labeled source domain data to assist unlabeled target domains in
solving the above problems. In this paper, a cross-domain pancreas
segmentation algorithm is proposed based on Moment-Consistent
Contrastive Cycle Generative Adversarial Networks (MC-CCycleGAN).
MC-CCycleGAN is a style transfer network, in which the encoder of its
generator is used to extract features from real images and style
transfer images, constrain feature extraction through a contrastive
loss, and fully extract structural features of input images during style
transfer while eliminating redundant style features. The multi-order
central moments of the pancreas are proposed to describe its anatomy in
high dimensions and a contrastive loss is also proposed to constrain the
moment consistency, so as to maintain consistency of the pancreatic
structure and shape before and after style transfer. A multi-teacher
knowledge distillation framework is proposed to transfer the knowledge
from multiple teachers to a single student, so as to improve the
robustness and performance of the student network. The experimental
results have demonstrated the superiority of our framework over
state-of-the-art domain adaptation methods.
AU Mou, Lei
Yan, Qifeng
Lin, Jinghui
Zhao, Yifan
Liu, Yonghuai
Ma, Shaodong
Zhang, Jiong
Lv, Wenhao
Zhou, Tao
Frangi, Alejandro F
Zhao, Yitian
COSTA: A Multi-center TOF-MRA Dataset and A Style Self-Consistency
Network for Cerebrovascular Segmentation.
Time-of-flight magnetic resonance angiography (TOF-MRA) is the least
invasive and ionizing radiation-free approach for cerebrovascular
imaging, but variations in imaging artifacts across different clinical
centers and imaging vendors result in inter-site and inter-vendor
heterogeneity, making accurate and robust cerebrovascular
segmentation challenging. Moreover, the limited availability and quality
of annotated data pose further challenges for segmentation methods to
generalize well to unseen datasets. In this paper, we construct the
largest and most diverse TOF-MRA dataset (COSTA) from 8 individual
imaging centers, with all the volumes manually annotated. Then we
propose a novel network for cerebrovascular segmentation, namely CESAR,
with the ability to tackle feature granularity and image style
heterogeneity issues. Specifically, a coarse-to-fine architecture is
implemented to refine cerebrovascular segmentation in an iterative
manner. An automatic feature selection module is proposed to selectively
fuse global long-range dependencies and local contextual information of
cerebrovascular structures. A style self-consistency loss is then
introduced to explicitly align diverse styles of TOF-MRA images to a
standardized one. Extensive experimental results on the COSTA dataset
demonstrate the effectiveness of our CESAR network against
state-of-the-art methods. We have made 6 subsets of COSTA and the
source code available online, in order to promote relevant research in
the community.
AU Sharifzadeh, Mostafa
Goudarzi, Sobhan
Tang, An
Benali, Habib
Rivaz, Hassan
Mitigating Aberration-Induced Noise: A Deep Learning-Based
Aberration-to-Aberration Approach.
One of the primary sources of suboptimal image quality in ultrasound
imaging is phase aberration. It is caused by spatial changes in sound
speed over a heterogeneous medium, which disturbs the transmitted waves
and prevents coherent summation of echo signals. Obtaining non-aberrated
ground truths in real-world scenarios can be extremely challenging, if
not impossible. This challenge hinders the performance of deep
learning-based techniques due to the domain shift between simulated and
experimental data. Here, for the first time, we propose a deep
learning-based method that does not require ground truth to correct the
phase aberration problem and, as such, can be directly trained on real
data. We train a network wherein both the input and target output are
randomly aberrated radio frequency (RF) data. Moreover, we demonstrate
that a conventional loss function such as mean square error is
inadequate for training such a network to achieve optimal performance.
Instead, we propose an adaptive mixed loss function that employs both
B-mode and RF data, resulting in more efficient convergence and enhanced
performance. Finally, we publicly release our dataset, comprising over
180,000 aberrated single plane-wave images (RF data), wherein phase
aberrations are modeled as near-field phase screens. Although not
utilized in the proposed method, each aberrated image is paired with its
corresponding aberration profile and the non-aberrated version, aiming
to mitigate the data scarcity problem in developing deep learning-based
techniques for phase aberration correction. Source code and trained
model are also available along with the dataset at
http://code.sonography.ai/main-aaa.
AU Huang, Junzhang
Zhu, Xiongfeng
Chen, Ziyang
Lin, Guoye
Huang, Meiyan
Feng, Qianjin
Pathological Priors Inspired Network for Vertebral Osteophytes
Recognition
Automatic vertebral osteophyte recognition in Digital Radiography is of
great importance for the early prediction of degenerative disease but is
still a challenge because of the tiny size and high inter-class
similarity between normal and osteophyte vertebrae. Meanwhile, common
sampling strategies applied in convolutional neural networks could cause
loss of detailed context. All of these could lead to an incorrect
positioning predicament. In this paper, based on important pathological
priors, we define a set of potential lesions of each vertebra and
propose a novel Pathological Priors Inspired Network (PPIN) to achieve
accurate osteophyte recognition. PPIN comprises a backbone feature
extractor integrating with a Wavelet Transform Sampling module for
high-frequency detailed context extraction, a detection branch for
locating all potential lesions and a classification branch for producing
final osteophyte recognition. The Anatomical Map-guided Filter between
two branches helps the network focus on the specific anatomical regions
via the generated heatmaps of potential lesions in the detection branch
to address the incorrect positioning problem. To reduce the inter-class
similarity, a Bilateral Augmentation Module based on the graph
relationship is proposed to imitate the clinical diagnosis process and
to extract discriminative contextual information between adjacent
vertebrae in the classification branch. Experiments on the two
osteophytes-specific datasets collected from the public VinDr-Spine
database show that the proposed PPIN achieves the best recognition
performance among multitask frameworks and shows strong generalization.
The results on a private dataset demonstrate the potential in clinical
application. The Class Activation Maps also show the powerful
localization capability of PPIN. The source code is available at
https://github.com/Phalo/PPIN.
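The abstract does not say which wavelet the Wavelet Transform Sampling module uses or how it is wired into the backbone. As a rough illustration of why wavelet-based downsampling can keep high-frequency detail that pooling or strided sampling discards, the sketch below applies one level of a 2D Haar transform. The Haar choice, the averaging normalization, and the names `haar_dwt2` and `haar_idwt2` are assumptions made for illustration, not the PPIN implementation.

```python
def haar_dwt2(image):
    """One level of an averaging 2D Haar transform (illustrative only).

    `image` is a list of rows with even height and width. Returns four
    half-resolution sub-bands: the local average plus three detail bands.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")
    avg, det_x, det_y, det_diag = [], [], [], []
    for r in range(0, rows, 2):
        avg_row, x_row, y_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            avg_row.append((a + b + d + e) / 4)  # low-pass in both directions
            x_row.append((a - b + d - e) / 4)    # differences along each row
            y_row.append((a + b - d - e) / 4)    # differences along each column
            d_row.append((a - b - d + e) / 4)    # diagonal differences
        avg.append(avg_row)
        det_x.append(x_row)
        det_y.append(y_row)
        det_diag.append(d_row)
    return avg, det_x, det_y, det_diag


def haar_idwt2(avg, det_x, det_y, det_diag):
    """Invert haar_dwt2, showing that the four sub-bands lose no information."""
    out = []
    for r in range(len(avg)):
        top, bottom = [], []
        for c in range(len(avg[0])):
            s, x, y, z = avg[r][c], det_x[r][c], det_y[r][c], det_diag[r][c]
            top += [s + x + y + z, s - x + y - z]
            bottom += [s + x - y - z, s - x - y + z]
        out += [top, bottom]
    return out
```

Keeping only `avg` is equivalent to 2x2 average pooling. The three detail bands carry the edge information that such pooling would drop.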
AU Agarwal, Saurabh
Arya, K V
Meena, Yogesh Kumar
CNN-O-ELMNet: Optimized Lightweight and Generalized Model for Lung
Disease Classification and Severity Assessment.
The high burden of lung diseases on healthcare necessitates effective
detection methods. Current computer-aided diagnosis (CAD) systems are
limited by their focus on specific diseases and computationally
demanding deep learning models. To overcome these challenges, we
introduce CNN-O-ELMNet, a lightweight classification model designed to
efficiently detect various lung diseases, surpassing the limitations of
disease-specific CAD systems and the complexity of deep learning models.
This model combines a convolutional neural network for deep feature
extraction with an optimized extreme learning machine, utilizing the
imperialistic competitive algorithm for enhanced predictions. We then
evaluated the effectiveness of CNN-O-ELMNet using benchmark datasets for
lung diseases: distinguishing pneumothorax vs. non-pneumothorax,
tuberculosis vs. normal, and lung cancer vs. healthy cases. Our findings
demonstrate that CNN-O-ELMNet significantly outperformed (p < 0.05)
state-of-the-art methods in binary classifications for tuberculosis and
cancer, achieving accuracies of 97.85% and 97.70%, respectively, while
maintaining low computational complexity with only 2481 trainable
parameters. We also extended the model to categorize lung disease
severity based on Brixia scores. It achieves 96.20% accuracy in
multi-class assessment of mild, moderate, and severe cases, which makes it
suitable for deployment in lightweight healthcare devices.
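For readers unfamiliar with extreme learning machines (ELMs), the sketch below shows the basic idea under simplifying assumptions: the hidden layer is random and fixed, and only the output weights are fitted, here in closed form by ridge-regularized least squares. It is a plain ELM on raw inputs. The CNN feature extractor and the imperialist competitive algorithm optimization described in the abstract are not reproduced, and the names `TinyELM` and `solve_linear` are hypothetical.

```python
import math
import random


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


class TinyELM:
    """Single-hidden-layer ELM for scalar regression targets (assumed setup)."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-6, seed=0):
        rng = random.Random(seed)
        # Hidden weights and biases are drawn once and never trained.
        self.w = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.ridge = ridge
        self.beta = None

    def hidden(self, x):
        return [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(self.w, self.b)]

    def fit(self, xs, ys):
        h = [self.hidden(x) for x in xs]
        n = len(xs)
        # Dual ridge solution: beta = H^T (H H^T + ridge * I)^-1 y
        gram = [[sum(p * q for p, q in zip(h[i], h[j]))
                 + (self.ridge if i == j else 0.0)
                 for j in range(n)] for i in range(n)]
        alpha = solve_linear(gram, ys)
        self.beta = [sum(alpha[i] * h[i][k] for i in range(n))
                     for k in range(len(self.b))]
        return self

    def predict(self, x):
        return sum(bk * hk for bk, hk in zip(self.beta, self.hidden(x)))
```

Because the hidden layer is never trained, the number of fitted parameters equals the number of hidden units per output. This is one plausible reason such a model can stay very small.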
AU Teng, Yingzhi
Wu, Kai
Liu, Jing
Li, Yifan
Teng, Xiangyi
Constructing High-order Functional Connectivity Networks with Temporal
Information from fMRI Data.
Conducting functional connectivity analysis on functional magnetic
resonance imaging (fMRI) data presents a significant and intricate
challenge. Contemporary studies typically analyze fMRI data by
constructing high-order functional connectivity networks (FCNs) due to
their strong interpretability. However, these approaches often overlook
temporal information, resulting in suboptimal accuracy. Temporal
information plays a vital role in reflecting changes in blood
oxygenation level-dependent signals. To address this shortcoming, we
have devised a framework for extracting temporal dependencies from fMRI
data and inferring high-order functional connectivity among regions of
interest (ROIs). Our approach postulates that the current state can be
determined by the FCN and the state at the previous time, effectively
capturing temporal dependencies. Furthermore, we enhance FCN by
incorporating high-order features through hypergraph-based manifold
regularization. Our algorithm involves causal modeling of the dynamic
brain system, and the obtained directed FC reveals differences in the
flow of information under different patterns. We have validated the
significance of integrating temporal information into FCN using four
real-world fMRI datasets. On average, our framework achieves 12% higher
accuracy than non-temporal hypergraph-based and low-order FCNs, all
while maintaining a short processing time. Notably, our framework
successfully identifies the most discriminative ROIs, aligning with
previous research, thereby facilitating cognitive and behavioral
studies.
AU Li, Zihan
Zheng, Yuan
Shan, Dandan
Yang, Shuzhou
Li, Qingde
Wang, Beizhan
Zhang, Yuanting
Hong, Qingqi
Shen, Dinggang
ScribFormer: Transformer Makes CNN Work Better for Scribble-Based
Medical Image Segmentation
Most recent scribble-supervised segmentation methods commonly adopt a
CNN framework with an encoder-decoder architecture. Despite its multiple
benefits, this framework generally can only capture small-range feature
dependency for the convolutional layer with the local receptive field,
which makes it difficult to learn global shape information from the
limited information provided by scribble annotations. To address this
issue, this paper proposes a new CNN-Transformer hybrid solution for
scribble-supervised medical image segmentation called ScribFormer. The
proposed ScribFormer model has a triple-branch structure, i.e., the
hybrid of a CNN branch, a Transformer branch, and an attention-guided
class activation map (ACAM) branch. Specifically, the CNN branch
collaborates with the Transformer branch to fuse the local features
learned from CNN with the global representations obtained from
Transformer, which can effectively overcome limitations of existing
scribble-supervised segmentation methods. Furthermore, the ACAM branch
assists in unifying the shallow convolution features and the deep
convolution features to further improve the model's performance. Extensive
experiments on two public datasets and one private dataset show that our
ScribFormer has superior performance over the state-of-the-art
scribble-supervised segmentation methods, and achieves even better
results than the fully-supervised segmentation methods. The code is
released at https://github.com/HUANGLIZI/ScribFormer.
AU Dong, Xiuyu
Yang, Kaifan
Liu, Jinyu
Tang, Fan
Liao, Wenjun
Zhang, Yu
Liang, Shujun
Cross-Domain Mutual-Assistance Learning Framework for Fully Automated
Diagnosis of Primary Tumor in Nasopharyngeal Carcinoma.
Accurate T-staging of nasopharyngeal carcinoma (NPC) holds paramount
importance in guiding treatment decisions and prognosticating outcomes
for distinct risk groups. Regrettably, the landscape of deep
learning-based techniques for T-staging in NPC remains sparse, and
existing methodologies often exhibit suboptimal performance due to their
neglect of crucial domain-specific knowledge pertinent to primary tumor
diagnosis. To address these issues, we propose a new cross-domain
mutual-assistance learning framework for fully automated diagnosis of
primary tumor using H&N MR images. Specifically, we tackle the primary tumor
diagnosis task with a convolutional neural network consisting of a 3D
cross-domain knowledge perception network (CKP net) for excavating
cross-domain-invariant features emphasizing tumor intensity variations
and internal tumor heterogeneity, and a multi-domain mutual-information
sharing fusion network (M2SF net), comprising a dual-pathway
domain-specific representation module and a mutual information fusion
module, for intelligently gauging and amalgamating multi-domain,
multi-scale T-stage diagnosis-oriented features. The proposed 3D
cross-domain mutual-assistance learning framework not only embraces
task-specific multi-domain diagnostic knowledge but also automates the
entire process of primary tumor diagnosis. We evaluate our model on an
internal and an external MR image dataset in a three-fold
cross-validation paradigm. Exhaustive experimental results demonstrate
that our method outperforms the state-of-the-art algorithms, and obtains
promising performance for tumor segmentation and T-staging. These
findings underscore its potential for clinical application, offering
valuable assistance to clinicians in treatment decision-making and
prognostication for various risk groups.
EI 1558-254X
DA 2024-05-16
UT MEDLINE:38739507
PM 38739507
ER
AU Li, Zimeng
Xiao, Sa
Wang, Cheng
Li, Haidong
Zhao, Xiuchao
Duan, Caohui
Zhou, Qian
Rao, Qiuchen
Fang, Yuan
Xie, Junshuai
Shi, Lei
Guo, Fumin
Ye, Chaohui
Zhou, Xin
Encoding Enhanced Complex CNN for Accurate and Highly Accelerated MRI
Magnetic resonance imaging (MRI) using hyperpolarized noble gases
provides a way to visualize the structure and function of the human lung,
but the long imaging time limits its broad research and clinical
applications. Deep learning has demonstrated great potential for
accelerating MRI by reconstructing images from undersampled data.
However, most existing deep convolutional neural networks (CNN) directly
apply square convolution to k-space data without considering the
inherent properties of k-space sampling, limiting k-space learning
efficiency and image reconstruction quality. In this work, we propose an
encoding enhanced (EN2) complex CNN for highly undersampled pulmonary
MRI reconstruction. EN2 complex CNN employs convolution along either the
frequency or phase-encoding direction, resembling the mechanisms of
k-space sampling, to maximize the utilization of the encoding
correlation and integrity within a row or column of k-space. We also
employ complex convolution to learn rich representations from the
complex k-space data. In addition, we develop a feature-strengthened
modularized unit to further boost the reconstruction performance.
Experiments demonstrate that our approach can accurately reconstruct
hyperpolarized Xe-129 and H-1 lung MRI from 6-fold undersampled k-space
data and provide lung function measurements with minimal biases compared
with fully sampled images. These results demonstrate the effectiveness
of the proposed algorithmic components and indicate that the proposed
approach could be used for accelerated pulmonary MRI in research and
clinical lung disease patient care.
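To make the contrast with square convolution concrete, the sketch below applies a 1D complex-valued kernel along a single k-space axis. It assumes, purely for illustration, that rows correspond to the frequency-encoding direction and columns to the phase-encoding direction. The function names are hypothetical, and the kernels in the actual EN2 network are learned rather than fixed.

```python
def conv_along_rows(kspace, kernel):
    """Apply a 1D complex kernel independently to each row.

    Zero-padded, same-size cross-correlation; `kernel` should have odd
    length so that it is centered on the current sample.
    """
    half = len(kernel) // 2
    out = []
    for row in kspace:
        n = len(row)
        new_row = []
        for i in range(n):
            acc = 0j
            for t, w in enumerate(kernel):
                j = i + t - half
                if 0 <= j < n:
                    # The complex product mixes real and imaginary parts,
                    # unlike two independent real-valued channels.
                    acc += w * row[j]
            new_row.append(acc)
        out.append(new_row)
    return out


def conv_along_cols(kspace, kernel):
    """Same operation along columns, implemented via a transpose."""
    transposed = [list(col) for col in zip(*kspace)]
    filtered = conv_along_rows(transposed, kernel)
    return [list(col) for col in zip(*filtered)]
```

Each output sample depends only on neighbors within the same row (or column), so samples from different encoding lines are not mixed. That is the property the abstract contrasts with square kernels.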
AU Tay, Zhiwei
Kim, Han-Joon
Ho, John S.
Olivo, Malini
A Magnetic Particle Imaging Approach for Minimally Invasive Imaging and
Sensing With Implantable Bioelectronic Circuits
Minimally-invasive and biocompatible implantable bioelectronic circuits
are used for long-term monitoring of physiological processes in the
body. However, there is a lack of methods that can cheaply and
conveniently image the device within the body while simultaneously
extracting sensor information. Magnetic Particle Imaging (MPI) with zero
background signal, high contrast, and high sensitivity with quantitative
images is ideal for this challenge because the magnetic signal is not
absorbed with increasing tissue depth and incurs no radiation dose. We
show how to easily modify common implantable devices to be imaged by MPI
by encapsulating and magnetically-coupling magnetic nanoparticles
(SPIOs) to the device circuit. These modified implantable devices not
only provide spatial information via MPI, but also couple to our
handheld MPI reader to transmit sensor information by modulating
harmonic signals from magnetic nanoparticles via switching or
frequency-shifting with resistive or capacitive sensors. This paper
provides proof-of-concept of an optimized MPI imaging technique for
implantable devices to extract spatial information as well as other
information transmitted by the implanted circuit (such as biosensing)
via encoding in the magnetic particle spectrum. The 4D images present 3D
position and a changing color tone in response to a variable biometric.
Biophysical sensing via bioelectronic circuits that take advantage of
the unique imaging properties of MPI may enable a wide range of
minimally invasive applications in biomedicine and diagnosis.
AU Fu, Minghan
Zhang, Na
Huang, Zhenxing
Zhou, Chao
Zhang, Xu
Yuan, Jianmin
He, Qiang
Yang, Yongfeng
Zheng, Hairong
Liang, Dong
Wu, Fang-Xiang
Fan, Wei
Hu, Zhanli
OIF-Net: An Optical Flow Registration-Based PET/MR Cross-Modal
Interactive Fusion Network for Low-Count Brain PET Image Denoising
The short frames of low-count positron emission tomography (PET) images
generally cause high levels of statistical noise. Thus, improving the
quality of low-count images by using image postprocessing algorithms to
achieve better clinical diagnoses has attracted widespread attention in
the medical imaging community. Most existing deep learning-based
low-count PET image enhancement methods have achieved satisfactory
results; however, few of them focus on denoising low-count PET images
with the magnetic resonance (MR) image modality as guidance. The prior
context features contained in MR images can provide abundant and
complementary information for single low-count PET image denoising,
especially in ultralow-count (2.5%) cases. To this end, we propose a
novel two-stream dual PET/MR cross-modal interactive fusion network with
an optical flow pre-alignment module, namely, OIF-Net. Specifically, the
learnable optical flow registration module enables the spatial
manipulation of MR imaging inputs within the network without any extra
training supervision. Registered MR images fundamentally solve the
problem of feature misalignment in the multimodal fusion stage, which
greatly benefits the subsequent denoising process. In addition, we
design a spatial-channel feature enhancement module (SC-FEM) that
considers the interactive impacts of multiple modalities and provides
additional information flexibility in both the spatial and channel
dimensions. Furthermore, instead of simply concatenating two extracted
features from these two modalities as an intermediate fusion method, the
proposed cross-modal feature fusion module (CM-FFM) adopts
cross-attention at multiple feature levels and greatly improves the two
modalities' feature fusion procedure. Extensive experimental assessments
conducted on real clinical datasets, as well as an independent clinical
testing dataset, demonstrate that the proposed OIF-Net outperforms the
state-of-the-art methods.
AU Yin, Ziying
Li, Guo-Yang
Zhang, Zhaoyi
Zheng, Yang
Cao, Yanping
SWENet: A Physics-Informed Deep Neural Network (PINN) for Shear Wave
Elastography
Shear wave elastography (SWE) enables the measurement of elastic
properties of soft materials in a non-invasive manner and finds broad
applications in various disciplines. The state-of-the-art SWE methods
rely on the measurement of local shear wave speeds to infer material
parameters and suffer from wave diffraction when applied to soft
materials with strong heterogeneity. In the present study, we overcome
this challenge by proposing a physics-informed neural network
(PINN)-based SWE (SWENet) method. The spatial variation of elastic
properties of inhomogeneous materials has been introduced in the
governing equations, which are encoded in SWENet as loss functions.
Snapshots of wave motions have been used to train neural networks, and
during this course, the elastic properties within a region of interest
illuminated by shear waves are inferred simultaneously. We performed
finite element simulations, tissue-mimicking phantom experiments, and ex
vivo experiments to validate the method. Our results show that the shear
moduli of soft composites consisting of matrix and inclusions of several
millimeters in cross-section dimensions with either regular or irregular
geometries can be identified with excellent accuracy. The advantages of
the SWENet over conventional SWE methods consist of using more features
of the wave motions and enabling seamless integration of multi-source
data in the inverse analysis. Given the advantages of SWENet, it may
find broad applications where full wave fields get involved to infer
heterogeneous mechanical properties, such as identifying small solid
tumors with ultrasound SWE, and differentiating gray and white matter
of the brain with magnetic resonance elastography.
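The abstract does not give the governing equations. As a hedged sketch of how a wave equation with spatially varying stiffness can be turned into a loss, the code below assumes the 1D shear wave equation rho * u_tt = d/dx(mu(x) * du/dx) and penalizes its residual at sample points. In a PINN, `u` and `mu` would be neural networks differentiated by automatic differentiation. Here they are ordinary Python callables and central finite differences stand in, so all names and the step size `h` are illustrative assumptions.

```python
import math


def wave_residual(u, mu, rho, x, t, h=1e-3):
    """Residual rho*u_tt - d/dx(mu * du/dx) at (x, t) via central differences."""
    u_tt = (u(x, t + h) - 2 * u(x, t) + u(x, t - h)) / h ** 2
    # Flux form keeps mu inside the derivative, so spatial variation of the
    # shear modulus contributes to the residual.
    flux_right = mu(x + h / 2) * (u(x + h, t) - u(x, t)) / h
    flux_left = mu(x - h / 2) * (u(x, t) - u(x - h, t)) / h
    return rho * u_tt - (flux_right - flux_left) / h


def physics_loss(u, mu, rho, points):
    """Mean squared residual over collocation points [(x, t), ...]."""
    return sum(wave_residual(u, mu, rho, x, t) ** 2 for x, t in points) / len(points)


def plane_wave(x, t, k=1.0, omega=2.0):
    """Test field: satisfies the equation when mu / rho == (omega / k) ** 2."""
    return math.sin(k * x - omega * t)
```

With `plane_wave` and `rho = 1.0`, the loss is close to zero for a constant `mu` of 4.0 and clearly positive for other values. This is how minimizing such a loss can constrain the modulus from observed wave motion.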
AU Ji, Wen
Chung, Albert C. S.
Unsupervised Domain Adaptation for Medical Image Segmentation Using
Transformer With Meta Attention
Image segmentation is essential to medical image analysis as it provides
the labeled regions of interest for the subsequent diagnosis and
treatment. However, fully-supervised segmentation methods require
high-quality annotations produced by experts, which is laborious and
expensive. In addition, when performing segmentation on another
unlabeled image modality, the segmentation performance will be adversely
affected due to the domain shift. Unsupervised domain adaptation (UDA)
is an effective way to tackle these problems, but the performance of the
existing methods still needs improvement. Also, despite the
effectiveness of recent Transformer-based methods in medical image
segmentation, the adaptability of Transformers is rarely investigated.
In this paper, we present a novel UDA framework using a Transformer for
building a cross-modality segmentation method with the advantages of
learning long-range dependencies and transferring attentive information.
To fully utilize the attention learned by the Transformer in UDA, we
propose Meta Attention (MA) and use it to perform a fully
attention-based alignment scheme, which can learn the hierarchical
consistencies of attention and transfer more discriminative information
between two modalities. We have conducted extensive experiments on
cross-modality segmentation using three datasets, including a whole
heart segmentation dataset (MMWHS), an abdominal organ segmentation
dataset, and a brain tumor segmentation dataset. The promising results
show that our method can significantly improve performance compared with
the state-of-the-art UDA methods.
AU Pak, Daniel H.
Liu, Minliang
Kim, Theodore
Liang, Liang
Caballero, Andres
Onofrey, John
Ahn, Shawn S.
Xu, Yilin
McKay, Raymond
Sun, Wei
Gleason, Rudolph
Duncan, James S.
Patient-Specific Heart Geometry Modeling for Solid Biomechanics Using
Deep Learning
Automated volumetric meshing of patient-specific heart geometry can help
expedite various biomechanics studies, such as post-intervention stress
estimation. Prior meshing techniques often neglect important modeling
characteristics for successful downstream analyses, especially for thin
structures like the valve leaflets. In this work, we present DeepCarve
(Deep Cardiac Volumetric Mesh): a novel deformation-based deep learning
method that automatically generates patient-specific volumetric meshes
with high spatial accuracy and element quality. The main novelty in our
method is the use of minimally sufficient surface mesh labels for
precise spatial accuracy and the simultaneous optimization of isotropic
and anisotropic deformation energies for volumetric mesh quality. Mesh
generation takes only 0.13 seconds/scan during inference, and each mesh
can be directly used for finite element analyses without any manual
post-processing. Calcification meshes can also be subsequently
incorporated for increased simulation accuracy. Numerous stent
deployment simulations validate the viability of our approach for
large-batch analyses.
AU Xie, Qingsong
Li, Yuexiang
He, Nanjun
Ning, Munan
Ma, Kai
Wang, Guoxing
Lian, Yong
Zheng, Yefeng
Unsupervised Domain Adaptation for Medical Image Segmentation by
Disentanglement Learning and Self-Training
Unsupervised domain adaptation (UDA), which aims to enhance the
segmentation performance of deep models on unlabeled data, has recently
drawn much attention. In this paper, we propose a novel UDA method
(namely DLaST) for medical image segmentation via disentanglement
learning and self-training. Disentanglement learning factorizes an image
into domain-invariant anatomy and domain-specific modality components.
To make the best of disentanglement learning, we propose a novel shape
constraint to boost the adaptation performance. The self-training
strategy further adaptively improves the segmentation performance of the
model for the target domain through adversarial learning and
pseudo-labeling, which implicitly facilitates feature alignment in the anatomy
space. Experimental results demonstrate that the proposed method
outperforms the state-of-the-art UDA methods for medical image
segmentation on three public datasets, i.e., a cardiac dataset, an
abdominal dataset and a brain dataset. The code will be released soon.
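The abstract mentions pseudo labels without describing how they are selected. A common self-training variant, assumed here purely for illustration, keeps only confident predictions on unlabeled target images and ignores the rest. The threshold value, the `IGNORE` marker, and the function name are hypothetical and are not taken from DLaST.

```python
IGNORE = -1  # hypothetical marker for pixels excluded from the training loss


def make_pseudo_labels(prob_maps, threshold=0.9):
    """Turn per-class probabilities into confidence-filtered pseudo labels.

    `prob_maps[k][i]` is the predicted probability of class k at pixel i.
    A pixel receives its most likely class if that probability reaches
    `threshold`; otherwise it is marked IGNORE.
    """
    n_pixels = len(prob_maps[0])
    labels = []
    for i in range(n_pixels):
        probs = [class_map[i] for class_map in prob_maps]
        best = max(range(len(probs)), key=probs.__getitem__)
        labels.append(best if probs[best] >= threshold else IGNORE)
    return labels
```

In a self-training loop, the model would then be retrained on target images using these labels, with IGNORE pixels left out of the loss.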
AU Zhang, Jiaojiao
Zhang, Shuo
Shen, Xiaoqian
Lukasiewicz, Thomas
Xu, Zhenghua
Multi-ConDoS: Multimodal Contrastive Domain Sharing Generative
Adversarial Networks for Self-Supervised Medical Image Segmentation
Existing self-supervised medical image segmentation usually encounters
the domain shift problem (i.e., the input distribution of pre-training
is different from that of fine-tuning) and/or the multimodality problem
(i.e., it is based on single-modal data only and cannot utilize the
fruitful multimodal information of medical images). To solve these
problems, in this work, we propose multimodal contrastive domain sharing
(Multi-ConDoS) generative adversarial networks to achieve effective
multimodal contrastive self-supervised medical image segmentation.
Compared to the existing self-supervised approaches, Multi-ConDoS has
the following three advantages: (i) it utilizes multimodal medical
images to learn more comprehensive object features via multimodal
contrastive learning; (ii) domain translation is achieved by integrating
the cyclic learning strategy of CycleGAN and the cross-domain
translation loss of Pix2Pix; (iii) novel domain sharing layers are
introduced to learn not only domain-specific but also domain-sharing
information from the multimodal medical images. Extensive experiments on
two public multimodal medical image segmentation datasets show that,
with only 5% (resp., 10%) of labeled data, Multi-ConDoS not only greatly
outperforms the state-of-the-art self-supervised and semi-supervised
medical image segmentation baselines with the same ratio of labeled
data, but also achieves performance similar to (sometimes even better than)
fully supervised segmentation methods with 50% (resp., 100%) of labeled
data, which thus proves that our work can achieve superior segmentation
performances with very low labeling workload. Furthermore, ablation
studies prove that the above three improvements are all effective and
essential for Multi-ConDoS to achieve this very superior performance.
AU Zhang, Qiyang
Hu, Yingying
Zhao, Yumo
Cheng, Jing
Fan, Wei
Hu, Debin
Shi, Fuxiao
Cao, Shuangliang
Zhou, Yun
Yang, Yongfeng
Liu, Xin
Zheng, Hairong
Liang, Dong
Hu, Zhanli
Deep Generalized Learning Model for PET Image Reconstruction
Low-count positron emission tomography (PET) imaging is challenging
because of the ill-posedness of this inverse problem. Previous studies
have demonstrated that deep learning (DL) holds promise for achieving
improved low-count PET image quality. However, almost all data-driven DL
methods suffer from fine structure degradation and blurring effects
after denoising. Incorporating DL into the traditional iterative
optimization model can effectively improve its image quality and recover
fine structures, but little research has considered the full relaxation
of the model, resulting in the performance of this hybrid model not
being sufficiently exploited. In this paper, we propose a learning
framework that deeply integrates DL and an alternating direction method
of multipliers (ADMM)-based iterative optimization model. The
innovative feature of this method is that we break the inherent forms of
the fidelity operators and use neural networks to process them. The
regularization term is deeply generalized. The proposed method is
evaluated on simulated data and real data. Both the qualitative and
quantitative results show that our proposed neural network method can
outperform partial operator expansion-based neural network methods,
neural network denoising methods and traditional methods.
AU Zhang, Shengjie
Shen, Xin
Chen, Xiang
Yu, Ziqi
Ren, Bohan
Yang, Haibo
Zhang, Xiao-Yong
Zhou, Yuan
CQformer: Learning Dynamics Across Slices in Medical Image Segmentation.
Prevalent studies on deep learning-based 3D medical image segmentation
capture the continuous variation across 2D slices mainly via
convolution, Transformer, inter-slice interaction, and time series
models. In this work, via modeling this variation by an ordinary
differential equation (ODE), we propose a cross instance query-guided
Transformer architecture (CQformer) that leverages features from
preceding slices to improve the segmentation performance of subsequent
slices. Its key components include a cross-attention mechanism in an ODE
formulation, which bridges the features of contiguous 2D slices of the
3D volumetric data. In addition, a regression head is employed to
shorten the gap between the bottleneck and the prediction layer.
Extensive experiments on 7 datasets with various modalities (CT, MRI)
and tasks (organ, tissue, and lesion) demonstrate that CQformer
outperforms previous state-of-the-art segmentation algorithms on 6
datasets by 0.44%-2.45%, and achieves the second highest performance of
88.30% on the BTCV dataset. The code will be publicly available after
acceptance.
EI 1558-254X
DA 2024-10-12
UT MEDLINE:39388328
PM 39388328
ER
AU Chen, Zhi
Liu, Yongguo
Zhang, Yun
Zhu, Jiajing
Li, Qiaoqin
Wu, Xindong
Enhanced Multimodal Low-rank Embedding based Feature Selection Model for
Multimodal Alzheimer's Disease Diagnosis.
Identification of Alzheimer's disease (AD) with multimodal neuroimaging
data has been receiving increasing attention. However, the presence of
numerous redundant features and corrupted neuroimages within multimodal
datasets poses significant challenges for existing methods. In this
paper, we propose a feature selection method named Enhanced Multimodal
Low-rank Embedding (EMLE) for multimodal AD diagnosis. Unlike previous
methods utilizing convex relaxations of the ℓ2,0-norm, EMLE exploits an
ℓ2,gamma-norm regularized projection matrix to obtain an embedding
representation and select informative features jointly for each
modality. The ℓ2,gamma-norm, employing an upper-bounded nonconvex
Minimax Concave Penalty (MCP) function to characterize sparsity, offers
a superior approximation for the ℓ2,0-norm compared to other convex
relaxations. Next, a similarity graph is learned based on the
self-expressiveness property to increase the robustness to corrupted
data. As the approximation coefficient vectors of samples from the same
class should be highly correlated, an MCP-induced norm, i.e., the
matrix gamma-norm, is applied to constrain the rank of the graph.
Furthermore, recognizing that diverse modalities should share an
underlying structure related to AD, we establish a consensus graph for
all modalities to unveil intrinsic structures across multiple
modalities. Finally, we fuse the embedding representations of all
modalities into the label space to incorporate supervisory information.
The results of extensive experiments on the Alzheimer's Disease
Neuroimaging Initiative datasets verify the discriminability of the
features selected by EMLE.
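For readers unfamiliar with the Minimax Concave Penalty mentioned in the abstract above, a minimal sketch follows. It uses the commonly cited textbook form of the MCP function and applies it to the Euclidean norm of each matrix row. The exact definition of the ℓ2,gamma-norm used by EMLE is not given in the abstract, so the row-wise construction, the function names, and the default values of `lam` and `gamma` are assumptions for illustration only.

```python
import math


def mcp(x, lam=1.0, gamma=2.0):
    """Standard MCP: lam*|x| - x^2/(2*gamma) up to |x| = gamma*lam, constant afterwards."""
    a = abs(x)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    # Beyond gamma*lam the penalty is capped, which is what makes it upper-bounded.
    return 0.5 * gamma * lam * lam


def row_mcp_penalty(matrix, lam=1.0, gamma=2.0):
    """Assumed row-sparsity penalty: sum of MCP applied to each row's Euclidean norm."""
    total = 0.0
    for row in matrix:
        row_norm = math.sqrt(sum(v * v for v in row))
        total += mcp(row_norm, lam, gamma)
    return total


if __name__ == "__main__":
    # A zero row costs nothing; a large row costs no more than the cap.
    print(row_mcp_penalty([[0.0, 0.0], [3.0, 4.0]]))
```

Because the penalty saturates, large rows are not penalized more than moderately large ones, which is the usual argument for MCP approximating a row-count (ℓ2,0-style) penalty more closely than a convex ℓ2,1-style relaxation.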
AU Li, Wen
Cao, Fuzhi
An, Nan
Wang, Wenli
Wang, Chunhui
Xu, Weinan
Gao, Yang
Ning, Xiaolin
Source Extent Estimation in OPM-MEG: A Two-Stage Champagne Approach.
The accurate estimation of source extent using magnetoencephalography
(MEG) is important for the study of preoperative functional localization
in epilepsy. Conventional source imaging techniques tend to produce
diffuse or focused source estimates that fail to capture the source
extent accurately. To address this issue, we propose a novel method
called the two-stage Champagne approach (TS-Champagne). TS-Champagne
divides source extent estimation into two stages. In the first stage,
the Champagne algorithm with noise learning (Champagne-NL) is employed
to obtain an initial source estimate. In the second stage, spatial basis
functions are constructed from the initial source estimate. These
spatial basis functions consist of potential activation source centers
and their neighbors, and serve as spatial priors, which are incorporated
into Champagne-NL to obtain a final source estimate. We evaluated the
performance of TS-Champagne through numerical simulations. TS-Champagne
achieved more robust performance under various conditions (i.e., varying
source extent, number of sources, signal-to-noise level, and correlation
coefficients between sources) than Champagne-NL and several benchmark
methods. Furthermore, auditory and median nerve stimulation experiments
were conducted using a 31-channel optically pumped magnetometer
(OPM)-MEG system. The validation results indicated that the
reconstructed source activity was spatially and temporally consistent
with the neurophysiological results of previous OPM-MEG studies, further
demonstrating the feasibility of TS-Champagne for practical
applications.
AU Zhou, Huajun
Zhou, Fengtao
Chen, Hao
Cohort-Individual Cooperative Learning for Multimodal Cancer Survival
Analysis.
Recently, we have witnessed impressive achievements in cancer survival
analysis by integrating multimodal data, e.g., pathology images and
genomic profiles. However, the heterogeneity and high dimensionality of
these modalities pose significant challenges for extracting
discriminative representations while maintaining good generalization. In
this paper, we propose a Cohort-individual Cooperative Learning (CCL)
framework to advance cancer survival analysis by combining knowledge
decomposition and cohort guidance. Specifically, first, we propose a
Multimodal Knowledge Decomposition (MKD) module to explicitly decompose
multimodal knowledge into four distinct components: redundancy, synergy,
and the uniqueness of each of the two modalities. Such a comprehensive decomposition
can enlighten the models to perceive easily overlooked yet important
information, facilitating an effective multimodal fusion. Second, we
propose a Cohort Guidance Modeling (CGM) to mitigate the risk of
overfitting task-irrelevant information. It can promote a more
comprehensive and robust understanding of the underlying multimodal
data, while avoiding the pitfalls of overfitting and enhancing the
generalization ability of the model. By combining the knowledge
decomposition and cohort guidance methods, we develop a robust
multimodal survival analysis model with enhanced discrimination and
generalization abilities. Extensive experimental results on five cancer
datasets demonstrate the effectiveness of our model in integrating
multimodal data for survival analysis. The code will be publicly
available soon.
AU Lerendegui, Marcelo
Riemer, Kai
Papageorgiou, Georgios
Wang, Bingxue
Arthur, Lachlan
Chavignon, Arthur
Zhang, Tao
Couture, Olivier
Huang, Pingtong
Ashikuzzaman, Md
Dencks, Stefanie
Dunsby, Chris
Helfield, Brandon
Jensen, Jorgen Arendt
Lisson, Thomas
Lowerison, Matthew R.
Rivaz, Hassan
Samir, Anthony E.
Schmitz, Georg
Schoen, Scott
van Sloun, Ruud
Song, Pengfei
Stevens, Tristan
Yan, Jipeng
Sboros, Vassilis
Tang, Meng-Xing
ULTRA-SR Challenge: Assessment of Ultrasound Localization and TRacking
Algorithms for Super-Resolution Imaging
With the widespread interest and uptake of super-resolution ultrasound
(SRUS) through localization and tracking of microbubbles, also known as
ultrasound localization microscopy (ULM), many localization and tracking
algorithms have been developed. ULM can image many centimeters into
tissue in-vivo and track microvascular flow non-invasively with
sub-diffraction resolution. In a significant community effort, we
organized a challenge, Ultrasound Localization and TRacking Algorithms
for Super-Resolution (ULTRA-SR). The aims of this paper are threefold:
to describe the challenge organization, data generation, and winning
algorithms; to present the metrics and methods for evaluating challenge
entrants; and to report results and findings of the evaluation.
Realistic ultrasound datasets containing microvascular flow for
different clinical ultrasound frequencies were simulated, using vascular
flow physics, acoustic field simulation and nonlinear bubble dynamics
simulation. Based on these datasets, 38 submissions from 24 research
groups were evaluated against ground truth using an evaluation framework
with six metrics, three for localization and three for tracking. In-vivo
mouse brain and human lymph node data were also provided, and
performance assessed by an expert panel. Winning algorithms are
described and discussed. The publicly available data with ground truth
and the defined metrics for both localization and tracking present a
valuable resource for researchers to benchmark algorithms and software,
identify optimized methods/software for their data, and provide insight
into the current limits of the field. In conclusion, the ULTRA-SR challenge
has provided benchmarking data and tools as well as direct comparison
and insights for a number of state-of-the-art localization and
tracking algorithms.
AU Ye, Yiwen
Zhang, Jianpeng
Chen, Ziyang
Xia, Yong
CADS: A Self-supervised Learner via Cross-modal Alignment and Deep
Self-distillation for CT Volume Segmentation.
Self-supervised learning (SSL) has long had great success in advancing
the field of annotation-efficient learning. However, when applied to CT
volume segmentation, most SSL methods suffer from two limitations: they
rarely use the information acquired by different imaging modalities, and
they provide supervision only to the bottleneck encoder
layer. To address both limitations, we design a pretext task to align
the information in each 3D CT volume and the corresponding 2D generated
X-ray image and extend self-distillation to deep self-distillation.
Thus, we propose a self-supervised learner based on Cross-modal
Alignment and Deep Self-distillation (CADS) to improve the encoder's
ability to characterize CT volumes. The cross-modal alignment is a more
challenging pretext task that forces the encoder to learn better image
representation ability. Deep self-distillation provides supervision to
not only the bottleneck layer but also shallow layers, thus boosting the
abilities of both. Comparative experiments show that, during
pre-training, our CADS has lower computational complexity and GPU memory
cost than competing SSL methods. Based on the pre-trained encoder, we
construct PVT-UNet for 3D CT volume segmentation. Our results on seven
downstream tasks indicate that PVT-UNet outperforms state-of-the-art SSL
methods like MOCOv3 and DiRA, as well as prevalent medical image
segmentation methods like nnUNet and CoTr. Code and pre-trained weights
will be available at https://github.com/yeerwen/CADS.
AU Smith, Nathaniel J.
Newton, David T.
Gunderman, David
Hutchins, Gary D.
A Comparison of Arterial Input Function Interpolation Methods for Patlak
Plot Analysis of ⁶⁸Ga-PSMA-11 PET Prostate Cancer Studies
Positron emission tomography (PET) imaging enables quantitative
assessment of tissue physiology. Dynamic pharmacokinetic analysis of PET
images requires accurate estimation of the radiotracer plasma input
function to derive meaningful parameter estimates, and small
discrepancies in parameter estimation can mimic subtle physiologic
tissue variation. This study evaluates the impact of input function
interpolation method on the accuracy of Patlak kinetic parameter
estimation through simulations modeling the pharmacokinetic properties
of [Ga-68]-PSMA-11. This study evaluated both trained and untrained
methods. Although the mean kinetic parameter accuracy was similar across
all interpolation models, the trained node weighting interpolation model
estimated accurate kinetic parameters with reduced overall variability
relative to standard linear interpolation. Trained node weighting
interpolation reduced kinetic parameter estimation variance by a
magnitude approximating the underlying physiologic differences between
normal and diseased prostatic tissue. Overall, this analysis suggests
that trained node weighting improves the reliability of Patlak kinetic
parameter estimation for [Ga-68]-PSMA-11 PET.
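As a rough illustration of the analysis the abstract above refers to, the sketch below fits a Patlak plot in plain Python, using standard piecewise-linear interpolation of a sparsely sampled plasma input function. It assumes the usual Patlak model, C_T(t) = Ki * integral of C_p from 0 to t + V0 * C_p(t) for t >= t_star. The function names, the sample values, and the choice of `t_star` are hypothetical; the trained node weighting interpolation evaluated in the study is not reproduced here.

```python
def linear_interp(t_query, t_nodes, y_nodes):
    """Piecewise-linear interpolation of (t_nodes, y_nodes) at each time in t_query."""
    result = []
    for tq in t_query:
        if tq < t_nodes[0] or tq > t_nodes[-1]:
            raise ValueError("query time outside the sampled range")
        for i in range(1, len(t_nodes)):
            if tq <= t_nodes[i]:
                w = (tq - t_nodes[i - 1]) / (t_nodes[i] - t_nodes[i - 1])
                result.append((1.0 - w) * y_nodes[i - 1] + w * y_nodes[i])
                break
    return result


def cumulative_trapezoid(t, y):
    """Running trapezoidal integral of y over t, starting at zero."""
    out = [0.0]
    for i in range(1, len(t)):
        out.append(out[-1] + 0.5 * (y[i] + y[i - 1]) * (t[i] - t[i - 1]))
    return out


def patlak_fit(t, plasma, tissue, t_star):
    """Least-squares slope (Ki) and intercept (V0) of the Patlak plot for t >= t_star."""
    integral = cumulative_trapezoid(t, plasma)
    xs, ys = [], []
    for ti, cp, ct, icp in zip(t, plasma, tissue, integral):
        if ti >= t_star and cp > 0:
            xs.append(icp / cp)  # "stretched time" axis
            ys.append(ct / cp)   # tissue-to-plasma ratio
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ki = sxy / sxx
    return ki, mean_y - ki * mean_x


if __name__ == "__main__":
    # Hypothetical sparse plasma samples (minutes, arbitrary activity units).
    node_t = [0.0, 1.0, 2.0, 5.0, 10.0, 20.0, 30.0, 60.0]
    node_cp = [11.0, 9.0, 8.0, 6.0, 4.0, 2.5, 1.5, 1.0]
    frame_t = [float(i) for i in range(61)]
    cp = linear_interp(frame_t, node_t, node_cp)
    # Synthetic tissue curve generated from the same model, for demonstration only.
    ct = [0.05 * a + 0.3 * c for a, c in zip(cumulative_trapezoid(frame_t, cp), cp)]
    print(patlak_fit(frame_t, cp, ct, t_star=10.0))
```

Swapping `linear_interp` for another interpolation scheme changes the integrated input function and therefore the fitted slope, which is the sensitivity the study examines.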
AU Ni, Guangming
Wu, Renxiong
Zheng, Fei
Li, Meixuan
Huang, Shaoyan
Ge, Xin
Liu, Linbo
Liu, Yong
Toward Ground-Truth Optical Coherence Tomography via Three-Dimensional
Unsupervised Deep Learning Processing and Data
Optical coherence tomography (OCT) can perform non-invasive
high-resolution three-dimensional (3D) imaging and has been widely used
in biomedical fields, while it is inevitably affected by coherence
speckle noise which degrades OCT imaging performance and restricts its
applications. Here we present a novel speckle-free OCT imaging strategy,
named toward-ground-truth OCT (tGT-OCT), that utilizes unsupervised 3D
deep-learning processing and leverages OCT 3D imaging features to
achieve speckle-free OCT imaging. Specifically, our proposed tGT-OCT
utilizes an unsupervised 3D-convolution deep-learning network trained
using random 3D volumetric data to distinguish and separate speckle from
real structures in 3D imaging volumetric space; moreover, tGT-OCT
effectively further reduces speckle noise and reveals structures that
would otherwise be obscured by speckle noise while preserving spatial
resolution. Results derived from different samples demonstrated the
high-quality speckle-free 3D imaging performance of tGT-OCT and its
advancement beyond the previous state-of-the-art. The code is available
online: https://github.com/Voluntino/tGT-OCT.
AU Haeusele, Jakob
Schmid, Clemens
Viermetz, Manuel
Gustschin, Nikolai
Lasser, Tobias
Koehler, Thomas
Pfeiffer, Franz
Robust Sample Information Retrieval in Dark-Field Computed Tomography
with a Vibrating Talbot-Lau Interferometer.
X-ray computed tomography (CT) is a crucial tool for non-invasive
medical diagnosis that uses differences in materials' attenuation
coefficients to generate contrast and provide 3D information.
Grating-based dark-field-contrast X-ray imaging is an innovative
technique that utilizes small-angle scattering to generate additional
co-registered images with additional microstructural information. While
it is already possible to perform human chest dark-field radiography, it
is assumed that its diagnostic value increases when performed in a
tomographic setup. However, the susceptibility of Talbot-Lau
interferometers to mechanical vibrations coupled with a need to minimize
data acquisition times has hindered its application in clinical routines
and the combination of X-ray dark-field imaging and large field-of-view
(FOV) tomography in the past. In this work, we propose a processing
pipeline to address this issue in a human-sized clinical dark-field CT
prototype. We present the corrective measures that are applied in the
employed processing and reconstruction algorithms to mitigate the
effects of vibrations and deformations of the interferometer gratings.
This is achieved by identifying spatially and temporally variable
vibrations in air reference scans. By translating the found correlations
to the sample scan, we can identify and mitigate relevant fluctuation
modes for scans with arbitrary sample sizes. This approach effectively
eliminates the requirement for sample-free detector area, while still
distinctly separating fluctuation and sample information. As a result,
samples of arbitrary dimensions can be reconstructed without being
affected by vibration artifacts. To demonstrate the viability of the
technique for human-scale objects, we present reconstructions of an
anthropomorphic thorax phantom.
AU Wang, Yan
Zhen, Liangli
Tan, Tien-En
Fu, Huazhu
Feng, Yangqin
Wang, Zizhou
Xu, Xinxing
Goh, Rick Siow Mong
Ng, Yipin
Calhoun, Claire
Tan, Gavin Siew Wei
Sun, Jennifer K.
Liu, Yong
Ting, Daniel Shu Wei
Geometric Correspondence-Based Multimodal Learning for Ophthalmic Image
Analysis
Color fundus photography (CFP) and optical coherence tomography (OCT)
images are two of the most widely used modalities in the clinical
diagnosis and management of retinal diseases. Despite the widespread use
of multimodal imaging in clinical practice, few methods for automated
diagnosis of eye diseases utilize correlated and complementary
information from multiple modalities effectively. This paper explores
how to leverage the information from CFP and OCT images to improve the
automated diagnosis of retinal diseases. We propose a novel multimodal
learning method, named geometric correspondence-based multimodal
learning network (GeCoM-Net), to achieve the fusion of CFP and OCT
images. Specifically, inspired by clinical observations, we consider the
geometric correspondence between the OCT slice and the CFP region to
learn the correlated features of the two modalities for robust fusion.
Furthermore, we design a new feature selection strategy to extract
discriminative OCT representations by automatically selecting the
important feature maps from OCT slices. Unlike the existing multimodal
learning methods, GeCoM-Net is the first method that formulates the
geometric relationships between the OCT slice and the corresponding
region of the CFP image explicitly for CFP and OCT fusion. Experiments
have been conducted on a large-scale private dataset and a publicly
available dataset to evaluate the effectiveness of GeCoM-Net for
diagnosing diabetic macular edema (DME), impaired visual acuity (VA) and
glaucoma. The empirical results show that our method outperforms the
current state-of-the-art multimodal learning methods, improving the
AUROC score by 0.4%, 1.9%, and 2.9% for DME, VA, and glaucoma detection,
respectively.
AU Huang, Kun
Ma, Xiao
Zhang, Zetian
Zhang, Yuhan
Yuan, Songtao
Fu, Huazhu
Chen, Qiang
Diverse Data Generation for Retinal Layer Segmentation with Potential
Structure Modelling.
Accurate retinal layer segmentation on optical coherence tomography
(OCT) images is hampered by the challenges of collecting OCT images with
diverse pathological characterization and balanced distribution. Current
generative models can produce highly realistic images and corresponding
labels without quantitative limitations by fitting distributions of real
collected data. Nevertheless, the diversity of their generated data is
still limited due to the inherent imbalance of training data. To address
these issues, we propose an image-label pair generation framework that
generates diverse and balanced potential data from imbalanced real
samples. Specifically, the framework first generates diverse layer
masks, and then generates plausible OCT images corresponding to these
layer masks using two customized diffusion probabilistic models
respectively. To learn from imbalanced data and facilitate balanced
generation, we introduce pathological-related conditions to guide the
generation processes. To enhance the diversity of the generated
image-label pairs, we propose a potential structure modeling technique
that transfers the knowledge of diverse sub-structures from mildly or
non-pathological samples to highly pathological samples. We conducted
extensive experiments on two public datasets for retinal layer
segmentation. Firstly, our method generates OCT images with higher image
quality and diversity compared to other generative methods. Furthermore,
based on the extensive training with the generated OCT images,
downstream retinal layer segmentation tasks demonstrate improved
results. The code is publicly available at:
https://github.com/nicetomeetu21/GenPSM.
AU Chen, Jiachen
Li, Mengyang
Han, Hu
Zhao, Zhiming
Chen, Xilin
SurgNet: Self-Supervised Pretraining With Semantic Consistency for
Vessel and Instrument Segmentation in Surgical Images
Blood vessel and surgical instrument segmentation is a fundamental
technique for robot-assisted surgical navigation. Despite the
significant progress in natural image segmentation, surgical image-based
vessel and instrument segmentation is rarely studied. In this work, we
propose a novel self-supervised pretraining method (SurgNet) that can
effectively learn representative vessel and instrument features from
unlabeled surgical images. As a result, it allows for precise and
efficient segmentation of vessels and instruments with only a small
amount of labeled data. Specifically, we first construct a region
adjacency graph (RAG) based on local semantic consistency in unlabeled
surgical images and use it as a self-supervision signal for pseudo-mask
segmentation. We then use the pseudo-mask to perform guided masked image
modeling (GMIM) to learn representations that integrate structural
information of intraoperative objectives more effectively. Our
pretrained model, paired with various segmentation methods, can be
applied to perform vessel and instrument segmentation accurately using
limited labeled data for fine-tuning. We build an Intraoperative Vessel
and Instrument Segmentation (IVIS) dataset, comprising approximately 3
million unlabeled images and over 4,000 labeled images with manual
vessel and instrument annotations to evaluate the effectiveness of our
self-supervised pretraining method. We also evaluated the
generalizability of our method to similar tasks using two public
datasets. The results demonstrate that our approach outperforms the
current state-of-the-art (SOTA) self-supervised representation learning
methods in various surgical image segmentation tasks.
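To make the region adjacency graph (RAG) mentioned in the abstract above more concrete, here is a minimal sketch of the data structure: given a 2D map of region labels, it records which regions touch each other. How SurgNet derives regions from local semantic consistency, and how it uses the graph as a self-supervision signal, is not specified in the abstract; the label map, the 4-neighbour rule, and the function name are assumptions for illustration.

```python
def region_adjacency_graph(labels):
    """Map each region label to the set of labels it touches (4-connectivity)."""
    graph = {}
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            graph.setdefault(a, set())
            # Checking only the right and lower neighbours visits every pixel pair once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        graph[a].add(b)
                        graph.setdefault(b, set()).add(a)
    return graph


if __name__ == "__main__":
    # Hypothetical 3x3 over-segmentation with three regions.
    toy_labels = [
        [0, 0, 1],
        [0, 2, 1],
        [2, 2, 1],
    ]
    print(region_adjacency_graph(toy_labels))
```

In practice the label map would come from an over-segmentation of the image, and edges could carry weights such as feature similarity between neighbouring regions.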
AU Lin, Chen
Zhu, Zhenfeng
Zhao, Yawei
Zhang, Ying
He, Kunlun
Zhao, Yao
SGT++: Improved Scene Graph-Guided Transformer for Surgical
Report Generation
Automatically recording surgical procedures and generating surgical
reports are crucial for alleviating surgeons' workload and enabling them
to concentrate more on the operations. Despite some achievements,
previous works still have two issues: 1) failure to model
the interactive relationship between surgical instruments and tissue;
and 2) neglect of fine-grained differences within different surgical
images in the same surgery. To address these two issues, we propose an
improved scene graph-guided Transformer, named SGT++, to
generate more accurate surgical reports, in which the complex
interactions between surgical instruments and tissue are learnt from
both explicit and implicit perspectives. Specifically, to facilitate the
understanding of the surgical scene graph under a graph learning
framework, a simple yet effective approach is proposed for homogenizing
the input heterogeneous scene graph. For the homogeneous scene graph
that contains explicit structured and fine-grained semantic
relationships, we design an attention-induced graph transformer for node
aggregation via an explicit relation-aware encoder. In addition, to
characterize the implicit relationships about the instrument, tissue,
and the interaction between them, the implicit relational attention is
proposed to take full advantage of the prior knowledge from the
interactional prototype memory. The learnt explicit and implicit
relation-aware representations are then coalesced to obtain the
fused relation-aware representations used to generate reports.
Comprehensive experiments on two surgical datasets show that the
proposed SGT++ model achieves state-of-the-art results.
AU Grohl, Janek
Else, Thomas R.
Hacker, Lina
Bunce, Ellie V.
Sweeney, Paul W.
Bohndiek, Sarah E.
Moving Beyond Simulation: Data-Driven Quantitative Photoacoustic Imaging
Using Tissue-Mimicking Phantoms
Accurate measurement of optical absorption coefficients from
photoacoustic imaging (PAI) data would enable direct mapping of
molecular concentrations, providing vital clinical insight. The
ill-posed nature of the problem of absorption coefficient recovery has
prohibited PAI from achieving this goal in living systems due to the
domain gap between simulation and experiment. To bridge this gap, we
introduce a collection of experimentally well-characterised imaging
phantoms and their digital twins. This first-of-a-kind phantom data set
enables supervised training of a U-Net on experimental data for
pixel-wise estimation of absorption coefficients. We show that training
on simulated data results in artefacts and biases in the estimates,
reinforcing the existence of a domain gap between simulation and
experiment. Training on experimentally acquired data, however, yielded
more accurate and robust estimates of optical absorption coefficients.
We compare the results to fluence correction with a Monte Carlo model
from reference optical properties of the materials, which yields a
quantification error of approximately 20%. Application of the trained
U-Nets to a blood flow phantom demonstrated spectral biases when
training on simulated data, while application to a mouse model
highlighted the ability of both learning-based approaches to recover the
depth-dependent loss of signal intensity. We demonstrate that training
on experimental phantoms can restore the correlation of signal
amplitudes measured in depth. While the absolute quantification error
remains high and further improvements are needed, our results highlight
the promise of deep learning to advance quantitative PAI.
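The fluence-correction baseline mentioned in this abstract can be illustrated with the standard photoacoustic forward relation p0 = Gamma * mu_a * Phi, which gives mu_a = p0 / (Gamma * Phi) per pixel. The following is a minimal sketch under stated assumptions: the function name `fluence_correct`, the constant Grueneisen parameter, and the nested-list image layout are hypothetical choices for illustration and are not taken from the paper, where the fluence comes from a Monte Carlo model.

```python
def fluence_correct(p0, fluence, grueneisen=1.0, eps=1e-12):
    """Estimate absorption per pixel as mu_a = p0 / (Gamma * Phi).

    p0       : 2D list of initial pressure values (hypothetical input layout)
    fluence  : 2D list of light fluence values with the same shape
    grueneisen : assumed constant Grueneisen parameter (an assumption)
    eps      : floor on the fluence to avoid division by zero
    """
    return [
        [p / (grueneisen * max(f, eps)) for p, f in zip(p_row, f_row)]
        for p_row, f_row in zip(p0, fluence)
    ]
```

The division amplifies noise wherever the fluence is small, which is one reason deep pixels are hard to quantify with this kind of correction.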
AU Sewani, Alykhan
Roa, Carlos-Felipe
Zhou, James J.
Alawneh, Yara
Quadri, Amaar
Gilliland-Rocque, Rene
Cherin, Emmanuel
Dueck, Andrew
Demore, Christine
Wright, Graham
Tavallaei, M. Ali
The CathEye: A Forward-Looking Ultrasound Catheter for Image-Guided
Cardiovascular Procedures
Catheter-based procedures are typically guided by X-ray, which suffers
from low soft tissue contrast and only provides 2D projection images of
a 3D volume. Intravascular ultrasound (IVUS) can serve as a
complementary imaging technique. Forward-viewing catheters are useful
for visualizing obstructions along the path of the catheter. The CathEye
system mechanically steers a single-element transducer to generate a
forward-looking surface reconstruction from an irregularly spaced 2-D
scan pattern. The steerable catheter leverages an expandable frame with
cables to manipulate the distal end independently of vessel tortuosity.
The tip position is estimated by measuring the cable displacements and
used to create surface reconstructions of the imaging workspace with the
single-element transducer. CathEye's imaging capabilities were tested
with an agar phantom and an ex vivo chronic total occlusion (CTO) sample
while the catheter was confined to various tortuous paths. The CathEye
maintained similar scan patterns regardless of path tortuosity and was
able to recreate major features of the imaging targets, such as holes
and extrusions. The feasibility of forward-looking IVUS with the CathEye
is demonstrated in this study. The CathEye mechanism can be applied to
other imaging modalities with field-of-view (FOV) limitations and
represents the basis for an interventional device fully integrated with
image guidance.
AU Fu, Junhu
Chen, Ke
Dou, Qi
Gao, Yun
He, Yiping
Zhou, Pinghong
Lin, Shengli
Wang, Yuanyuan
Guo, Yi
IPNet: An Interpretable Network with Progressive Loss for Whole-stage
Colorectal Disease Diagnosis.
Colorectal cancer plays a dominant role in cancer-related deaths,
primarily due to the absence of obvious early-stage symptoms.
Whole-stage colorectal disease diagnosis is crucial for assessing lesion
evolution and determining treatment plans. However, locality difference
and disease progression lead to intra-class disparities and inter-class
similarities for colorectal lesion representation. In addition,
interpretable algorithms explaining the lesion progression are still
lacking, making the prediction process a "black box". In this paper, we
propose IPNet, a dual-branch interpretable network with progressive loss
for whole-stage colorectal disease diagnosis. The dual-branch
architecture captures unbiased features representing diverse localities
to suppress intra-class variation. The progressive loss function
considers inter-class relationship, using prior knowledge of disease
evolution to guide classification. Furthermore, a novel Grain-CAM is
designed to interpret IPNet by visualizing pixel-wise attention maps
from shallow to deep layers, providing regions semantically related to
IPNet's progressive classification. We conducted whole-stage diagnosis
on two image modalities, i.e., colorectal lesion classification on
129,893 endoscopic optical images and rectal tumor T-staging on 11,072
endoscopic ultrasound images. IPNet surpasses other state-of-the-art
algorithms, achieving accuracies of 93.15% and 89.62%, respectively. In
particular, it establishes effective decision boundaries for
challenging distinctions such as polyp vs. adenoma and T2 vs. T3. The
results demonstrate an explainable approach to colorectal lesion
classification at the whole-stage level and explore, for the first
time, rectal tumor T-staging by endoscopic ultrasound. IPNet is expected to be further
applied, assisting physicians in whole-stage disease diagnosis and
enhancing diagnostic interpretability.
EI 1558-254X
DA 2024-09-21
UT MEDLINE:39298304
PM 39298304
ER
AU Zhang, Yirui
Zou, Yanni
Liu, Peter X
Point Cloud Registration in Laparoscopic Liver Surgery Using Keypoint
Correspondence Registration Network.
Laparoscopic liver surgery is a newly developed minimally invasive
technique and represents an inevitable trend in the future development
of surgical methods. By using augmented reality (AR) technology to
overlay preoperative CT models with intraoperative laparoscopic videos,
surgeons can accurately locate blood vessels and tumors, significantly
enhancing the safety and precision of surgeries. Point cloud
registration technology is key to achieving this effect. However, there
are two major challenges in registering the CT model with the point
cloud surface reconstructed from intraoperative laparoscopy. First, the
surface features of the organ are not prominent. Second, due to the
limited field of view of the laparoscope, the reconstructed surface
typically represents only a very small portion of the entire organ. To
address these issues, this paper proposes the keypoint correspondence
registration network (KCR-Net). This network first uses the neighborhood
feature fusion module (NFFM) to aggregate and interact features from
different regions and structures within a pair of point clouds to obtain
comprehensive feature representations. Then, through correspondence
generation, it directly generates keypoints and their corresponding
weights, with keypoints located in the common structures of the point
clouds to be registered, and corresponding weights learned automatically
by the network. This approach enables accurate point cloud registration
even under conditions of extremely low overlap. Experiments conducted on
the ModelNet40, 3Dircadb, and DePoLL datasets demonstrate that our method achieves
excellent registration accuracy and is capable of meeting the
requirements of real-world scenarios.
AU Kaji, Shizuo
Tanabe, Naoya
Maetani, Tomoki
Shiraishi, Yusuke
Sakamoto, Ryo
Oguma, Tsuyoshi
Suzuki, Katsuhiro
Terada, Kunihiko
Fukui, Motonari
Muro, Shigeo
Sato, Susumu
Hirai, Toyohiro
Quantification of Airway Structures by Persistent Homology
We propose two types of novel morphological metrics for quantifying the
geometry of tubular structures on computed tomography (CT) images. We
apply our metrics to identify irregularities in the airway of patients
with chronic obstructive pulmonary disease (COPD) and demonstrate that
they provide complementary information to the conventional metrics used
to assess COPD, such as the tissue density distribution in lung
parenchyma and the wall area ratio of the segmented airway. The
three-dimensional shape of the airway and its abstraction as a rooted
tree with the root at the tracheal carina are automatically extracted
from a lung CT volume, and the two metrics are computed based on a
mathematical tool called persistent homology; treeH0 quantifies the
distribution of branch lengths to assess the complexity of the tree-like
structure and radialH0 quantifies the irregularities in the luminal
radius along the airway. We show our metrics are associated with
clinical outcomes.
AU Chen, Haomin
Dreizin, David
Gomez, Catalina
Zapaishchykova, Anna
Unberath, Mathias
Interpretable Severity Scoring of Pelvic Trauma Through Automated
Fracture Detection and Bayesian Inference.
Pelvic ring disruptions result from blunt injury mechanisms and are
potentially lethal mainly due to associated injuries and massive pelvic
hemorrhage. The severity of pelvic fractures in trauma victims is
frequently assessed by grading the fracture according to the Tile AO/OTA
classification in whole-body Computed Tomography (CT) scans. Due to the
high volume of whole-body CT scans generated in trauma centers, the
overall information content of a single whole-body CT scan and low
manual CT reading speed, an automatic approach to Tile classification
would provide substantial value, e.g., to prioritize the reading
sequence of the trauma radiologists or enable them to focus on other
major injuries in multi-trauma patients. In such a high-stakes scenario,
an automated method for Tile grading should ideally be transparent such
that the symbolic information provided by the method follows the same
logic a radiologist or orthopedic surgeon would use to determine the
fracture grade. This paper introduces an automated yet interpretable
pelvic trauma decision support system to assist radiologists in fracture
detection and Tile grading. To achieve interpretability despite
processing high-dimensional whole-body CT images, we design a
neurosymbolic algorithm that operates similarly to human interpretation
of CT scans. The algorithm first detects relevant pelvic fractures on
CTs with high specificity using Faster-RCNN. To generate robust fracture
detections and associated detection (un)certainties, we perform
test-time augmentation of the CT scans to apply fracture detection
several times in a self-ensembling approach. The fracture detections are
interpreted using a structural causal model based on clinical best
practices to infer an initial Tile grade. We apply a Bayesian causal
model to recover likely co-occurring fractures that may have been
rejected initially due to the highly specific operating point of the
detector, resulting in an updated list of detected fractures and
corresponding final Tile grade. Our method is transparent in that it
provides fracture location and types, as well as information on
important counterfactuals that would invalidate the system's
recommendation. Our approach achieves an AUC of 0.89/0.74 for
translational and rotational instability, which is comparable to
radiologist performance. Despite being designed for human-machine
teaming, our approach does not compromise on performance compared to
previous black-box methods.
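The self-ensembling step described in this abstract (running the detector several times on augmented copies of a scan to obtain detections with associated certainties) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `detector` and `augmentations` are hypothetical callables, and scoring each fracture label by the fraction of augmented runs in which it appears is an assumed, simple certainty measure.

```python
from collections import Counter


def self_ensemble(detector, scan, augmentations):
    """Run a detector on several augmented copies of one scan.

    detector      : hypothetical callable, scan -> iterable of fracture labels
    augmentations : non-empty list of hypothetical callables, scan -> scan
    Returns a dict mapping each label to the fraction of runs that detected it.
    """
    if not augmentations:
        raise ValueError("at least one augmentation is required")
    counts = Counter()
    for augment in augmentations:
        # A set ensures each label is counted at most once per run.
        counts.update(set(detector(augment(scan))))
    runs = len(augmentations)
    return {label: count / runs for label, count in counts.items()}
```

Labels with a low fraction are the natural candidates for the later Bayesian step that recovers likely co-occurring fractures rejected at a highly specific operating point.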
AU Zhou, Lifang
Jiang, Yu
Li, Weisheng
Hu, Jun
Zheng, Shenhai
Shape-Scale Co-Awareness Network for 3D Brain Tumor Segmentation
Accurate segmentation of brain tumors is important in clinical
practice. Convolutional Neural Network (CNN)-based methods have made
great progress in brain tumor segmentation due to powerful local
modeling ability. However, brain tumors are frequently pattern-agnostic,
i.e., variable in shape, size and location, which cannot be effectively
matched by traditional CNN-based methods with local and regular
receptive fields. To address the above issues, we propose a shape-scale
co-awareness network (S2CA-Net) for brain tumor segmentation, which can
efficiently learn shape-aware and scale-aware features simultaneously to
enhance pattern-agnostic representations. Primarily, three key
components are proposed to accomplish the co-awareness of shape and
scale. The Local-Global Scale Mixer (LGSM) decouples the extraction of
local and global context by adopting the CNN-Former parallel structure,
which contributes to obtaining finer hierarchical features. The
Multi-level Context Aggregator (MCA) enriches the scale diversity of
input patches by modeling global features across multiple receptive
fields. The Multi-Scale Attentive Deformable Convolution (MS-ADC) learns
the target deformation based on the multiscale inputs, which motivates
the network to enforce feature constraints both in terms of scale and
shape for optimal feature matching. Overall, LGSM and MCA focus on
enhancing the scale-awareness of the network to cope with the size and
location variations, while MS-ADC focuses on capturing deformation
information for optimal shape matching. Finally, their effective
integration prompts the network to perceive variations in shape and
scale simultaneously, which can robustly tackle the variations in
patterns of brain tumors. The experimental results on BraTS 2019, BraTS
2020, MSD BTS Task and BraTS2023-MEN show that S2CA-Net has superior
overall performance in accuracy and efficiency compared to other
state-of-the-art methods. Code: https://github.com/jiangyu945/S2CA-Net.
C1 Chongqing Univ Posts & Telecommun, Key Lab Image Cognit, Chongqing
400065, Peoples R China
C1 Chongqing Univ Posts & Telecommun, Coll Software, Chongqing 400065,
Peoples R China
C1 Guizhou Univ, Key Lab Adv Mfg Technol, Minist Educ, Guiyang 550025,
Guizhou, Peoples R China
C1 Third Mil Med Univ, Southwest Hosp, Dept Neurol, Chongqing 400065,
Peoples R China
SN 0278-0062
EI 1558-254X
DA 2024-07-22
UT WOS:001263692100005
PM 38386578
ER
AU Wang, Puyang
Guo, Dazhou
Zheng, Dandan
Zhang, Minghui
Yu, Haogang
Sun, Xin
Ge, Jia
Gu, Yun
Lu, Le
Ye, Xianghua
Jin, Dakai
Accurate Airway Tree Segmentation in CT Scans via Anatomy-aware
Multi-class Segmentation and Topology-guided Iterative Learning.
Intrathoracic airway segmentation in computed tomography is a
prerequisite for various respiratory disease analyses such as chronic
obstructive pulmonary disease, asthma and lung cancer. Due to the low
imaging contrast and noise exacerbated at peripheral branches, the
topological complexity, and the intra-class imbalance of the airway
tree, it remains challenging for deep learning-based methods to segment
the complete airway tree (especially when extracting deeper branches).
Unlike other organs with simpler shapes or topology, the airway's
complex tree structure imposes a heavy burden on generating the "ground
truth" label (up to 7 or 3 hours of manual or semi-automatic annotation
per case, respectively). Most of the existing airway datasets are incompletely
labeled/annotated, thus limiting the completeness of computer-segmented
airway. In this paper, we propose a new anatomy-aware multi-class airway
segmentation method enhanced by topology-guided iterative self-learning.
Based on the natural airway anatomy, we formulate a simple yet highly
effective anatomy-aware multi-class segmentation task to intuitively
handle the severe intra-class imbalance of the airway. To solve the
incomplete labeling issue, we propose a tailored iterative self-learning
scheme to segment toward the complete airway tree. To generate
pseudo-labels with higher sensitivity (while retaining similar
specificity), we introduce a novel breakage attention map and design a
topology-guided pseudo-label refinement method that iteratively connects
the broken branches commonly present in the initial pseudo-labels. Extensive
experiments have been conducted on four datasets including two public
challenges. The proposed method achieves the top performance in both the
EXACT'09 challenge (by average score) and the ATM'22 challenge (by
weighted average score). On a public BAS dataset and a private lung cancer
dataset, our method significantly improves on previous leading approaches
by extracting at least (absolute) 6.1% more detected tree length and
5.2% more tree branches, while maintaining comparable precision.
AU Liu, Jiaxuan
Zhang, Hui
Tian, Jiang-Huai
Su, Yingjian
Chen, Yurong
Wang, Yaonan
R2D2-GAN: Robust Dual Discriminator Generative Adversarial Network for
Microscopy Hyperspectral Image Super-Resolution.
High-resolution microscopy hyperspectral (HS) images can provide highly
detailed spatial and spectral information, enabling the identification
and analysis of biological tissues at a microscale level. Recently,
significant efforts have been devoted to enhancing the resolution of HS
images by leveraging high spatial resolution multispectral (MS) images.
However, the inherent hardware constraints lead to a significant
distribution gap between HS and MS images, posing challenges for image
super-resolution within biomedical domains. This discrepancy may arise
from various factors, including variations in camera imaging principles
(e.g., snapshot and push-broom imaging), shooting positions, and the
presence of noise interference. To address these challenges, we
introduced a unique unsupervised super-resolution framework named
R2D2-GAN. This framework utilizes a generative adversarial network (GAN)
to efficiently merge the two data modalities and improve the resolution
of microscopy HS images. Traditionally, supervised approaches have
relied on intuitive and sensitive loss functions, such as mean squared
error (MSE). Our method, trained in a real-world unsupervised setting,
benefits from exploiting consistent information across the two
modalities. It employs a game-theoretic strategy and dynamic adversarial
loss, rather than relying solely on fixed training strategies for
reconstruction loss. Furthermore, we have augmented our proposed model
with a central consistency regularization (CCR) module, aiming to
further enhance the robustness of the R2D2-GAN. Our experimental results
show that the proposed method is accurate and robust for
super-resolution images. We specifically tested our proposed method on
both a real and a synthetic dataset, obtaining promising results in
comparison to other state-of-the-art methods. Our code and datasets are
accessible through Multimedia Content.
AU Zhang, Zhenxuan
Yu, Chengjin
Zhang, Heye
Gao, Zhifan
Embedding Tasks Into the Latent Space: Cross-Space Consistency for
Multi-Dimensional Analysis in Echocardiography
Multi-dimensional analysis in echocardiography has attracted attention
due to its potential for clinical indices quantification and
computer-aided diagnosis. It can utilize various information to provide
the estimation of multiple cardiac indices. However, it still has the
challenge of inter-task conflict. This is owing to regional confusion,
global abnormalities, and time-accumulated errors. Task mapping methods
have the potential to address inter-task conflict. However, they may
overlook the inherent differences between tasks, especially for
multi-level tasks (e.g., pixel-level, image-level, and sequence-level
tasks). This may lead to inappropriate local and spurious task
constraints. We propose cross-space consistency (CSC) to overcome the
challenge. The CSC embeds multi-level tasks into the same level to reduce
inherent task differences. This allows multi-level task features to be
consistent in a unified latent space. The latent space extracts
task-common features and constrains the distance in these features. This
constrains the task weight region that satisfies multiple task
conditions. Extensive experiments compare the CSC with fifteen
state-of-the-art echocardiographic analysis methods on five datasets
(10,908 patients). The results show that the CSC can provide left
ventricular (LV) segmentation (DSC = 0.932), keypoint detection (MAE =
3.06 mm), and keyframe identification (accuracy = 0.943). These results
demonstrate that our method can provide a multi-dimensional analysis of
cardiac function and is robust in large-scale datasets.
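The segmentation figure quoted in this abstract is a Dice similarity coefficient, DSC = 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a reference mask B. A minimal sketch of that metric follows, assuming flattened binary masks stored as Python sequences; the function name and the convention of returning 1.0 for two empty masks are illustrative choices, not details from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground size of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total
```

A DSC of 0.932 therefore means that the overlap, counted twice, amounts to 93.2% of the combined foreground area of prediction and reference.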
AU Li, Xing
Jing, Kaili
Yang, Yan
Wang, Yongbo
Ma, Jianhua
Zheng, Hairong
Xu, Zongben
Noise-Generating and Imaging Mechanism Inspired Implicit Regularization
Learning Network for Low Dose CT Reconstruction
Low-dose computed tomography (LDCT) helps to reduce radiation risks in
CT scanning while maintaining image quality, which involves a consistent
pursuit of lower incident rays and higher reconstruction performance.
Although deep learning approaches have achieved encouraging success in
LDCT reconstruction, most of them treat the task as a general inverse
problem in either the image domain or the dual (sinogram and image)
domains. Such frameworks have not considered the original noise
generation of the projection data and suffer from limited performance
improvement for the LDCT task. In this paper, we propose a novel
reconstruction model based on noise-generating and imaging mechanism in
full-domain, which fully considers the statistical properties of
intrinsic noises in LDCT and prior information in sinogram and image
domains. To solve the model, we propose an optimization algorithm based
on the proximal gradient technique. Specifically, we derive the
approximate solutions of the integer programming problem on the
projection data theoretically. Instead of hand-crafting the sinogram and
image regularizers, we propose to unroll the optimization algorithm to
be a deep network. The network implicitly learns the proximal operators
of sinogram and image regularizers with two deep neural networks,
providing a more interpretable and effective reconstruction procedure.
Numerical results demonstrate that our proposed method achieves
improvements of > 2.9 dB in peak signal-to-noise ratio and > 1.4% in the
structural similarity metric, and reductions of > 9 HU in root mean
square error, over current state-of-the-art LDCT methods.
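The proximal gradient technique named in this abstract alternates a gradient step on a smooth data term f with a proximal step on a regularizer g: x_{k+1} = prox_{step*g}(x_k - step * grad f(x_k)). The sketch below is a toy illustration under loud assumptions: the paper learns its proximal operators with deep networks and works on CT projection data, whereas here the proximal operator is a hand-written L1 soft-threshold and the data term is a simple quadratic f(x) = 0.5 * ||x - y||^2. All names are hypothetical.

```python
def soft_threshold(values, threshold):
    """Proximal operator of threshold * ||x||_1 (element-wise shrinkage)."""
    return [
        max(abs(v) - threshold, 0.0) * (1.0 if v > 0 else -1.0)
        for v in values
    ]


def proximal_gradient(grad_f, prox_g, x0, step, iterations):
    """Iterate x <- prox_g(x - step * grad_f(x), step)."""
    x = list(x0)
    for _ in range(iterations):
        gradient = grad_f(x)
        x = prox_g([xi - step * gi for xi, gi in zip(x, gradient)], step)
    return x


# Toy problem (stand-in for the real model): minimize
# 0.5 * ||x - y||^2 + lam * ||x||_1, whose minimizer is soft_threshold(y, lam).
y = [3.0, -0.5, 1.0]
lam = 1.0
solution = proximal_gradient(
    grad_f=lambda x: [xi - yi for xi, yi in zip(x, y)],
    prox_g=lambda v, step: soft_threshold(v, step * lam),
    x0=[0.0, 0.0, 0.0],
    step=0.5,
    iterations=60,
)
```

Unrolling, in this picture, means fixing the number of iterations and replacing `prox_g` in each iteration with a trainable network, so the regularizer is learned rather than hand-crafted.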
AU Onishi, Yuya
Hashimoto, Fumio
Ote, Kibo
Ota, Ryosuke
Whole Reconstruction-Free System Design for Direct Positron Emission
Imaging From Image Generation to Attenuation Correction
Direct positron emission imaging (dPEI), which does not require a
mathematical reconstruction step, is a next-generation molecular imaging
modality. To maximize the practical applicability of the dPEI system to
clinical practice, we introduce a novel reconstruction-free
image-formation method called direct mu(Compton) imaging, which directly
localizes the interaction position of Compton scattering from the
annihilation photons in a three-dimensional space by utilizing the same
compact geometry as that for dPEI, involving ultrafast time-of-flight
radiation detectors. This unique imaging method not only provides the
anatomical information about an object but can also be applied to
attenuation correction of dPEI images. Evaluations through Monte Carlo
simulation showed that functional and anatomical hybrid images can be
acquired using this multimodal imaging system. By fusing the images, it
is possible to simultaneously access various object data, which ensures
the synergistic effect of the two imaging methodologies. In addition,
attenuation correction improves the quantification of dPEI images. The
realization of the whole reconstruction-free imaging system from image
generation to quantitative correction provides a new perspective in
molecular imaging.
AU Ta, Kevinminh
Ahn, Shawn S.
Thorn, Stephanie L.
Stendahl, John C.
Zhang, Xiaoran
Langdon, Jonathan
Staib, Lawrence H.
Sinusas, Albert J.
Duncan, James S.
Multi-Task Learning for Motion Analysis and Segmentation in 3D
Echocardiography
Characterizing left ventricular deformation and strain using 3D+time
echocardiography provides useful insights into cardiac function and can
be used to detect and localize myocardial injury. To achieve this, it is
imperative to obtain accurate motion estimates of the left ventricle. In
many strain analysis pipelines, this step is often accompanied by a
separate segmentation step; however, recent works have shown that the two
tasks are highly related and can be complementary when optimized jointly. In
this work, we present a multi-task learning network that can
simultaneously segment the left ventricle and track its motion between
multiple time frames. Two task-specific networks are trained using a
composite loss function. Cross-stitch units combine the activations of
these networks by learning shared representations between the tasks at
different levels. We also propose a novel shape-consistency unit that
encourages motion propagated segmentations to match directly predicted
segmentations. Using a combined synthetic and in-vivo 3D
echocardiography dataset, we demonstrate that our proposed model can
achieve excellent estimates of left ventricular motion displacement and
myocardial segmentation. Additionally, we observe strong correlation of
our image-based strain measurements with crystal-based strain
measurements as well as good correspondence with SPECT perfusion
mappings. Finally, we demonstrate the clinical utility of the
segmentation masks in estimating ejection fraction and sphericity
indices that correspond well with benchmark measurements.
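The cross-stitch units mentioned in this abstract combine the activations of two task-specific networks through a small mixing matrix, so that each task's next layer receives a weighted sum of both tasks' features. The sketch below shows that linear combination for a single pair of activation vectors. It is an illustration only: in a real network the mixing weights are learned parameters, while here `alpha` is a fixed, hypothetical 2x2 matrix and the activations are plain Python lists.

```python
def cross_stitch(act_a, act_b, alpha):
    """Mix two tasks' activations with a 2x2 weight matrix.

    act_a, act_b : equal-length activation vectors from tasks A and B
    alpha        : ((a_aa, a_ab), (a_ba, a_bb)); fixed here, learned in practice
    Returns the mixed activations passed on to task A and task B.
    """
    if len(act_a) != len(act_b):
        raise ValueError("activations must have the same length")
    (a_aa, a_ab), (a_ba, a_bb) = alpha
    out_a = [a_aa * xa + a_ab * xb for xa, xb in zip(act_a, act_b)]
    out_b = [a_ba * xa + a_bb * xb for xa, xb in zip(act_a, act_b)]
    return out_a, out_b
```

With `alpha` equal to the identity matrix the two networks stay independent; off-diagonal weights control how much each task borrows from the other.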
AU Liang, Yinhao
Tang, Wenjie
Wang, Ting
Ng, Wing W. Y.
Chen, Siyi
Jiang, Kuiming
Wei, Xinhua
Jiang, Xinqing
Guo, Yuan
HRadNet: A Hierarchical Radiomics-Based Network for Multicenter Breast
Cancer Molecular Subtypes Prediction
Breast cancer is a heterogeneous disease, where molecular subtypes of
breast cancer are closely related to the treatment and prognosis.
Therefore, the goal of this work is to differentiate between luminal and
non-luminal subtypes of breast cancer. The hierarchical radiomics
network (HRadNet) is proposed for breast cancer molecular subtypes
prediction based on dynamic contrast-enhanced magnetic resonance
imaging. HRadNet fuses multilayer features with the metadata of images
to take advantage of conventional radiomics methods and general
convolutional neural networks. A two-stage training mechanism is adopted
to improve the generalization capability of the network for multicenter
breast cancer data. The ablation study shows the effectiveness of each
component of HRadNet. Furthermore, the influence of features from
different layers and of metadata fusion is analyzed; the analysis shows
that selecting certain layers of features for a specified domain can
further improve performance. Experimental results on three data
sets from different devices demonstrate the effectiveness of the
proposed network. HRadNet also performs well when transferred to
other domains without fine-tuning.
AU Ahmadi, N.
Tsang, M. Y.
Gu, A. N.
Tsang, T. S. M.
Abolmaesumi, P.
Transformer-Based Spatio-Temporal Analysis for Classification of Aortic
Stenosis Severity From Echocardiography Cine Series
Aortic stenosis (AS) is characterized by restricted motion and
calcification of the aortic valve and is the deadliest valvular cardiac
disease. Assessment of AS severity is typically done by expert
cardiologists using Doppler measurements of valvular flow from
echocardiography. However, this limits the assessment of AS to hospitals
staffed with experts who can provide a comprehensive echocardiography
service. Because accurate Doppler acquisition requires significant
clinical training, we present a deep learning framework to determine the
feasibility of AS detection and severity classification based only on
two-dimensional echocardiographic data. We demonstrate that our proposed
spatio-temporal architecture effectively and efficiently combines both
anatomical features and motion of the aortic valve for AS severity
classification. Our model can process cardiac echo cine series of
varying length and can identify, without explicit supervision, the
frames that are most informative towards the AS diagnosis. We present an
empirical study on how the model learns phases of the heart cycle
without any supervision and frame-level annotations. Our architecture
outperforms state-of-the-art results on a private and a public dataset,
achieving 95.2% and 91.5% in AS detection, and 78.1% and 83.8% in AS
severity classification on the private and public datasets,
respectively. Notably, due to the lack of a large public video dataset
for AS, we made slight adjustments to our architecture for the public
dataset. Furthermore, our method addresses common problems in training
deep networks with clinical ultrasound data, such as a low
signal-to-noise ratio and frequently uninformative frames. Our source
code is available at: https://github.com/neda77aa/FTC.git
AU Li, Yiyue
Qian, Guangwu
Jiang, Xiaoshuang
Jiang, Zekun
Wen, Wen
Zhang, Shaoting
Li, Kang
Lao, Qicheng
Hierarchical-Instance Contrastive Learning for Minority Detection on
Imbalanced Medical Datasets
Deep learning methods are often hampered by issues such as data
imbalance and data hunger. In medical imaging, malignant or rare
diseases frequently form minority classes in the dataset and are
characterized by diverse distributions. In addition, insufficient labels
and unseen cases make training on the minority classes difficult. To
confront these problems, we propose a novel Hierarchical-instance
Contrastive Learning (HCLe) method for minority detection that involves
only data from the majority class in the training stage. To tackle
inconsistent intra-class distribution in majority classes, our method
introduces two branches, where the first branch employs an auto-encoder
network augmented with three constraint functions to effectively extract
image-level features, and the second branch designs a novel contrastive
learning network by taking into account the consistency of features
among hierarchical samples from majority classes. The proposed method is
further refined with a diverse mini-batch strategy, enabling the
identification of minority classes under multiple conditions. Extensive
experiments have been conducted to evaluate the proposed method on three
datasets of different diseases and modalities. The experimental results
show that the proposed method outperforms the state-of-the-art methods.
AU Mei, Lanzhuju
Fang, Yu
Zhao, Yue
Zhou, Xiang Sean
Zhu, Min
Cui, Zhiming
Shen, Dinggang
DTR-Net: Dual-Space 3D Tooth Model Reconstruction From Panoramic X-Ray
Images
In digital dentistry, cone-beam computed tomography (CBCT) can provide
complete 3D tooth models, yet it raises long-standing concerns about
excessive radiation dose and higher expense. Therefore, 3D tooth model
reconstruction from 2D panoramic X-ray images is more cost-effective and
has attracted great interest in clinical applications. In this paper, we
propose a novel dual-space framework, namely DTR-Net, to reconstruct 3D
tooth models from 2D panoramic X-ray images in both image and geometric
spaces. Specifically, in the image space, we apply a 2D-to-3D generative
model to recover the intensities of the CBCT image, guided by a
task-oriented tooth segmentation network in a collaborative training manner.
Meanwhile, in the geometric space, we benefit from an implicit function
network in the continuous space, which learns from points to capture
complicated tooth shapes with geometric properties. Experimental results
demonstrate that our proposed DTR-Net achieves state-of-the-art
performance both quantitatively and qualitatively in 3D tooth model
reconstruction, indicating its potential application in dental practice.
AU Noichl, Wolfgang
De Marco, Fabio
Willer, Konstantin
Urban, Theresa
Frank, Manuela
Schick, Rafael
Gleich, Bernhard
Hehn, Lorenz
Gustschin, Alex
Meyer, Pascal
Koehler, Thomas
Maack, Ingo
Engel, Klaus-Jurgen
Lundt, Bernd
Renger, Bernhard
Fingerle, Alexander
Pfeiffer, Daniela
Rummeny, Ernst
Herzen, Julia
Pfeiffer, Franz
Correction for Mechanical Inaccuracies in a Scanning Talbot-Lau
Interferometer
Grating-based X-ray phase-contrast and in particular dark-field
radiography are promising new imaging modalities for medical
applications. Currently, the potential advantage of dark-field imaging
in early-stage diagnosis of pulmonary diseases in humans is being
investigated. These studies make use of a comparatively large scanning
interferometer at short acquisition times, which comes at the expense of
a significantly reduced mechanical stability as compared to tabletop
laboratory setups. Vibrations create random fluctuations of the grating
alignment, causing artifacts in the resulting images. Here, we describe
a novel maximum likelihood method for estimating this motion, thereby
preventing these artifacts. It is tailored to scanning setups and does
not require any sample-free areas. Unlike any previously described
method, it accounts for motion in between as well as during exposures.
AU Wang, Hong
Xie, Qi
Zeng, Dong
Ma, Jianhua
Meng, Deyu
Zheng, Yefeng
OSCNet: Orientation-Shared Convolutional Network for CT Metal Artifact
Learning
X-ray computed tomography (CT) has been broadly adopted in clinical
applications for disease diagnosis and image-guided interventions.
However, metals within patients always cause unfavorable artifacts in
the recovered CT images. Albeit attaining promising reconstruction
results for this metal artifact reduction (MAR) task, most of the
existing deep-learning-based approaches have some limitations. The
critical issue is that most of these methods have not fully exploited
the important prior knowledge underlying this specific MAR task.
Therefore, in this paper, we carefully investigate the inherent
characteristics of metal artifacts which present rotationally
symmetrical streaking patterns. Then we specifically propose an
orientation-shared convolution representation mechanism to adapt to such
physical prior structures and utilize Fourier-series-expansion-based
filter parametrization for modelling artifacts, which can finely
separate metal artifacts from body tissues. By adopting the classical
proximal gradient algorithm to solve the model and then utilizing the
deep unfolding technique, we easily build the corresponding
orientation-shared convolutional network, termed OSCNet. Furthermore,
considering that different sizes and types of metals lead to
different artifact patterns (e.g., intensity of the artifacts), to
improve the flexibility of artifact learning and fully exploit
the reconstructed results at iterative stages for information
propagation, we design a simple-yet-effective sub-network for the
dynamic convolution representation of artifacts. By easily integrating
the sub-network into the proposed OSCNet framework, we further construct
a more flexible network structure, called OSCNet+, which improves the
generalization performance. Through extensive experiments conducted on
synthetic and clinical datasets, we comprehensively substantiate the
effectiveness of our proposed methods. Code will be released at
https://github.com/hongwang01/OSCNet.
AU Zhou, Lianyu
Yu, Lequan
Wang, Liansheng
RECIST-Induced Reliable Learning: Geometry-Driven Label Propagation for
Universal Lesion Segmentation
Automatic universal lesion segmentation (ULS) from Computed Tomography
(CT) images can ease the burden of radiologists and provide a more
accurate assessment than the current Response Evaluation Criteria In
Solid Tumors (RECIST) guideline measurement. However, this task is
underdeveloped due to the absence of large-scale pixel-wise labeled
data. This paper presents a weakly-supervised learning framework to
utilize the large-scale existing lesion databases in hospital Picture
Archiving and Communication Systems (PACS) for ULS. Unlike previous
methods that construct pseudo surrogate masks for fully supervised
training through shallow interactive segmentation techniques, we propose
to unearth the implicit information from RECIST annotations and thus
design a unified RECIST-induced reliable learning (RiRL) framework.
Particularly, we introduce a novel label generation procedure and an
on-the-fly soft label propagation strategy to avoid noisy training and
poor generalization problems. The former, named RECIST-induced geometric
labeling, uses clinical characteristics of RECIST to preliminarily and
reliably propagate the label. With the labeling process, a trimap
divides the lesion slices into three regions, including certain
foreground, background, and unclear regions, which consequently enables
a strong and reliable supervision signal on a wide region. A topological
knowledge-driven graph is built to conduct the on-the-fly label
propagation, which further optimizes the segmentation boundary.
Experimental results on a public benchmark dataset demonstrate that the
proposed method surpasses the SOTA RECIST-based ULS methods by a large
margin: by over 2.0%, 1.5%, 1.4%, and 1.6% Dice with ResNet101,
ResNet50, HRNet, and ResNest50 backbones, respectively.
AU He, Linchao
Du, Wenchao
Liao, Peixi
Fan, Fenglei
Chen, Hu
Yang, Hongyu
Zhang, Yi
Solving Zero-Shot Sparse-View CT Reconstruction With Variational Score
Solver.
Computed tomography (CT) stands as a ubiquitous medical diagnostic tool.
Nonetheless, the radiation-related concerns associated with CT scans
have raised public apprehensions. Mitigating radiation dosage in CT
imaging poses an inherent challenge as it inevitably compromises the
fidelity of CT reconstructions, impacting diagnostic accuracy. While
previous deep learning techniques have exhibited promise in enhancing CT
reconstruction quality, they remain hindered by the reliance on paired
data, which is arduous to procure. In this study, we present a novel
approach named Variational Score Solver (VSS) for solving sparse-view
reconstruction without paired data. Our approach entails the acquisition
of a probability distribution from densely sampled CT reconstructions,
employing a latent diffusion model. High-quality reconstruction outcomes
are achieved through an iterative process, wherein the diffusion model
serves as the prior term, subsequently integrated with the data
consistency term. Notably, rather than directly employing the prior
diffusion model, we distill prior knowledge by finding the fixed point
of the diffusion model. This framework empowers us to exercise precise
control over the process. Moreover, we depart from modeling the
reconstruction outcomes as deterministic values, opting instead for a
distribution-based approach. This enables us to achieve more accurate
reconstructions utilizing a trainable model. Our approach introduces a
fresh perspective to the realm of zero-shot CT reconstruction,
circumventing the constraints of supervised learning. Our extensive
qualitative and quantitative experiments unequivocally demonstrate that
VSS surpasses other contemporary unsupervised methods and achieves
results comparable to those of the most advanced supervised methods in
sparse-view reconstruction tasks. Code is available at
https://github.com/fpsandnoob/vss.
AU Chen, Yixin
Gao, Yajuan
Zhu, Lei
Shao, Wenrui
Lu, Yanye
Han, Hongbin
Xie, Zhaoheng
PCNet: Prior Category Network for CT Universal Segmentation Model
Accurate segmentation of anatomical structures in Computed Tomography
(CT) images is crucial for clinical diagnosis, treatment planning, and
disease monitoring. The present deep learning segmentation methods are
hindered by factors such as data scale and model size. Inspired by how
doctors identify tissues, we propose a novel approach, the Prior
Category Network (PCNet), that boosts segmentation performance by
leveraging prior knowledge between different categories of anatomical
structures. Our PCNet comprises three key components: prior category
prompt (PCP), hierarchy category system (HCS), and hierarchy category
loss (HCL). PCP utilizes Contrastive Language-Image Pretraining (CLIP),
along with attention modules, to systematically define the relationships
between anatomical categories as identified by clinicians. HCS guides
the segmentation model in distinguishing between specific organs,
anatomical structures, and functional systems through hierarchical
relationships. HCL serves as a consistency constraint, fortifying the
directional guidance provided by HCS to enhance the segmentation model's
accuracy and robustness. We conducted extensive experiments to validate
the effectiveness of our approach, and the results indicate that PCNet
can generate a high-performance, universal model for CT segmentation.
The PCNet framework also demonstrates significant transferability to
multiple downstream tasks. The ablation experiments show that the
methodology employed in constructing the HCS is of critical importance.
AU Lou, Wei
Wan, Xiang
Li, Guanbin
Lou, Xiaoying
Li, Chenghang
Gao, Feng
Li, Haofeng
Structure Embedded Nucleus Classification for Histopathology Images
Nuclei classification provides valuable information for histopathology
image analysis. However, the large variations in the appearance of
different nuclei types cause difficulties in identifying nuclei. Most
neural network based methods are affected by the local receptive field
of convolutions, and pay less attention to the spatial distribution of
nuclei or the irregular contour shape of a nucleus. In this paper, we
first propose a novel polygon-structure feature learning mechanism that
transforms a nucleus contour into a sequence of points sampled in order,
and employ a recurrent neural network that aggregates the sequential
change in distance between key points to obtain learnable shape
features. Next, we convert a histopathology image into a graph structure
with nuclei as nodes, and build a graph neural network to embed the
spatial distribution of nuclei into their representations. To capture
the correlations between the categories of nuclei and their surrounding
tissue patterns, we further introduce edge features that are defined as
the background textures between adjacent nuclei. Lastly, we integrate
both polygon and graph structure learning mechanisms into a whole
framework that can extract intra- and inter-nucleus structural
characteristics for nuclei classification. Experimental results show
that the proposed framework achieves significant improvements compared
to the previous methods.
AU Zheng, Yi
Conrad, Regan D.
Green, Emily J.
Burks, Eric J.
Betke, Margrit
Beane, Jennifer E.
Kolachalama, Vijaya B.
Graph Attention-Based Fusion of Pathology Images and Gene Expression for
Prediction of Cancer Survival
Multimodal machine learning models are being developed to analyze
pathology images and other modalities, such as gene expression, to gain
clinical and biological insights. However, most frameworks for
multimodal data fusion do not fully account for the interactions between
different modalities. Here, we present an attention-based fusion
architecture that integrates a graph representation of pathology images
with gene expression data and concomitantly learns from the fused
information to predict patient-specific survival. In our approach,
pathology images are represented as undirected graphs, and their
embeddings are combined with embeddings of gene expression signatures
using an attention mechanism to stratify tumors by patient survival. We
show that our framework improves the survival prediction of human
non-small cell lung cancers, outperforming existing state-of-the-art
approaches that leverage multimodal data. Our framework can facilitate
spatial molecular profiling to identify tumor heterogeneity using
pathology images and gene expression data, complementing results
obtained from more expensive spatial transcriptomic and proteomic
technologies.
AU Meng, Xiangxi
Sun, Kaicong
Xu, Jun
He, Xuming
Shen, Dinggang
Multi-Modal Modality-Masked Diffusion Network for Brain MRI Synthesis
With Random Modality Missing
Synthesis of unavailable imaging modalities from available ones can
generate modality-specific complementary information and enable
multi-modality-based medical image diagnosis or treatment. Existing
generative methods for medical image synthesis are usually based on
cross-modal translation between acquired and missing modalities. These
methods are usually dedicated to a specific missing modality and perform
synthesis in one shot, so they can neither deal flexibly with a varying
number of missing modalities nor construct the mapping across modalities
effectively. To address the above issues, in this paper, we propose a
unified Multi-modal Modality-masked Diffusion Network (M2DN), tackling
multi-modal synthesis from the perspective of "progressive
whole-modality inpainting", instead of "cross-modal translation".
Specifically, our M2DN considers the missing modalities as random noise
and takes all the modalities as a unity in each reverse diffusion step.
The proposed joint synthesis scheme performs synthesis for the missing
modalities and self-reconstruction for the available ones, which not
only enables synthesis for arbitrary missing scenarios, but also
facilitates the construction of common latent space and enhances the
model representation ability. Besides, we introduce a modality-mask
scheme to encode the availability status of each incoming modality
explicitly in a binary mask, which is adopted as a condition for the
diffusion model to further enhance the synthesis performance of our M2DN
for arbitrary missing scenarios. We carry out experiments on two public
brain MRI datasets for synthesis and downstream segmentation tasks.
Experimental results demonstrate that our M2DN outperforms the
state-of-the-art models significantly and shows great generalizability
for arbitrary missing modalities.
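The modality-mask idea described above can be illustrated with a small sketch: missing modalities are filled with random noise, and a binary mask records which modalities were actually acquired. This is only a schematic of the input preparation under stated assumptions, not the authors' implementation; the function name, the flat-list volumes, and the modality names (T1, T2, FLAIR) are hypothetical.

```python
import random


def build_masked_input(volumes, order, rng=None):
    """Return (inputs, mask) for a fixed modality order.

    volumes: dict mapping modality name -> flat list of floats, or None if missing.
    order:   list of modality names defining the channel order.
    Missing modalities are replaced by standard Gaussian noise, and
    mask[i] is 1 if order[i] was available, else 0.
    """
    rng = rng or random.Random(0)
    present = [volumes.get(name) for name in order if volumes.get(name) is not None]
    if not present:
        raise ValueError("at least one modality must be available")
    size = len(present[0])

    inputs, mask = [], []
    for name in order:
        vol = volumes.get(name)
        if vol is None:
            # Missing modality: start from noise, to be synthesized later.
            inputs.append([rng.gauss(0.0, 1.0) for _ in range(size)])
            mask.append(0)
        else:
            # Available modality: keep as is (it would be self-reconstructed).
            inputs.append(list(vol))
            mask.append(1)
    return inputs, mask


order = ["T1", "T2", "FLAIR"]  # assumed modality set
volumes = {"T1": [0.1, 0.2], "T2": None, "FLAIR": [0.3, 0.4]}
inputs, mask = build_masked_input(volumes, order)
print(mask)
```

In a diffusion model, the mask would be passed as a conditioning input at every reverse step; how that conditioning is wired into the network is not specified in the abstract and is left out here.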
AU Cai, De
Chen, Jie
Zhao, Junhan
Xue, Yuan
Yang, Sen
Yuan, Wei
Feng, Min
Weng, Haiyan
Liu, Shuguang
Peng, Yulong
Zhu, Junyou
Wang, Kanran
Jackson, Christopher
Tang, Hongping
Huang, Junzhou
Wang, Xiyue
HiCervix: An Extensive Hierarchical Dataset and Benchmark for Cervical
Cytology Classification.
Cervical cytology is a critical screening strategy for early detection
of pre-cancerous and cancerous cervical lesions. The challenge lies in
accurately classifying various cervical cytology cell types. Existing
automated cervical cytology methods are primarily trained on databases
covering a narrow range of coarse-grained cell types, which fail to
provide a comprehensive and detailed performance analysis that
accurately represents real-world cytopathology conditions. To overcome
these limitations, we introduce HiCervix, the most extensive,
multi-center cervical cytology dataset currently available to the
public. HiCervix includes 40,229 cervical cells from 4,496 whole slide
images, categorized into 29 annotated classes. These classes are
organized within a three-level hierarchical tree to capture fine-grained
subtype information. To exploit the semantic correlation inherent in
this hierarchical tree, we propose HierSwin, a hierarchical vision
transformer-based classification network. HierSwin serves as a benchmark
for detailed feature learning in both coarse-level and fine-level
cervical cancer classification tasks. In our comprehensive experiments,
HierSwin demonstrated remarkable performance, achieving 92.08% accuracy
for coarse-level classification and 82.93% accuracy averaged across all
three levels. When compared to board-certified cytopathologists,
HierSwin achieved high classification performance (0.8293 versus 0.7359
averaged accuracy), highlighting its potential for clinical
applications. This newly released HiCervix dataset, along with our
benchmark HierSwin method, is poised to make a substantial impact on the
advancement of deep learning algorithms for rapid cervical cancer
screening and greatly improve cancer prevention and patient outcomes in
real-world clinical settings.
AU Xu, Yanwu
Sun, Li
Peng, Wei
Jia, Shuyue
Morrison, Katelyn
Perer, Adam
Zandifar, Afrooz
Visweswaran, Shyam
Eslami, Motahhare
Batmanghelich, Kayhan
MedSyn: Text-guided Anatomy-aware Synthesis of High-Fidelity 3D CT
Images.
This paper introduces an innovative methodology for producing
high-quality 3D lung CT images guided by textual information. While
diffusion-based generative models are increasingly used in medical
imaging, current state-of-the-art approaches are limited to
low-resolution outputs and underutilize radiology reports' abundant
information. The radiology reports can enhance the generation process by
providing additional guidance and offering fine-grained control over the
synthesis of images. Nevertheless, expanding text-guided generation to
high-resolution 3D images poses significant memory and anatomical
detail-preserving challenges. Addressing the memory issue, we introduce
a hierarchical scheme that uses a modified UNet architecture. We start
by synthesizing low-resolution images conditioned on the text, serving
as a foundation for subsequent generators for complete volumetric data.
To ensure the anatomical plausibility of the generated samples, we
provide further guidance by generating vascular, airway, and lobular
segmentation masks in conjunction with the CT images. The model
demonstrates the capability to use textual input and segmentation tasks
to generate synthesized images. Algorithmic comparative assessments and
blind evaluations conducted by 10 board-certified radiologists indicate
that our approach exhibits superior performance compared to the most
advanced models based on GAN and diffusion techniques, especially in
accurately retaining crucial anatomical features such as fissure lines
and airways. This innovation introduces novel possibilities. This study
focuses on two main objectives: (1) the development of a method for
creating images based on textual prompts and anatomical components, and
(2) the capability to generate new images conditioned on anatomical
elements. The advancements in image generation can be applied to enhance
numerous downstream tasks.
AU Xu, Ziang
Rittscher, Jens
Ali, Sharib
SSL-CPCD: Self-supervised learning with composite pretext-class
discrimination for improved generalisability in endoscopic image
analysis.
SSL-CPCD:具有复合借口类别区分的自监督学习,可提高内窥镜图像分析的通用性。
Data-driven methods have shown tremendous progress in medical image
analysis. In this context, deep learning-based supervised methods are
widely popular. However, they require a large amount of training data
and face issues in generalisability to unseen datasets that hinder
clinical translation. Endoscopic imaging data is characterised by large
inter- and intra-patient variability, which makes it more challenging
for these models to learn representative features for downstream tasks. Thus,
despite the publicly available datasets and datasets that can be
generated within hospitals, most supervised models still underperform.
While self-supervised learning has addressed this problem to some extent
in natural scene data, there is a considerable performance gap in the
medical image domain. In this paper, we propose to explore patch-level
instance-group discrimination and penalisation of inter-class variation
using additive angular margin within the cosine similarity metrics. Our
novel approach enables models to learn to cluster similar
representations, thereby improving their ability to provide better
separation between different classes. Our results demonstrate
significant improvement on all metrics over the state-of-the-art (SOTA)
methods on the test set from the same and diverse datasets. We evaluated
our approach for classification, detection, and segmentation. SSL-CPCD
attains a notable top-1 accuracy of 79.77% in ulcerative colitis
classification, an 88.62% mean average precision (mAP) for detection,
and an 82.32% dice similarity coefficient for segmentation tasks. These
represent improvements of over 4%, 2%, and 3%, respectively, compared to
the baseline architectures. We demonstrate that our method generalises
better than all SOTA methods to unseen datasets, reporting over 7%
improvement.
数据驱动的方法在医学图像分析方面取得了巨大进步。在此背景下,基于深度学习的监督方法广泛流行。然而,它们需要大量的训练数据,并且面临着对看不见的数据集的通用性问题,从而阻碍了临床转化。内窥镜成像数据的特点是患者间和患者内差异较大,这使得这些模型在学习下游任务的代表性特征方面更具挑战性。因此,尽管有公开的数据集和可以在医院内部生成的数据集,但大多数监督模型仍然表现不佳。虽然自监督学习在自然场景数据中一定程度上解决了这个问题,但在医学图像领域还存在相当大的性能差距。在本文中,我们建议使用余弦相似性度量内的加性角度裕度来探索补丁级实例组歧视和类间变异的惩罚。我们的新颖方法使模型能够学习对相似的表示进行聚类,从而提高它们在不同类别之间提供更好分离的能力。我们的结果表明,在来自相同和不同数据集的测试集上,所有指标均比最先进的 (SOTA) 方法有了显着改进。我们评估了我们的分类、检测和分割方法。 SSL-CPCD 在溃疡性结肠炎分类中达到了 79.77% 的 Top 1 准确率,检测平均精度 (mAP) 为 88.62%,分割任务的骰子相似系数为 82.32%。与基准架构相比,这些改进分别超过 4%、2% 和 3%。 我们证明,我们的方法比所有 SOTA 方法对未见过的数据集具有更好的泛化能力,报告改进超过 7%。
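The additive angular margin mentioned in the abstract above can be illustrated with a minimal sketch. It assumes an ArcFace-style formulation in which the margin is added to the angle between two embeddings before the cosine is taken. The function names, the margin of 0.5 rad, and the scale of 30 are illustrative assumptions and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def margin_logit(u, v, same_group, margin=0.5, scale=30.0):
    """Scaled cosine logit with an additive angular margin (assumed form).

    For pairs from the same group the angle is enlarged by `margin`,
    so the model must pull them closer to reach the same logit.
    Pairs from different groups are left unchanged.
    """
    # Clamp to [-1, 1] to guard acos against rounding error.
    cos = max(-1.0, min(1.0, cosine_similarity(u, v)))
    theta = math.acos(cos)
    if same_group:
        # Cap at pi so the cosine stays monotonic (a simplification).
        theta = min(math.pi, theta + margin)
    return scale * math.cos(theta)
```

With identical embeddings, the same-group logit drops from `scale` to `scale * cos(margin)`, which is the penalty that encourages tighter clusters.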
AU Chen, Yuanyuan
Guo, Xiaoqing
Xia, Yong
Yuan, Yixuan
AU Chen、Yuanyuan Guo、Xiaoqing Xia、Yong Yuan、Yixuan
Disentangle Then Calibrate With Gradient Guidance: A Unified Framework
for Common and Rare Disease Diagnosis
用梯度引导解开然后校准:常见和罕见疾病诊断的统一框架
Computer-aided diagnosis (CAD) of rare diseases using medical
imaging poses a significant challenge because it requires large
volumes of labeled training data, which are particularly difficult to
collect for rare diseases. Although few-shot learning (FSL) methods have
been developed for this task, these methods focus solely on rare disease
diagnosis, failing to preserve the performance in common disease
diagnosis. To address this issue, we propose the Disentangle then
Calibrate with Gradient Guidance (DCGG) framework under the setting of
generalized few-shot learning, i.e., using one model to diagnose both
common and rare diseases. The DCGG framework consists of a network
backbone, a gradient-guided network disentanglement (GND) module, and a
gradient-induced feature calibration (GFC) module. The GND module
disentangles the network into a disease-shared component and a
disease-specific component based on gradient guidance, and devises
a separate optimization strategy for each component
when learning from rare diseases. The GFC module transfers only the
disease-shared channels of common-disease features to rare diseases, and
incorporates the optimal transport theory to identify the best transport
scheme based on the semantic relationship among different diseases.
Based on the best transport scheme, the GFC module calibrates the
distribution of rare-disease features at the disease-shared channels,
deriving more informative rare-disease features for better diagnosis.
The proposed DCGG framework has been evaluated on three public medical
image classification datasets. Our results suggest that the DCGG
framework achieves state-of-the-art performance in diagnosing both
common and rare diseases.
由于需要大量标记的训练数据,而对于罕见疾病来说,收集这些数据尤其困难,因此使用医学成像对罕见疾病进行计算机辅助诊断(CAD)提出了重大挑战。尽管已经为此任务开发了少样本学习(FSL)方法,但这些方法仅专注于罕见疾病诊断,未能保持常见疾病诊断的性能。为了解决这个问题,我们在广义少样本学习的背景下提出了“Disentangle then Calibrate with Gradient Guidance”(DCGG)框架,即使用一个模型来诊断常见疾病和罕见疾病。 DCGG框架由网络主干、梯度引导网络解缠(GND)模块和梯度诱导特征校准(GFC)模块组成。 GND 模块基于梯度引导将网络分解为疾病共享组件和疾病特定组件,并在学习罕见疾病时分别为这两个组件设计独立的优化策略。 GFC模块仅将常见疾病特征的疾病共享通道转移到罕见疾病,并结合最优传输理论,根据不同疾病之间的语义关系识别最佳传输方案。 GFC模块基于最佳传输方案,校准罕见疾病特征在疾病共享通道上的分布,得出更多信息丰富的罕见疾病特征,以更好地进行诊断。所提出的 DCGG 框架已在三个公共医学图像分类数据集上进行了评估。我们的结果表明,DCGG 框架在诊断常见疾病和罕见疾病方面均实现了最先进的性能。
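The abstract above refers to optimal transport for finding a transport scheme between diseases. As a hedged illustration only, the sketch below uses entropy-regularized optimal transport solved with Sinkhorn iterations, which is a common solver but is not confirmed to be the one used in the paper. The cost matrix (for example, a semantic distance between common and rare diseases), the marginals, and the parameter values are hypothetical.

```python
import math


def sinkhorn_plan(cost, a, b, eps=0.1, iters=200):
    """Entropy-regularized transport plan between marginals a and b.

    cost[i][j] is the (assumed) cost of moving mass from source i
    to target j; a and b are non-negative weights that each sum to 1.
    """
    n, m = len(a), len(b)
    # Gibbs kernel: low cost gives a large kernel value.
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
```

Each row of the returned plan says how much of a source class is matched to each target class; cheaper pairs receive more mass.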
AU Ma, Jiabo
Chen, Hao
AU Ma、Jiabo Chen、Hao
Efficient Supervised Pretraining of Swin-Transformer for Virtual
Staining of Microscopy Images
用于显微图像虚拟染色的 Swin-Transformer 的高效监督预训练
Fluorescence staining is an important technique in life science for
labeling cellular constituents. However, it is time-consuming and
makes simultaneous labeling difficult, among other drawbacks. Thus,
virtual staining, which does not rely on chemical labeling, has been
introduced. Recently, deep learning models such as transformers have
been applied to virtual staining tasks. However, their performance
relies on large-scale pretraining, hindering their development in the
field. To reduce the reliance on large amounts of computation and data,
we construct a Swin-transformer model and propose an efficient
supervised pretraining method based on the masked autoencoder (MAE).
Specifically, we adopt downsampling and grid sampling to mask 75% of
pixels and reduce the number of tokens. The pretraining time of our
method is only 1/16 of that of the original MAE. We also design a
supervised proxy task to predict stained images with multiple styles
instead of masked pixels. Additionally, most virtual staining approaches
are based on private datasets and evaluated by different metrics, making
a fair comparison difficult. Therefore, we develop a standard benchmark
based on three public datasets and build a baseline for the convenience
of future researchers. We conduct extensive experiments on three
benchmark datasets, and the experimental results show the proposed
method achieves the best performance both quantitatively and
qualitatively. In addition, ablation studies are conducted, and
experimental results illustrate the effectiveness of the proposed
pretraining method. The benchmark and code are available at
https://github.com/birkhoffkiki/CAS-Transformer.
荧光染色是生命科学中标记细胞成分的重要技术。然而,它也存在耗时、难以同时标记等问题。因此,不依赖化学标记的虚拟染色被引入。最近,变压器等深度学习模型已应用于虚拟染色任务。然而,它们的性能依赖于大规模的预训练,阻碍了它们在该领域的发展。为了减少对大量计算和数据的依赖,我们构建了 Swin-transformer 模型,并提出了一种基于掩码自动编码器(MAE)的有效监督预训练方法。具体来说,我们采用下采样和网格采样来屏蔽 75% 的像素并减少 token 的数量。与原始 MAE 相比,我们方法的预训练时间仅为 1/16。我们还设计了一个监督代理任务来预测具有多种样式而不是屏蔽像素的染色图像。此外,大多数虚拟染色方法都基于私有数据集并通过不同的指标进行评估,这使得公平比较变得困难。因此,我们基于三个公共数据集制定了标准基准,并建立了一个基线,以方便未来的研究人员。我们在三个基准数据集上进行了广泛的实验,实验结果表明所提出的方法在定量和定性上都达到了最佳性能。此外,还进行了消融研究,实验结果说明了所提出的预训练方法的有效性。基准测试和代码可在 https://github.com/birkhoffkiki/CAS-Transformer 获取。
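The 75% masking described in the abstract above can be pictured with a small sketch. It assumes one simple form of grid sampling, in which a single pixel is kept from every 2x2 cell; the paper's actual sampling pattern and its combination with downsampling may differ, and the function names are hypothetical.

```python
def grid_keep_mask(height, width, stride=2, offset=(0, 0)):
    """Boolean grid where True marks a pixel that is kept (not masked).

    One pixel is kept per stride x stride cell, at the given offset.
    """
    off_y, off_x = offset
    return [
        [(y % stride == off_y) and (x % stride == off_x) for x in range(width)]
        for y in range(height)
    ]


def masked_fraction(keep_mask):
    """Fraction of pixels that are masked out."""
    total = sum(len(row) for row in keep_mask)
    kept = sum(sum(row) for row in keep_mask)
    return 1.0 - kept / total
```

With `stride=2`, one pixel in four survives, so an image whose sides are multiples of 2 is masked at exactly 75% and the kept pixels form an image with a quarter of the original tokens.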
AU Meng, Qingjie
Bai, Wenjia
O'Regan, Declan P.
Rueckert, Daniel
AU Meng、Qingjie Bai、Wenjia O'Regan、Declan P. Rueckert、Daniel
DeepMesh: Mesh-Based Cardiac Motion Tracking Using Deep Learning
DeepMesh:使用深度学习进行基于网格的心脏运动跟踪
3D motion estimation from cine cardiac magnetic resonance (CMR) images
is important for the assessment of cardiac function and the diagnosis of
cardiovascular diseases. Current state-of-the-art methods focus on
estimating dense pixel-/voxel-wise motion fields in image space, which
ignores the fact that motion estimation is only relevant and useful
within the anatomical objects of interest, e.g., the heart. In this
work, we model the heart as a 3D mesh consisting of epi- and endocardial
surfaces. We propose a novel learning framework, DeepMesh, which
propagates a template heart mesh to a subject space and estimates the 3D
motion of the heart mesh from CMR images for individual subjects. In
DeepMesh, the heart mesh of the end-diastolic frame of an individual
subject is first reconstructed from the template mesh. Mesh-based 3D
motion fields with respect to the end-diastolic frame are then estimated
from 2D short- and long-axis CMR images. By developing a differentiable
mesh-to-image rasterizer, DeepMesh is able to leverage 2D shape
information from multiple anatomical views for 3D mesh reconstruction
and mesh motion estimation. The proposed method estimates vertex-wise
displacement and thus maintains vertex correspondences between time
frames, which is important for the quantitative assessment of cardiac
function across different subjects and populations. We evaluate DeepMesh
on CMR images acquired from the UK Biobank. We focus on 3D motion
estimation of the left ventricle in this work. Experimental results show
that the proposed method quantitatively and qualitatively outperforms
other image-based and mesh-based cardiac motion tracking methods.
电影心脏磁共振 (CMR) 图像的 3D 运动估计对于评估心脏功能和诊断心血管疾病非常重要。当前最先进的方法集中于估计图像空间中密集的像素/体素运动场,这忽略了运动估计仅在感兴趣的解剖对象(例如心脏)内相关和有用的事实。在这项工作中,我们将心脏建模为由心外膜和心内膜表面组成的 3D 网格。我们提出了一种新颖的学习框架 DeepMesh,它将模板心脏网格传播到主题空间,并根据各个主题的 CMR 图像估计心脏网格的 3D 运动。在 DeepMesh 中,首先根据模板网格重建个体受试者舒张末期帧的心脏网格。然后根据 2D 短轴和长轴 CMR 图像估计相对于舒张末期帧的基于网格的 3D 运动场。通过开发可微分的网格到图像光栅器,DeepMesh 能够利用来自多个解剖视图的 2D 形状信息进行 3D 网格重建和网格运动估计。所提出的方法估计顶点方向的位移,从而维持时间帧之间的顶点对应关系,这对于不同受试者和人群的心脏功能的定量评估非常重要。我们根据从英国生物银行获取的 CMR 图像评估 DeepMesh。在这项工作中,我们重点关注左心室的 3D 运动估计。实验结果表明,该方法在定量和定性上优于其他基于图像和基于网格的心脏运动跟踪方法。
AU Challoob, Mohsin
Gao, Yongsheng
Busch, Andrew
AU Challoob、Mohsin Gao、Yongsheng Busch、Andrew
Distinctive Phase Interdependency Model for Retinal Vasculature
Delineation in OCT-Angiography Images
OCT 血管造影图像中视网膜脉管系统描绘的独特相位相互依赖性模型
Automatic detection of retinal vasculature in optical coherence
tomography angiography (OCTA) images faces several challenges such as
the closely located capillaries, vessel discontinuity and high noise
level. This paper introduces a new distinctive phase interdependency
model to address these problems for delineating centerline patterns of
the vascular network. We capture the inherent property of vascular
centerlines by obtaining the inter-scale dependency information that
exists between neighboring symmetrical wavelets in the complex Poisson
domain. In particular, the proposed phase interdependency model
identifies vascular centerlines as the distinctive features that have
high magnitudes over adjacent symmetrical coefficients, whereas the
coefficients caused by background noise decay rapidly along
adjacent wavelet scales. The potential relationships between the
neighboring Poisson coefficients are established based on the coherency
of distinctive symmetrical wavelets. The proposed phase model is
assessed on the OCTA-500 database (300 OCTA images + 200 OCT images),
ROSE-1-SVC dataset (9 OCTA images), ROSE-1 (SVC+ DVC) dataset (9 OCTA
images), and ROSE-2 dataset (22 OCTA images). The experiments on the
clinically relevant OCTA images validate the effectiveness of the
proposed method in achieving high-quality results. Our method produces
average F-scores of 0.822, 0.782, and 0.779 on the ROSE-1-SVC, ROSE-1 (SVC+
DVC), and ROSE-2 datasets, respectively, and F-scores of 0.910 and
0.862 on OCTA_6mm and OCT_3mm datasets (OCTA-500 database),
respectively, demonstrating its superior performance over the
state-of-the-art benchmark methods.
光学相干断层扫描血管造影 (OCTA) 图像中视网膜脉管系统的自动检测面临着一些挑战,例如毛细血管位置紧密、血管不连续性和高噪声水平。本文介绍了一种新的独特的相位相互依赖性模型来解决这些问题,以描绘血管网络的中心线模式。我们通过获取复杂泊松域中相邻对称小波之间存在的尺度间依赖性信息来捕获血管中心线的固有属性。特别是,所提出的相位相互依赖性模型将血管中心线识别为在相邻对称系数上具有高幅度的独特特征,而由背景噪声引起的系数沿相邻小波尺度快速衰减。相邻泊松系数之间的潜在关系是基于独特对称小波的相干性建立的。所提出的相位模型在 OCTA-500 数据库(300 个 OCTA 图像 + 200 个 OCT 图像)、ROSE-1-SVC 数据集(9 个 OCTA 图像)、ROSE-1 (SVC+ DVC) 数据集(9 个 OCTA 图像)和 ROSE 上进行评估-2 数据集(22 个 OCTA 图像)。对临床相关 OCTA 图像的实验验证了所提出的方法在获得高质量结果方面的有效性。我们的方法在 ROSE-1-SVC、ROSE-1 (SVC+ DVC) 和 ROSE-2 数据集上产生的平均 F 分数分别为 0.822、0.782 和 0.779,在 OCTA_6mm 和 OCT_3mm 上产生的 F 分数分别为 0.910 和 0.862数据集(OCTA-500 数据库),分别证明了其优于最先进的基准方法的性能。
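For readers unfamiliar with the F-score reported in the abstract above, the following is a minimal sketch of the plain pixel-wise definition on binary masks. It is an assumption that the paper uses exactly this form; vessel and centerline evaluations sometimes allow a small spatial tolerance, which is not modeled here. The function name is hypothetical.

```python
def f_score(pred, truth):
    """Pixel-wise F-score (harmonic mean of precision and recall).

    pred and truth are equal-length sequences of 0/1 labels, for
    example flattened binary vessel masks.
    """
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    if tp == 0:
        # No correct detections: define the score as zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)
```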
AU Kurtz, Samuel
Wattrisse, Bertrand
Van Houten, Elijah E. W.
AU Kurtz、Samuel Wattrisse、Bertrand Van Houten、Elijah E. W.
Minimizing Measurement-Induced Errors in Viscoelastic MR Elastography
最大限度地减少粘弹性 MR 弹性成像中测量引起的误差
The inverse problem that underlies Magnetic Resonance Elastography (MRE)
is sensitive to the measurement data, and the quality of the results of
this tissue elasticity imaging process can be influenced both directly
and indirectly by measurement noise. In this work, we apply a coupled
adjoint field formulation of the viscoelastic constitutive parameter
identification problem, where the indirect influence of noise through
applied boundary conditions is avoided. A well-posed formulation of the
coupled field problem is obtained through conditions applied to the
adjoint field, relieving the computed displacement field from kinematic
errors on the boundary. The theoretical framework for this formulation
via a nearly incompressible, parallel subdomain-decomposition approach
is presented, along with verification and a detailed exploration of the
performance of the methods via a numerical simulation study. In
addition, the advantages of this novel approach are demonstrated in vivo
in the human brain, showing the ability of the method to obtain viable
tissue property maps in difficult configurations, enhancing the accuracy
of the method.
磁共振弹性成像 (MRE) 背后的反问题对测量数据很敏感,并且该组织弹性成像过程的结果的质量可能会直接或间接地受到测量噪声的影响。在这项工作中,我们应用了粘弹性本构参数识别问题的耦合伴随场公式,其中避免了噪声通过应用边界条件的间接影响。通过应用于伴随场的条件,获得了耦合场问题的适定公式,从而消除了计算的位移场的边界运动学误差。通过几乎不可压缩的并行子域分解方法提出了该公式的理论框架,并通过数值模拟研究对该方法的性能进行了验证和详细探索。此外,这种新颖方法的优点在人脑体内得到了证明,表明该方法能够在困难的配置中获得可行的组织特性图,从而提高了该方法的准确性。
AU Dan, Tingting
Kim, Minjeong
Kim, Won Hwa
Wu, Guorong
AU Dan、Tingting Kim、Minjeong Kim、Won Hwa Wu、Guorong
Developing Explainable Deep Model for Discovering Novel Control
Mechanism of Neuro-Dynamics
开发可解释的深度模型来发现神经动力学的新型控制机制
The human brain is a complex system composed of many components that
interact with each other. A well-designed computational model, usually
in the format of partial differential equations (PDEs), is vital for
understanding the working mechanisms that can explain dynamic and
self-organized behaviors. However, the model formulation and parameters
are often tuned empirically based on the predefined domain-specific
knowledge, which lags behind the emerging paradigm of discovering novel
mechanisms from the unprecedented amount of spatiotemporal data. To
address this limitation, we sought to link the power of deep neural
networks and physics principles of complex systems, which allows us to
design explainable deep models for uncovering the mechanistic role of
how the human brain (the most sophisticated complex system) maintains
controllable functions while interacting with external stimulations. In
the spirit of optimal control, we present a unified framework to design
an explainable deep model that describes the dynamic behaviors of
underlying neurobiological processes, allowing us to understand the
latent control mechanism at a system level. We have uncovered the
pathophysiological mechanism of Alzheimer's disease to the extent of
controllability of disease progression, where the dissected system-level
understanding enables higher prediction accuracy for disease progression
and better explainability for disease etiology than conventional (black
box) deep models.
人脑是一个复杂的系统,由许多相互作用的组件组成。精心设计的计算模型(通常采用偏微分方程 (PDE) 格式)对于理解解释动态和自组织行为的工作机制至关重要。然而,模型公式和参数通常根据预定义的特定领域知识进行经验调整,这落后于从前所未有的时空数据中发现新机制的新兴范式。为了解决这个限制,我们试图将深度神经网络的力量和复杂系统的物理原理联系起来,这使我们能够设计可解释的深度模型,以揭示人脑(最复杂的复杂系统)如何维持可控功能的机械作用,同时与外界刺激相互作用。本着最优控制的精神,我们提出了一个统一的框架来设计一个可解释的深度模型,该模型描述了底层神经生物学过程的动态行为,使我们能够在系统层面理解潜在的控制机制。我们在疾病进展可控的程度上揭示了阿尔茨海默病的病理生理机制,与传统(黑匣子)深度模型相比,剖析的系统级理解能够实现更高的疾病进展预测准确性和更好的疾病病因解释性。
AU Li, Jingxiong
Zheng, Sunyi
Shui, Zhongyi
Zhang, Shichuan
Yang, Linyi
Sun, Yuxuan
Zhang, Yunlong
Li, Honglin
Ye, Yuanxin
van Ooijen, Peter M. A.
Li, Kang
Yang, Lin
AU Li、Jingxiong Zheng、Sunyi Shui、Zhongyi Zhang、Shichuan Yang、Linyi Sun、Yuxuan Zhang、Yunlong Li、Honglin Ye、Yuanxin van Ooijen、Peter M. A. Li、Kang Yang、Lin
Masked Conditional Variational Autoencoders for Chromosome Straightening
用于染色体矫正的掩蔽条件变分自动编码器
Karyotyping is important for detecting chromosomal aberrations in
human disease. However, chromosomes easily appear curved in microscopic
images, which prevents cytogeneticists from analyzing chromosome types.
To address this issue, we propose a framework for chromosome
straightening, which comprises a preliminary processing algorithm and a
generative model called masked conditional variational autoencoders
(MC-VAE). The processing method utilizes patch rearrangement to address
the difficulty in erasing low degrees of curvature, providing reasonable
preliminary results for the MC-VAE. The MC-VAE further straightens the
results by leveraging chromosome patches conditioned on their curvatures
to learn the mapping between banding patterns and conditions. During
model training, we apply a masking strategy with a high masking ratio to
train the MC-VAE with eliminated redundancy. This yields a non-trivial
reconstruction task, allowing the model to effectively preserve
chromosome banding patterns and structure details in the reconstructed
results. Extensive experiments on three public datasets with two stain
styles show that our framework surpasses the performance of
state-of-the-art methods in retaining banding patterns and structure
details. Compared to using real-world bent chromosomes, the use of
high-quality straightened chromosomes generated by our proposed method
can improve the performance of various deep learning models for
chromosome classification by a large margin. Such a straightening
approach has the potential to be combined with other karyotyping systems
to assist cytogeneticists in chromosome analysis.
核型分析对于检测人类疾病中的染色体畸变非常重要。然而,染色体在显微图像中很容易出现弯曲,这阻碍了细胞遗传学家分析染色体类型。为了解决这个问题,我们提出了一个染色体拉直框架,其中包括初步处理算法和称为掩码条件变分自动编码器(MC-VAE)的生成模型。该处理方法利用块重排解决了擦除低曲率的困难,为MC-VAE提供了合理的初步结果。 MC-VAE 通过利用以曲率为条件的染色体补丁来学习条带模式和条件之间的映射,从而进一步矫正结果。在模型训练过程中,我们应用高掩蔽率的掩蔽策略来训练消除冗余的MC-VAE。这产生了一个不平凡的重建任务,使模型能够有效地保留重建结果中的染色体带型模式和结构细节。对具有两种染色风格的三个公共数据集进行的广泛实验表明,我们的框架在保留条带图案和结构细节方面超越了最先进的方法的性能。与使用现实世界的弯曲染色体相比,使用我们提出的方法生成的高质量直染色体可以大幅提高各种染色体分类深度学习模型的性能。这种拉直方法有可能与其他核型分析系统相结合,以协助细胞遗传学家进行染色体分析。
AU Kou, Zhengchang
Lowerison, Matthew R.
You, Qi
Wang, Yike
Song, Pengfei
Oelze, Michael L.
AU Kou、Zhengchang Lowerison、Matthew R. You、Qi Wang、Yike Song、Pengfei Oelze、Michael L.
High-Resolution Power Doppler Using Null Subtraction Imaging
使用零减成像的高分辨率功率多普勒
To improve the spatial resolution of power Doppler (PD) imaging, we
explored null subtraction imaging (NSI) as an alternative beamforming
technique to delay-and-sum (DAS). NSI is a nonlinear beamforming
approach that uses three different apodizations on receive and
incoherently sums the beamformed envelopes. NSI uses a null in the beam
pattern to improve the lateral resolution, which we apply here for
improving PD spatial resolution both with and without contrast
microbubbles. In this study, we used NSI with three types of singular
value decomposition (SVD)-based clutter filters and noise equalization
to generate high-resolution PD images. An element sensitivity correction
scheme was also proposed as a crucial component of NSI-based PD imaging.
First, a microbubble trace experiment was performed to evaluate the
resolution improvement of NSI-based PD over traditional DAS-based PD.
Then, both contrast-enhanced and contrast-free ultrasound PD images were
generated from the scan of a rat brain. The cross-sectional profiles of
the microbubble traces and microvessels were plotted. The full width at
half maximum (FWHM) was also
estimated to provide a quantitative metric. Furthermore, iso-frequency
curves were calculated to provide a resolution evaluation metric over
the global field of view. Up to six-fold resolution improvement was
demonstrated by the FWHM estimate and four-fold resolution improvement
was demonstrated by the iso-frequency curve from the NSI-based PD
microvessel images compared to microvessel images generated by
traditional DAS-based beamforming. A resolvability of 39 μm was
measured from the NSI-based PD microvessel image. The computational cost
of NSI-based PD was only 40 percent higher than that of DAS-based PD.
为了提高功率多普勒 (PD) 成像的空间分辨率,我们探索了零减成像 (NSI) 作为延迟求和 (DAS) 的替代波束形成技术。 NSI 是一种非线性波束形成方法,在接收时使用三种不同的变迹并对波束形成的包络进行非相干求和。 NSI 在波束图案中使用零点来提高横向分辨率,我们在这里应用它来提高有或没有对比微泡的 PD 空间分辨率。在本研究中,我们使用 NSI 以及三种基于奇异值分解 (SVD) 的杂波滤波器和噪声均衡来生成高分辨率 PD 图像。还提出了元件灵敏度校正方案作为基于 NSI 的 PD 成像的关键组成部分。首先,进行微泡痕量实验来评估基于 NSI 的 PD 相对于传统的基于 DAS 的 PD 的分辨率改进。然后,通过扫描大鼠大脑生成对比增强和无对比超声 PD 图像。绘制了微泡痕迹和微血管的横截面轮廓。 FWHM 还被估计为提供定量指标。此外,计算等频曲线以提供全局视场的分辨率评估指标。与传统的基于 DAS 的波束形成生成的微血管图像相比,FWHM 估计表明分辨率提高了六倍,基于 NSI 的 PD 微血管图像的等频曲线表明分辨率提高了四倍。从基于 NSI 的 PD 微血管图像中测得分辨率为 39 μm。基于 NSI 的 PD 的计算成本仅比基于 DAS 的 PD 增加了 40%。
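The FWHM metric used in the abstract above can be estimated from a sampled cross-sectional profile. The sketch below is one simple way to do it, assuming a single peak on a zero baseline and linear interpolation between samples; the function name and these assumptions are illustrative and do not describe the authors' exact procedure.

```python
def fwhm(xs, ys):
    """Full width at half maximum of a single-peaked sampled profile.

    xs are sample positions (for example in micrometres), ys are the
    non-negative intensities. The half-maximum crossings on each side
    of the peak are located by linear interpolation.
    """
    peak = max(ys)
    half = peak / 2.0
    i_peak = ys.index(peak)

    # Walk left from the peak to the first crossing of the half level.
    left = None
    for i in range(i_peak, 0, -1):
        if ys[i - 1] < half <= ys[i]:
            frac = (half - ys[i - 1]) / (ys[i] - ys[i - 1])
            left = xs[i - 1] + frac * (xs[i] - xs[i - 1])
            break

    # Walk right from the peak to the first crossing of the half level.
    right = None
    for i in range(i_peak, len(ys) - 1):
        if ys[i + 1] < half <= ys[i]:
            frac = (ys[i] - half) / (ys[i] - ys[i + 1])
            right = xs[i] + frac * (xs[i + 1] - xs[i])
            break

    if left is None or right is None:
        raise ValueError("profile does not fall below half maximum on both sides")
    return right - left
```

A narrower FWHM for the same vessel or microbubble trace indicates better spatial resolution, which is how the improvement over DAS is quantified.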
AU Guo, Shouchang
Fessler, Jeffrey A.
Noll, Douglas C.
AU Guo、Shouchang Fessler、Jeffrey A. Noll、Douglas C.
Manifold Regularizer for High-Resolution fMRI Joint Reconstruction and
Dynamic Quantification
用于高分辨率 fMRI 联合重建和动态量化的流形正则化器
Oscillating Steady-State Imaging (OSSI) is a recently developed fMRI
acquisition method that can provide 2 to 3 times higher SNR than
standard fMRI approaches. However, because the OSSI signal exhibits a
nonlinear oscillation pattern, one must acquire and combine $n_c$ (e.g.,
10) OSSI images to get an image that is free of oscillation for fMRI,
and fully sampled acquisitions would compromise temporal resolution. To
improve temporal resolution and accurately model the nonlinearity of
OSSI signals, instead of using subspace models that are not well suited
for the data, we build the MR physics for OSSI signal generation as a
regularizer for the undersampled reconstruction. Our proposed
physics-based manifold model turns the disadvantages of OSSI acquisition
into advantages and enables joint reconstruction and quantification.
The OSSI manifold model (OSSIMM) outperforms subspace models and
reconstructs high-resolution fMRI images with a factor of 12
acceleration and without spatial or temporal smoothing. Furthermore,
OSSIMM can dynamically quantify important physics parameters, including
$R_2^*$ maps, with a temporal resolution of 150 ms.
振荡稳态成像 (OSSI) 是一种最近开发的 fMRI 采集方法,其信噪比比标准 fMRI 方法高 2 至 3 倍。然而,由于OSSI信号表现出非线性振荡模式,因此必须采集并组合n(c)(例如,10)个OSSI图像以获得用于fMRI的无振荡图像,并且完全采样的采集会损害时间分辨率。为了提高时间分辨率并准确建模 OSSI 信号的非线性,我们没有使用不太适合数据的子空间模型,而是构建了用于 OSSI 信号生成的 MR 物理场,作为欠采样重建的正则器。我们提出的基于物理的流形模型将 OSSI 采集的缺点转化为优点,并实现联合重建和量化。 OSSI 流形模型 (OSSIMM) 的性能优于子空间模型,可以以 12 倍的加速度重建高分辨率 fMRI 图像,并且无需空间或时间平滑。此外,OSSIMM 可以动态量化重要的物理参数,包括 R-2* 图,时间分辨率为 150 ms。
AU Huang, Xingru
Huang, Jian
Zhao, Kai
Zhang, Tianyun
Li, Zhi
Yue, Changpeng
Chen, Wenhao
Wang, Ruihao
Chen, Xuanbin
Zhang, Qianni
Fu, Ying
Wang, Yangyundou
Guo, Yihao
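AU Huang、Xingru Huang、Jian Zhao、Kai Zhang、Tianyun Li、Zhi Yue、Changpeng Chen、Wenhao Wang、Ruihao Chen、Xuanbin Zhang、Qianni Fu、Ying Wang、Yangyundou Guo、Yihao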
SASAN: Spectrum-Axial Spatial Approach Networks for Medical Image
Segmentation
SASAN:用于医学图像分割的谱轴空间方法网络
Ophthalmic diseases such as central serous chorioretinopathy (CSC)
significantly impair the vision of millions of people globally. Precise
segmentation of choroid and macular edema is critical for diagnosing and
treating these conditions. However, existing 3D medical image
segmentation methods often fall short due to the heterogeneous nature
and blurry features of these conditions, compounded by medical image
clarity issues and noise interference arising from equipment and
environmental limitations. To address these challenges, we propose the
Spectrum Analysis Synergy Axial-Spatial Network (SASAN), an approach
that innovatively integrates spectrum features using the Fast Fourier
Transform (FFT). SASAN incorporates two key modules: the Frequency
Integrated Neural Enhancer (FINE), which mitigates noise interference,
and the Axial-Spatial Elementum Multiplier (ASEM), which enhances
feature extraction. Additionally, we introduce the Self-Adaptive
Multi-Aspect Loss ($\mathcal{L}_{\textit{SM}}$), which balances
image regions, distribution, and boundaries, adaptively updating weights
during training. We compiled and meticulously annotated the Choroid and
Macular Edema OCT Mega Dataset (CMED-18k), currently the world's largest
dataset of its kind. Comparative analysis against 13 baselines shows our
method surpasses these benchmarks, achieving the highest Dice scores and
lowest HD95 in the CMED and OIMHS datasets.
中心性浆液性脉络膜视网膜病变 (CSC) 等眼科疾病严重损害全球数百万人的视力。脉络膜和黄斑水肿的精确分割对于诊断和治疗这些疾病至关重要。然而,由于这些条件的异构性和模糊特征,再加上医学图像清晰度问题以及设备和环境限制产生的噪声干扰,现有的 3D 医学图像分割方法往往存在不足。为了应对这些挑战,我们提出了频谱分析协同轴向空间网络(SASAN),这是一种使用快速傅立叶变换(FFT)创新地集成频谱特征的方法。 SASAN 包含两个关键模块:用于减轻噪声干扰的频率集成神经增强器 (FINE) 和用于增强特征提取的轴向空间元素乘法器 (ASEM)。此外,我们引入了自适应多方面损失( $\mathcal {L}_{\textit {SM}}$ ),它平衡图像区域、分布和边界,在训练期间自适应更新权重。我们编译并精心注释了脉络膜和黄斑水肿 OCT 大数据集 (CMED-18k),这是目前世界上最大的同类数据集。对 13 个基线的比较分析表明,我们的方法超越了这些基准,在 CMED 和 OIMHS 数据集中实现了最高的 Dice 分数和最低的 HD95。
AU Zhou, Houliang
He, Lifang
Chen, Brian Y
Shen, Li
Zhang, Yu
AU Zhou、Houliang He、Lifang Chen、Brian Y Shen、Li Zhang、Yu
Multi-Modal Diagnosis of Alzheimer's Disease using Interpretable Graph
Convolutional Networks.
使用可解释的图卷积网络对阿尔茨海默病进行多模态诊断。
The interconnection between brain regions in neurological disease
encodes vital information for the advancement of biomarkers and
diagnostics. Although graph convolutional networks are widely applied
for discovering brain connection patterns that point to disease
conditions, the potential of connection patterns that arise from
multiple imaging modalities has yet to be fully realized. In this paper,
we propose a multi-modal sparse interpretable GCN framework (SGCN) for
the detection of Alzheimer's disease (AD) and its prodromal stage, known
as mild cognitive impairment (MCI). In our experimentation, SGCN learned
the sparse regional importance probability to find signature regions of
interest (ROIs), and the connective importance probability to reveal
disease-specific brain network connections. We evaluated SGCN on the
Alzheimer's Disease Neuroimaging Initiative database with multi-modal
brain images and demonstrated that the ROI features learned by SGCN were
effective for enhancing AD status identification. The identified
abnormalities were significantly correlated with AD-related clinical
symptoms. We further interpreted the identified brain dysfunctions at
the level of large-scale neural systems and sex-related connectivity
abnormalities in AD/MCI. The salient ROIs and the prominent brain
connectivity abnormalities interpreted by SGCN are of considerable
importance for developing novel biomarkers. These findings contribute to
a better understanding of the network-based disorder via multi-modal
diagnosis and offer the potential for precision diagnostics. The source
code is available at https://github.com/Houliang-Zhou/SGCN.
神经系统疾病中大脑区域之间的互连编码了生物标志物和诊断学进步的重要信息。尽管图卷积网络广泛应用于发现指向疾病状况的大脑连接模式,但多种成像模式产生的连接模式的潜力尚未完全实现。在本文中,我们提出了一种多模态稀疏可解释 GCN 框架(SGCN),用于检测阿尔茨海默病(AD)及其前驱阶段,即轻度认知障碍(MCI)。在我们的实验中,SGCN 学习了稀疏区域重要性概率来找到感兴趣的特征区域 (ROI),并学习了连接重要性概率来揭示疾病特定的大脑网络连接。我们使用多模态脑图像在阿尔茨海默病神经影像计划数据库上评估了 SGCN,并证明 SGCN 学习的 ROI 特征可有效增强 AD 状态识别。所发现的异常与 AD 相关的临床症状显着相关。我们进一步解释了 AD/MCI 中大规模神经系统水平上已发现的脑功能障碍和性别相关的连接异常。 SGCN 解释的显着 ROI 和显着的大脑连接异常对于开发新型生物标志物非常重要。这些发现有助于通过多模式诊断更好地理解基于网络的疾病,并提供精确诊断的潜力。源代码可在 https://github.com/Houliang-Zhou/SGCN 获取。
AU He, Yufang
Liu, Zeyu
Qi, Mingxin
Ding, Shengwei
Zhang, Peng
Song, Fan
Ma, Chenbin
Wu, Huijie
Cai, Ruxin
Feng, Youdan
Zhang, Haonan
Zhang, Tianyi
Zhang, Guanglei
AU He、Yufang Liu、Zeyu Qi、Mingxin Ding、Shengwei Zhang、Peng Song、Fan Ma、Chenbin Wu、Huijie Cai、Ruxin Feng、Youdan Zhang、Haonan Zhang、Tianyi Zhang、Guanglei
PST-Diff: Achieving High-consistency Stain Transfer by Diffusion Models
with Pathological and Structural Constraints.
PST-Diff:通过具有病理和结构约束的扩散模型实现高浓度污渍转移。
Histopathological examinations heavily rely on hematoxylin and eosin
(HE) and immunohistochemistry (IHC) staining. IHC staining can offer
more accurate diagnostic details but it brings significant financial and
time costs. Furthermore, either re-staining HE-stained slides or using
adjacent slides for IHC may compromise the accuracy of pathological
diagnosis due to information loss. To address these challenges, we
develop PST-Diff, a method for generating virtual IHC images from HE
images based on diffusion models, which allows pathologists to
simultaneously view multiple staining results from the same tissue
slide. To maintain the pathological consistency of the stain transfer,
we propose the asymmetric attention mechanism (AAM) and latent transfer
(LT) module in PST-Diff. Specifically, the AAM can retain more local
pathological information of the source domain images through the design
of asymmetric attention mechanisms, while ensuring the model's
flexibility in generating virtual stained images that closely conform to
the target domain. Subsequently, the LT module transfers the implicit
representations across different domains, effectively alleviating the
bias introduced by direct connection and further enhancing the
pathological consistency of PST-Diff. Furthermore, to maintain the
structural consistency of the stain transfer, the conditional frequency
guidance (CFG) module is proposed to precisely control image generation
and preserve structural details according to the frequency recovery
process. To conclude, the pathological and structural consistency
constraints make PST-Diff effective and give it superior
generalization in generating stable and functionally pathological IHC
images with the best evaluation scores. In general, PST-Diff offers
prospective applications in clinical virtual staining and pathological
image analysis.
组织病理学检查严重依赖苏木精和伊红 (HE) 以及免疫组织化学 (IHC) 染色。 IHC 染色可以提供更准确的诊断细节,但会带来巨大的财务和时间成本。此外,重新染色 HE 染色的载玻片或使用相邻载玻片进行 IHC 可能会因信息丢失而影响病理诊断的准确性。为了应对这些挑战,我们开发了 PST-Diff,这是一种基于扩散模型从 HE 图像生成虚拟 IHC 图像的方法,它允许病理学家同时查看同一组织载玻片的多个染色结果。为了保持染色转移的病理一致性,我们在 PST-Diff 中提出了不对称注意机制(AAM)和潜在转移(LT)模块。具体来说,AAM可以通过不对称注意机制的设计保留源域图像的更多局部病理信息,同时确保模型在生成与目标域高度一致的虚拟染色图像时的灵活性。随后,LT模块跨不同域传递隐式表示,有效减轻直接连接引入的偏差,进一步增强PST-Diff的病理一致性。此外,为了保持染色转移的结构一致性,提出了条件频率引导(CFG)模块,以根据频率恢复过程精确控制图像生成并保留结构细节。总之,病理和结构一致性约束为 PST-Diff 提供了有效性和卓越的泛化性,可生成具有最佳评估分数的稳定且功能性病理 IHC 图像。 总的来说,PST-Diff在临床虚拟染色和病理图像分析方面具有广阔的应用前景。
AU Wang, Hongqiu
Yang, Guang
Zhang, Shichen
Qin, Jing
Guo, Yike
Xu, Bo
Jin, Yueming
Zhu, Lei
AU Wang、Hongqiu Yang、Guang Zhang、Shichen Qin、Jing Guo、Yike Xu、Bo Jin、Yueming Zhu、Lei
Video-Instrument Synergistic Network for Referring Video Instrument
Segmentation in Robotic Surgery.
用于机器人手术中参考视频仪器分割的视频仪器协同网络。
Surgical instrument segmentation is fundamentally important for
facilitating cognitive intelligence in robot-assisted surgery. Although
existing methods have achieved accurate instrument segmentation results,
they simultaneously generate segmentation masks of all instruments
and thus lack the capability to specify a target object or allow an
interactive experience. This paper focuses on a novel and essential task
in robotic surgery, i.e., Referring Surgical Video Instrument
Segmentation (RSVIS), which aims to automatically identify and segment
the target surgical instruments from each video frame, referred to by a
given language expression. This interactive feature offers enhanced user
engagement and customized experiences, greatly benefiting the
development of the next generation of surgical education systems. To
achieve this, this paper constructs two surgery video datasets to
promote the RSVIS research. Then, we devise a novel Video-Instrument
Synergistic Network (VIS-Net) to learn both video-level and
instrument-level knowledge to boost performance, while previous work
only utilized video-level information. Meanwhile, we design a
Graph-based Relation-aware Module (GRM) to model the correlation between
multi-modal information (i.e., textual description and video frame) to
facilitate the extraction of instrument-level information. Extensive
experimental results on two RSVIS datasets show that VIS-Net can
significantly outperform existing state-of-the-art referring
segmentation methods. We will release our code and dataset for future
research (Git).
手术器械分割对于促进机器人辅助手术中的认知智能至关重要。尽管现有方法已经获得了准确的仪器分割结果,但它们同时生成所有仪器的分割掩模,缺乏指定目标对象和允许交互体验的能力。本文重点研究机器人手术中一项新颖且重要的任务,即参考手术视频器械分割(RSVIS),其目的是通过给定的语言表达来自动识别和分割每个视频帧中的目标手术器械。这种交互功能提供了增强的用户参与度和定制体验,极大有利于下一代外科教育系统的开发。为此,本文构建了两个手术视频数据集来促进 RSVIS 研究。然后,我们设计了一种新颖的视频仪器协同网络(VIS-Net)来学习视频级和仪器级知识以提高性能,而以前的工作仅利用视频级信息。同时,我们设计了基于图的关系感知模块(GRM)来对多模态信息(即文本描述和视频帧)之间的相关性进行建模,以方便提取仪器级信息。对两个 RSVIS 数据集的大量实验结果表明,VIS-Net 的性能可以显着优于现有的最先进的参考分割方法。我们将发布我们的代码和数据集以供未来研究(Git)。
AU Kreitner, Linus
Paetzold, Johannes C.
Rauch, Nikolaus
Chen, Chen
Hagag, Ahmed M.
Fayed, Alaa E.
Sivaprasad, Sobha
Rausch, Sebastian
Weichsel, Julian
Menze, Bjoern H.
Harders, Matthias
Knier, Benjamin
Rueckert, Daniel
Menten, Martin J.
Synthetic Optical Coherence Tomography Angiographs for Detailed Retinal
Vessel Segmentation Without Human Annotations
Optical coherence tomography angiography (OCTA) is a non-invasive
imaging modality that can acquire high-resolution volumes of the retinal
vasculature and aid the diagnosis of ocular, neurological and cardiac
diseases. Segmenting the visible blood vessels is a common first step
when extracting quantitative biomarkers from these images. Classical
segmentation algorithms based on thresholding are strongly affected by
image artifacts and limited signal-to-noise ratio. The use of modern,
deep learning-based segmentation methods has been inhibited by a lack of
large datasets with detailed annotations of the blood vessels. To
address this issue, recent work has employed transfer learning, where a
segmentation network is trained on synthetic OCTA images and is then
applied to real data. However, the previously proposed simulations fail
to faithfully model the retinal vasculature and do not provide effective
domain adaptation. Because of this, current methods are unable to fully
segment the retinal vasculature, in particular the smallest capillaries.
In this work, we present a lightweight simulation of the retinal
vascular network based on space colonization for faster and more
realistic OCTA synthesis. We then introduce three contrast adaptation
pipelines to decrease the domain gap between real and artificial images.
We demonstrate the superior segmentation performance of our approach in
extensive quantitative and qualitative experiments on three public
datasets that compare our method to traditional computer vision
algorithms and supervised training using human annotations. Finally, we
make our entire pipeline publicly available, including the source code,
pretrained models, and a large dataset of synthetic OCTA images.
AU Leynes, Andrew P.
Deveshwar, Nikhil
Nagarajan, Srikantan S.
Larson, Peder E. Z.
Scan-Specific Self-Supervised Bayesian Deep Non-Linear Inversion for
Undersampled MRI Reconstruction
Magnetic resonance imaging is subject to slow acquisition times due to
the inherent limitations in data sampling. Recently, supervised deep
learning has emerged as a promising technique for reconstructing
sub-sampled MRI. However, supervised deep learning requires a large
dataset of fully-sampled data. Although unsupervised or self-supervised
deep learning methods have emerged to address the limitations of
supervised deep learning approaches, they still require a database of
images. In contrast, scan-specific deep learning methods learn and
reconstruct using only the sub-sampled data from a single scan. Here, we
introduce Scan-Specific Self-Supervised Bayesian Deep Non-Linear
Inversion (DNLINV) that does not require an auto calibration scan
region. DNLINV utilizes a Deep Image Prior-type generative modeling
approach and relies on approximate Bayesian inference to regularize the
deep convolutional neural network. We demonstrate our approach on
several anatomies, contrasts, and sampling patterns and show improved
performance over existing approaches in scan-specific calibrationless
parallel imaging and compressed sensing.
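For orientation only, the undersampled reconstruction setting described above can be sketched in a few lines of Python. This is not the DNLINV method: it only simulates retrospective undersampling of k-space and the zero-filled inverse FFT that reconstruction methods aim to improve upon. The function names, the regular line-skipping mask without a calibration region, and the single-coil setting are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def regular_line_mask(shape, accel):
    """Keep every `accel`-th row of k-space (hypothetical sampling pattern).

    No fully sampled calibration block is kept, mirroring the
    calibrationless setting mentioned in the abstract.
    """
    mask = np.zeros(shape, dtype=float)
    mask[::accel, :] = 1.0
    return mask


def undersample_and_zero_fill(image, mask):
    """Simulate undersampling and return the zero-filled reconstruction.

    The mask is applied to the unshifted 2D FFT of the image; unsampled
    k-space locations are set to zero before the inverse transform.
    """
    kspace = np.fft.fft2(image)
    return np.fft.ifft2(kspace * mask)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8))
    mask = regular_line_mask(image.shape, accel=2)
    recon = undersample_and_zero_fill(image, mask)
    # The zero-filled result is aliased; its error is what a learned
    # or regularized reconstruction would try to reduce.
    print("mean absolute error:", np.abs(recon.real - image).mean())
```

With a mask of all ones the function returns the original image up to floating-point error, which is a quick sanity check before experimenting with other sampling patterns.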
AU Wang, Haiqiao
Ni, Dong
Wang, Yi
Recursive Deformable Pyramid Network for Unsupervised Medical Image
Registration
Complicated deformation problems are frequently encountered in medical
image registration tasks. Although various advanced registration models
have been proposed, accurate and efficient deformable registration
remains challenging, especially when handling large volumetric
deformations. To this end, we propose a novel recursive deformable
pyramid (RDP) network for unsupervised non-rigid registration. Our
network is a pure convolutional pyramid, which fully utilizes the
advantages of the pyramid structure itself, but does not rely on any
heavyweight attention modules or transformers. In particular, our network
leverages a step-by-step recursion strategy with the integration of
high-level semantics to predict the deformation field from coarse to
fine, while ensuring the rationality of the deformation field.
Meanwhile, due to the recursive pyramid strategy, our network can
effectively attain deformable registration without separate affine
pre-alignment. We compare the RDP network with several existing
registration methods on three public brain magnetic resonance imaging
(MRI) datasets, including LPBA, Mindboggle and IXI. Experimental results
demonstrate that our network consistently outperforms state-of-the-art methods with
respect to the metrics of Dice score, average symmetric surface
distance, Hausdorff distance, and Jacobian. Even for data without
affine pre-alignment, our network maintains satisfactory performance
in compensating for large deformations. The code is publicly
available at https://github.com/ZAX130/RDP.
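Two of the evaluation metrics named above can be illustrated with a minimal Python sketch. It assumes binary label masks for the Dice score and a 2D displacement field of shape (2, H, W) for the Jacobian check; the function names are hypothetical, and reporting the fraction of non-positive Jacobian determinants is a common convention in the registration literature rather than something the abstract specifies.

```python
import numpy as np


def dice_score(mask_a, mask_b):
    """Dice overlap 2|A and B| / (|A| + |B|) between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom


def nonpositive_jacobian_fraction(disp):
    """Fraction of pixels where the deformation folds (det J <= 0).

    `disp` has shape (2, H, W): disp[0] is the displacement along rows,
    disp[1] along columns. The deformation is identity + displacement,
    so its Jacobian is I + grad(disp).
    """
    disp = np.asarray(disp, dtype=float)
    dr_dr, dr_dc = np.gradient(disp[0])  # derivatives of the row displacement
    dc_dr, dc_dc = np.gradient(disp[1])  # derivatives of the column displacement
    det = (1.0 + dr_dr) * (1.0 + dc_dc) - dr_dc * dc_dr
    return float(np.mean(det <= 0))


if __name__ == "__main__":
    fixed = np.array([[1, 1, 0, 0]])
    warped = np.array([[1, 0, 0, 0]])
    print("Dice:", dice_score(fixed, warped))
    identity = np.zeros((2, 4, 4))
    print("folding fraction:", nonpositive_jacobian_fraction(identity))
```

A higher Dice score indicates better anatomical overlap after warping, while a lower folding fraction indicates a more plausible, closer to invertible deformation field.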
AU Bayasi, Nourhan
Hamarneh, Ghassan
Garbi, Rafeef
GC2: Generalizable Continual Classification of Medical Images.
Deep learning models have achieved remarkable success in medical image
classification. These models are typically trained once on the available
annotated images and thus lack the ability to continually learn new
tasks (i.e., new classes or data distributions) due to the problem of
catastrophic forgetting. Recently, there has been more interest in
designing continual learning methods to learn different tasks presented
sequentially over time while preserving previously acquired knowledge.
However, these methods focus mainly on preventing catastrophic
forgetting and are tested under a closed-world assumption; i.e.,
assuming the test data is drawn from the same distribution as the
training data. In this work, we advance the state-of-the-art in
continual learning by proposing GC2 for medical image classification,
which learns a sequence of tasks while simultaneously enhancing its
out-of-distribution robustness. To alleviate forgetting, GC2 employs a
gradual culpability-based network pruning to identify an optimal
subnetwork for each task. To improve generalization, GC2 incorporates
adversarial image augmentation and knowledge distillation approaches for
learning generalized and robust representations for each subnetwork. Our
extensive experiments on multiple benchmarks in a task-agnostic
inference setting demonstrate that GC2 significantly outperforms baselines and
other continual learning methods in reducing forgetting and enhancing
generalization. Our code is publicly available at the following link:
https://github.com/nourhanb/TMI2024-GC2.
AU Zhong, Yutian
Zhang, Shuangyang
Liu, Zhenyang
Zhang, Xiaoming
Mo, Zongxin
Zhang, Yizhe
Hu, Haoyu
Chen, Wufan
Qi, Li
Unsupervised Fusion of Misaligned PAT and MRI Images via Mutually
Reinforcing Cross-Modality Image Generation and Registration
Photoacoustic tomography (PAT) and magnetic resonance imaging (MRI) are
two advanced imaging techniques widely used in pre-clinical research.
PAT has high optical contrast and deep imaging range but poor soft
tissue contrast, whereas MRI provides excellent soft tissue information
but poor temporal resolution. Despite recent advances in medical image
fusion with pre-aligned multimodal data, PAT-MRI image fusion remains
challenging due to misaligned images and spatial distortion. To address
these issues, we propose an unsupervised multi-stage deep learning
framework called PAMRFuse for misaligned PAT and MRI image fusion.
PAMRFuse comprises a multimodal to unimodal registration network to
accurately align the input PAT-MRI image pairs and a self-attentive
fusion network that selects information-rich features for fusion. We
employ an end-to-end mutually reinforcing mode in our registration
network, which enables joint optimization of cross-modality image
generation and registration. To the best of our knowledge, this is the
first attempt at information fusion for misaligned PAT and MRI.
Qualitative and quantitative experimental results show the excellent
performance of our method in fusing PAT-MRI images of small animals
captured from commercial imaging systems.
AU Liu, Bingxue
Wang, Yongchao
Fomin-Thunemann, Natalie
Thunemann, Martin
Kilic, Kivilcim
Devor, Anna
Cheng, Xiaojun
Tan, Jiyong
Jiang, John
Boas, David A.
Tang, Jianbo
Time-Lagged Functional Ultrasound for Multi-Parametric Cerebral
Hemodynamic Imaging
We introduce an ultrasound speckle decorrelation-based time-lagged
functional ultrasound technique (tl-fUS) for the quantification of the
relative changes in cerebral blood flow speed (rCBF$_{\text{speed}}$),
cerebral blood volume (rCBV) and cerebral blood flow (rCBF) during
functional stimulations. Numerical simulations, phantom validations, and
in vivo mouse brain experiments were performed to test the capability of
tl-fUS to parse out and quantify the ratio change of these hemodynamic
parameters. The blood volume change was found to be more prominent in
arterioles than in venules, and the peak blood flow changes were
around 2.5 times the peak blood volume change during brain activation,
agreeing with previous observations in the literature. The tl-fUS technique
can distinguish the relative changes of rCBF$_{\text{speed}}$, rCBV,
and rCBF, which can inform specific physiological interpretations of the
fUS measurements.
AU Tan, Yubo
Shen, Wen-Da
Wu, Ming-Yuan
Liu, Gui-Na
Zhao, Shi-Xuan
Chen, Yang
Yang, Kai-Fu
Li, Yong-Jie
Retinal Layer Segmentation in OCT Images With Boundary Regression and
Feature Polarization
The geometry of retinal layers is an important imaging feature for the
diagnosis of some ophthalmic diseases. In recent years, retinal layer
segmentation methods for optical coherence tomography (OCT) images have
emerged in rapid succession, and substantial progress has been made. However,
challenges due to interference factors such as noise, blurring, fundus
effusion, and tissue artifacts remain in existing methods, primarily
manifesting as intra-layer false positives and inter-layer boundary
deviation. To solve these problems, we propose a method called Tightly
combined Cross-Convolution and Transformer with Boundary regression and
feature Polarization (TCCT-BP). This method uses a hybrid architecture
of CNN and lightweight Transformer to improve the perception of retinal
layers. In addition, a feature grouping and sampling method and the
corresponding polarization loss function are designed to maximize the
differentiation of the feature vectors of different retinal layers, and
a boundary regression loss function is devised to constrain the retinal
boundary distribution for a better fit to the ground truth. Extensive
experiments on four benchmark datasets demonstrate that the proposed
method achieves state-of-the-art performance in dealing with problems of
false positives and boundary distortion. The proposed method ranked
first in the OCT Layer Segmentation task of the GOALS challenge held at
MICCAI 2022. The source code is available at
https://www.github.com/tyb311/TCCT.
EF