
FN Clarivate Analytics Web of Science VR 1.0 AU Zhou, Yanfeng Li, Lingrui Wang, Chenlong Song, Le Yang, Ge

GobletNet: Wavelet-Based High-Frequency Fusion Network for Semantic Segmentation of Electron Microscopy Images.

Semantic segmentation of electron microscopy (EM) images is crucial for nanoscale analysis. With the development of deep neural networks (DNNs), semantic segmentation of EM images has achieved remarkable success. However, current EM image segmentation models are usually extensions or adaptations of natural or biomedical models. They lack the full exploration and utilization of the intrinsic characteristics of EM images. Furthermore, they are often designed only for several specific segmentation objects and lack versatility. In this study, we quantitatively analyze the characteristics of EM images compared with those of natural and other biomedical images via the wavelet transform. To better utilize these characteristics, we design a high-frequency (HF) fusion network, GobletNet, which outperforms state-of-the-art models by a large margin in the semantic segmentation of EM images. We use the wavelet transform to generate HF images as extra inputs and use an extra encoding branch to extract HF information. Furthermore, we introduce a fusion-attention module (FAM) into GobletNet to facilitate better absorption and fusion of information from raw images and HF images. Extensive benchmarking on seven public EM datasets (EPFL, CREMI, SNEMI3D, UroCell, MitoEM, Nanowire and BetaSeg) demonstrates the effectiveness of our model. The code is available at https://github.com/Yanfeng-Zhou/GobletNet.
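As an illustration of the HF-input idea described above, a minimal sketch (not GobletNet's exact pipeline; the Haar wavelet and the sub-band combination are assumptions) of deriving a high-frequency image with the 2D discrete wavelet transform:

```python
import numpy as np
import pywt

img = np.random.rand(256, 256)             # stand-in for a raw EM image
# One decomposition level: approximation plus three half-resolution detail sub-bands
cA, (cH, cV, cD) = pywt.dwt2(img, "haar")
hf = np.abs(cH) + np.abs(cV) + np.abs(cD)  # one simple combined HF image
# `hf` (edges and texture detail, prominent in EM images) would be fed to the
# extra encoding branch alongside the raw image.
```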

EI 1558-254X DA 2024-10-06 UT MEDLINE:39365717 PM 39365717 ER

AU Zhang, Runshi Mo, Hao Wang, Junchen Jie, Bimeng He, Yang Jin, Nenghao Zhu, Liang

UTSRMorph: A Unified Transformer and Superresolution Network for Unsupervised Medical Image Registration.

Complicated image registration is a key issue in medical image analysis, and deep learning-based methods have achieved better results than traditional methods. The methods include ConvNet-based and Transformer-based methods. Although ConvNets can effectively utilize local information to reduce redundancy via small neighborhood convolution, the limited receptive field results in the inability to capture global dependencies. Transformers can establish long-distance dependencies via a self-attention mechanism; however, the intense calculation of the relationships among all tokens leads to high redundancy. We propose a novel unsupervised image registration method named the unified Transformer and superresolution (UTSRMorph) network, which can enhance feature representation learning in the encoder and generate detailed displacement fields in the decoder to overcome these problems. We first propose a fusion attention block to integrate the advantages of ConvNets and Transformers, which inserts a ConvNet-based channel attention module into a multihead self-attention module. The overlapping attention block, a novel cross-attention method, uses overlapping windows to obtain abundant correlations with match information of a pair of images. Then, the blocks are flexibly stacked into a new powerful encoder. The decoder generation process of a high-resolution deformation displacement field from low-resolution features is considered as a superresolution process. Specifically, the superresolution module was employed to replace interpolation upsampling, which can overcome feature degradation. UTSRMorph was compared to state-of-the-art registration methods in the 3D brain MR (OASIS, IXI) and MR-CT datasets (abdomen, craniomaxillofacial). The qualitative and quantitative results indicate that UTSRMorph achieves relatively better performance. The code and datasets used are publicly available at https://github.com/Runshi-Zhang/UTSRMorph.
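As a rough illustration of replacing interpolation upsampling with a superresolution module, a hedged 2D sketch using a learned sub-pixel convolution (PixelShuffle); the channel counts and the 2D setting are illustrative assumptions:

```python
import torch.nn as nn

class SRUpsample(nn.Module):
    """Learned 2x upsampling: a convolution expands channels, then PixelShuffle
    rearranges them into space, avoiding the feature degradation of plain
    nn.functional.interpolate."""
    def __init__(self, in_ch, out_ch, scale=2):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch * scale ** 2, kernel_size=3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x):                    # x: (B, in_ch, H, W)
        return self.shuffle(self.conv(x))    # -> (B, out_ch, 2H, 2W)
```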

AU Siebert, Hanna Grossbrohmer, Christoph Hansen, Lasse Heinrich, Mattias P

ConvexAdam: Self-Configuring Dual-Optimisation-Based 3D Multitask Medical Image Registration.

Registration of medical image data requires methods that can align anatomical structures precisely while applying smooth and plausible transformations. Ideally, these methods should furthermore operate quickly and apply to a wide variety of tasks. Deep learning-based image registration methods usually entail an elaborate learning procedure with the need for extensive training data. However, they often struggle with versatility when aiming to apply the same approach across various anatomical regions and different imaging modalities. In this work, we present a method that extracts semantic or hand-crafted image features and uses a coupled convex optimisation followed by Adam-based instance optimisation for multitask medical image registration. We make use of pre-trained semantic feature extraction models for the individual datasets and combine them with our fast dual optimisation procedure for deformation field computation. Furthermore, we propose a very fast automatic hyperparameter selection procedure that explores many settings and ranks them on validation data to provide a self-configuring image registration framework. With our approach, we can align image data for various tasks with little learning. We conduct experiments on all available Learn2Reg challenge datasets and obtain results that place in the upper ranks of the challenge leaderboards. github.com/multimodallearning/convexAdam.
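A minimal 2D sketch of the Adam-based instance-optimisation step; the zero initialisation (standing in for the convex step's output) and the MSE-plus-smoothness objective are assumptions, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def instance_optimise(fixed, moving, iters=100, lam=0.1):
    """Refine a dense displacement field for one image pair with Adam.
    fixed, moving: (1, 1, H, W) tensors."""
    _, _, H, W = fixed.shape
    disp = torch.zeros(1, H, W, 2, requires_grad=True)   # (x, y) offsets
    opt = torch.optim.Adam([disp], lr=0.01)
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                            torch.linspace(-1, 1, W), indexing="ij")
    base = torch.stack([xs, ys], dim=-1).unsqueeze(0)    # identity sampling grid
    for _ in range(iters):
        opt.zero_grad()
        warped = F.grid_sample(moving, base + disp, align_corners=True)
        smooth = disp.diff(dim=1).abs().mean() + disp.diff(dim=2).abs().mean()
        loss = F.mse_loss(warped, fixed) + lam * smooth  # similarity + regularity
        loss.backward()
        opt.step()
    return disp.detach()
```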

AU Hu, Yan Wang, Jun Zhu, Hao Li, Juncheng Shi, Jun

Cost-Sensitive Weighted Contrastive Learning Based on Graph Convolutional Networks for Imbalanced Alzheimer's Disease Staging

Identifying the progression stages of Alzheimer's disease (AD) can be considered as an imbalanced multi-class classification problem in machine learning. It is challenging due to the class imbalance issue and the heterogeneity of the disease. Recently, graph convolutional networks (GCNs) have been successfully applied in AD classification. However, these works did not handle the class imbalance issue in classification. Besides, they ignore the heterogeneity of the disease. To this end, we propose a novel cost-sensitive weighted contrastive learning method based on graph convolutional networks (CSWCL-GCN) for imbalanced AD staging using resting-state functional magnetic resonance imaging (rs-fMRI). The proposed method is developed on a multi-view graph constructed by the functional connectivity (FC) and high-order functional connectivity (HOFC) features of the subjects. A novel cost-sensitive weighted contrastive learning procedure is proposed to capture discriminative information from the minority classes, encouraging the samples in the minority class to provide adequate supervision. Considering the heterogeneity of the disease, the weights of the negative pairs are introduced into contrastive learning and they are computed based on the distance to class prototypes, which are automatically learned from the training data. Meanwhile, the cost-sensitive mechanism is further introduced into contrastive learning to handle the class imbalance issue. The proposed CSWCL-GCN is evaluated on 720 subjects (including 184 NCs, 40 SMC patients, 208 EMCI patients, 172 LMCI patients and 116 AD patients) from the ADNI (Alzheimer's Disease Neuroimaging Initiative). Experimental results show that the proposed CSWCL-GCN outperforms state-of-the-art methods on the ADNI database.
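A hedged sketch of what a cost-sensitive weighted contrastive loss of this kind might look like; the exponential distance-based weighting of negative pairs and the per-class cost vector are assumptions, not the paper's formulation:

```python
import torch

def cswcl_loss(z, labels, prototypes, class_cost, tau=0.5):
    """z: (N, D) L2-normalised embeddings; labels: (N,) long;
    prototypes: (C, D) learned class prototypes;
    class_cost: (C,) weights, larger for minority classes."""
    N = z.size(0)
    sim = z @ z.t() / tau
    eye = torch.eye(N, dtype=torch.bool, device=z.device)
    pos = (labels[None, :] == labels[:, None]) & ~eye
    neg = ~pos & ~eye
    # Weight negative j for anchor i by closeness of z_j to anchor i's prototype
    d = torch.cdist(z, prototypes)            # (N, C) sample-to-prototype distances
    w_neg = torch.exp(-d[:, labels].t())      # entry (i, j) = exp(-dist(z_j, proto_{y_i}))
    exp_sim = torch.exp(sim)
    neg_sum = (exp_sim * w_neg * neg).sum(1, keepdim=True)
    log_prob = sim - torch.log(exp_sim + neg_sum + 1e-12)
    loss_i = -(log_prob * pos).sum(1) / pos.sum(1).clamp(min=1)
    return (class_cost[labels] * loss_i).mean()   # cost-sensitive anchor weighting
```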

AU Qiu, Zifeng Yang, Peng Xiao, Chunlun Wang, Shuqiang Xiao, Xiaohua Qin, Jing Liu, Chuan-Ming Wang, Tianfu Lei, Baiying

3D Multimodal Fusion Network With Disease-Induced Joint Learning for Early Alzheimer's Disease Diagnosis

Multimodal neuroimaging provides complementary information critical for accurate early diagnosis of Alzheimer's disease (AD). However, the inherent variability between multimodal neuroimages hinders the effective fusion of multimodal features. Moreover, achieving reliable and interpretable diagnoses in the field of multimodal fusion remains challenging. To address them, we propose a novel multimodal diagnosis network based on multi-fusion and disease-induced learning (MDL-Net) to enhance early AD diagnosis by efficiently fusing multimodal data. Specifically, MDL-Net proposes a multi-fusion joint learning (MJL) module, which effectively fuses multimodal features and enhances the feature representation from global, local, and latent learning perspectives. MJL consists of three modules, global-aware learning (GAL), local-aware learning (LAL), and outer latent-space learning (LSL) modules. GAL via a self-adaptive Transformer (SAT) learns the global relationships among the modalities. LAL constructs local-aware convolution to learn the local associations. LSL module introduces latent information through outer product operation to further enhance feature representation. MDL-Net integrates the disease-induced region-aware learning (DRL) module via gradient weight to enhance interpretability, which iteratively learns weight matrices to identify AD-related brain regions. We conduct the extensive experiments on public datasets and the results confirm the superiority of our proposed method. Our code will be available at: https://github.com/qzf0320/MDL-Net.
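A tiny sketch of the outer-product idea behind the LSL module: the outer product of two modality embeddings exposes all pairwise feature interactions that plain concatenation would miss (shapes are illustrative):

```python
import torch

a = torch.randn(4, 32)                    # modality-1 embedding (batch, dim)
b = torch.randn(4, 32)                    # modality-2 embedding (batch, dim)
inter = torch.einsum("bi,bj->bij", a, b)  # (batch, 32, 32) interaction tensor
fused = inter.flatten(1)                  # flattened latent features for downstream layers
```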

AU Lu, Ziru Zhang, Yizhe Zhou, Yi Wu, Ye Zhou, Tao

Domain-interactive Contrastive Learning and Prototype-guided Self-training for Cross-domain Polyp Segmentation.

Accurate polyp segmentation from colonoscopy images plays a critical role in the diagnosis and treatment of colorectal cancer. While deep learning-based polyp segmentation models have made significant progress, they often suffer from performance degradation when applied to unseen target domain datasets collected from different imaging devices. To address this challenge, unsupervised domain adaptation (UDA) methods have gained attention by leveraging labeled source data and unlabeled target data to reduce the domain gap. However, existing UDA methods primarily focus on capturing class-wise representations, neglecting domain-wise representations. Additionally, uncertainty in pseudo labels could hinder the segmentation performance. To tackle these issues, we propose a novel Domain-interactive Contrastive Learning and Prototype-guided Self-training (DCL-PS) framework for cross-domain polyp segmentation. Specifically, domain-interactive contrastive learning (DCL) with a domain-mixed prototype updating strategy is proposed to discriminate class-wise feature representations across domains. Then, to enhance the feature extraction ability of the encoder, we present a contrastive learning-based cross-consistency training (CL-CCT) strategy, which is imposed on both the prototypes obtained by the outputs of the main decoder and perturbed auxiliary outputs. Furthermore, we propose a prototype-guided self-training (PS) strategy, which dynamically assigns a weight for each pixel during self-training, filtering out unreliable pixels and improving the quality of pseudo-labels. Experimental results demonstrate the superiority of DCL-PS in improving polyp segmentation performance in the target domain. The code will be released at https://github.com/taozh2017/DCLPS.
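A hedged sketch of prototype-guided pixel weighting in the spirit of the PS strategy (the softmax-over-distance weighting is an assumption): pixels whose features sit far from the prototype of their pseudo-class are down-weighted during self-training.

```python
import torch

def pixel_weights(feat, pseudo, prototypes, tau=0.1):
    """feat: (B, D, H, W) features; pseudo: (B, H, W) long pseudo-labels;
    prototypes: (C, D) class prototypes. Returns (B, H, W) reliability weights."""
    B, D, H, W = feat.shape
    f = feat.permute(0, 2, 3, 1).reshape(-1, D)      # (B*H*W, D)
    d = torch.cdist(f, prototypes)                   # distances to each prototype
    p = torch.softmax(-d / tau, dim=1)               # prototype affinity per pixel
    w = p.gather(1, pseudo.reshape(-1, 1)).reshape(B, H, W)
    return w   # low weight = pixel far from its pseudo-class prototype
```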

AU Tian, Xiang Ye, Jian'an Zhang, Tao Zhang, Liangliang Liu, Xuechao Fu, Feng Shi, Xuetao Xu, Canhua

Multi-Path Fusion in SFCF-Net for Enhanced Multi-Frequency Electrical Impedance Tomography

Multi-frequency electrical impedance tomography (mfEIT) offers a nondestructive imaging technology that reconstructs the distribution of electrical characteristics within a subject based on the impedance spectral differences among biological tissues. However, the technology faces challenges in imaging multi-class lesion targets when the conductivity of background tissues is frequency-dependent. To address these issues, we propose a spatial-frequency cross-fusion network (SFCF-Net) imaging algorithm, built on a multi-path fusion structure. This algorithm uses multi-path structures and hyper-dense connections to capture both spatial and frequency correlations between multi-frequency conductivity images, which achieves differential imaging for lesion targets of multiple categories through cross-fusion of information. According to both simulation and physical experiment results, the proposed SFCF-Net algorithm shows an excellent performance in terms of lesion imaging and category discrimination compared to the weighted frequency-difference, U-Net, and MMV-Net algorithms. The proposed algorithm enhances the ability of mfEIT to simultaneously obtain both structural and spectral information from the tissue being examined and improves the accuracy and reliability of mfEIT, opening new avenues for its application in clinical diagnostics and treatment monitoring.

AU Wen, Chi Ye, Mang Li, He Chen, Ting Xiao, Xuan

Concept-based Lesion Aware Transformer for Interpretable Retinal Disease Diagnosis.

Existing deep learning methods have achieved remarkable results in diagnosing retinal diseases, showcasing the potential of advanced AI in ophthalmology. However, the black-box nature of these methods obscures the decision-making process, compromising their trustworthiness and acceptability. Inspired by the concept-based approaches and recognizing the intrinsic correlation between retinal lesions and diseases, we regard retinal lesions as concepts and propose an inherently interpretable framework designed to enhance both the performance and explainability of diagnostic models. Leveraging the transformer architecture, known for its proficiency in capturing long-range dependencies, our model can effectively identify lesion features. By integrating with image-level annotations, it achieves the alignment of lesion concepts with human cognition under the guidance of a retinal foundation model. Furthermore, to attain interpretability without losing lesion-specific information, our method employs a classifier built on a cross-attention mechanism for disease diagnosis and explanation, where explanations are grounded in the contributions of human-understandable lesion concepts and their visual localization. Notably, due to the structure and inherent interpretability of our model, clinicians can implement concept-level interventions to correct the diagnostic errors by simply adjusting erroneous lesion predictions. Experiments conducted on four fundus image datasets demonstrate that our method achieves favorable performance against state-of-the-art methods while providing faithful explanations and enabling concept-level interventions. Our code is publicly available at https://github.com/Sorades/CLAT.
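A hedged sketch (class and variable names hypothetical) of a cross-attention concept classifier of this kind: learnable lesion-concept queries attend over image tokens, each concept's activation is scored, and the diagnosis is a linear readout of those scores, so both the scores and the attention maps are inspectable. A concept-level intervention then amounts to editing an entry of `scores` before the final linear layer.

```python
import torch
import torch.nn as nn

class ConceptCrossAttnHead(nn.Module):
    def __init__(self, dim, n_concepts, n_classes, n_heads=4):
        super().__init__()
        self.concepts = nn.Parameter(torch.randn(n_concepts, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.cls = nn.Linear(n_concepts, n_classes)   # diagnosis from concept scores

    def forward(self, tokens):                        # tokens: (B, N, dim) image features
        q = self.concepts.unsqueeze(0).expand(tokens.size(0), -1, -1)
        attended, attn_maps = self.attn(q, tokens, tokens)  # concept-to-patch attention
        scores = (attended * self.concepts).sum(-1)   # (B, n_concepts) activations
        return self.cls(scores), scores, attn_maps    # logits + concept explanations
```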

AU Zhang, Ke Yang, Yan Yu, Jun Fan, Jianping Jiang, Hanliang Huang, Qingming Han, Weidong

Attribute Prototype-guided Iterative Scene Graph for Explainable Radiology Report Generation.

The potential of automated radiology report generation in alleviating the time-consuming tasks of radiologists is increasingly being recognized in medical practice. Existing report generation methods have evolved from using image-level features to the latest approach of utilizing anatomical regions, significantly enhancing interpretability. However, directly and simplistically using region features for report generation compromises the capability of relation reasoning and overlooks the common attributes potentially shared across regions. To address these limitations, we propose a novel region-based Attribute Prototype-guided Iterative Scene Graph generation framework (AP-ISG) for report generation, utilizing scene graph generation as an auxiliary task to further enhance interpretability and relational reasoning capability. The core components of AP-ISG are the Iterative Scene Graph Generation (ISGG) module and the Attribute Prototype-guided Learning (APL) module. Specifically, ISGG employs an autoregressive scheme for structural edge reasoning and a contextualization mechanism for relational reasoning. APL enhances intra-prototype matching and reduces inter-prototype semantic overlap in the visual space to fully model the potential attribute commonalities among regions. Extensive experiments on the MIMIC-CXR with Chest ImaGenome datasets demonstrate the superiority of AP-ISG across multiple metrics.

AU Huang, Zhili Sun, Jingyi Shao, Yifan Wang, Zixuan Wang, Su Li, Qiyong Li, Jinsong Yu, Qian

PolarFormer: A Transformer-based Method for Multi-lesion Segmentation in Intravascular OCT.

Several deep learning-based methods have been proposed to extract vulnerable plaques of a single class from intravascular optical coherence tomography (OCT) images. However, further research is limited by the lack of publicly available large-scale intravascular OCT datasets with multi-class vulnerable plaque annotations. Additionally, multi-class vulnerable plaque segmentation is extremely challenging due to the irregular distribution of plaques, their unique geometric shapes, and fuzzy boundaries. Existing methods have not adequately addressed the geometric features and spatial prior information of vulnerable plaques. To address these issues, we collected a dataset containing 70 pullback data and developed a multi-class vulnerable plaque segmentation model, called PolarFormer, that incorporates the prior knowledge of vulnerable plaques in spatial distribution. The key module of our proposed model is Polar Attention, which models the spatial relationship of vulnerable plaques in the radial direction. Extensive experiments conducted on the new dataset demonstrate that our proposed method outperforms other baseline methods. Code and data can be accessed via this link: https://github.com/sunjingyi0415/IVOCT-segementaion.
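A small sketch of the radial prior that Polar Attention exploits (the OpenCV resampling shown here is an illustration, not the paper's module): in polar coordinates the radial distribution of plaques becomes axis-aligned, so attention along one axis corresponds to attention in the radial direction.

```python
import cv2
import numpy as np

frame = (np.random.rand(512, 512) * 255).astype(np.uint8)  # stand-in IVOCT frame
h, w = frame.shape
polar = cv2.warpPolar(frame, (w, h), (w / 2, h / 2),
                      maxRadius=min(h, w) / 2, flags=cv2.WARP_POLAR_LINEAR)
# Rows of `polar` index angle and columns index radius, so attention along the
# column axis models spatial relationships in the radial direction.
```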

AU Yang, Yanwu Ye, Chenfei Su, Guinan Zhang, Ziyao Chang, Zhikai Chen, Hairui Chan, Piu Yu, Yue Ma, Ting

BrainMass: Advancing Brain Network Analysis for Diagnosis with Large-scale Self-Supervised Learning.

Foundation models pretrained on large-scale datasets via self-supervised learning demonstrate exceptional versatility across various tasks. Due to the heterogeneity of medical data and the difficulty of collecting it, this approach is especially beneficial for medical image analysis and neuroscience research, as it streamlines broad downstream tasks without the need for numerous costly annotations. However, there has been limited investigation into brain network foundation models, limiting their adaptability and generalizability for broad neuroscience studies. In this study, we aim to bridge this gap. In particular, (1) we curated a comprehensive dataset by collating images from 30 datasets, which comprises 70,781 samples of 46,686 participants. Moreover, we introduce pseudo-functional connectivity (pFC) to further generate millions of augmented brain networks by randomly dropping certain timepoints of the BOLD signal. (2) We propose the BrainMass framework for brain network self-supervised learning via mask modeling and feature alignment. BrainMass employs Mask-ROI Modeling (MRM) to bolster intra-network dependencies and regional specificity. Furthermore, the Latent Representation Alignment (LRA) module is utilized to regularize augmented brain networks of the same participant with similar topological properties to yield similar latent representations by aligning their latent embeddings. Extensive experiments on eight internal tasks and seven external brain disorder diagnosis tasks show BrainMass's superior performance, highlighting its significant generalizability and adaptability. Moreover, BrainMass demonstrates powerful few/zero-shot learning abilities and exhibits meaningful interpretation to various diseases, showcasing its potential use for clinical applications.
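A minimal sketch of the pFC augmentation as described (the keep ratio and the RNG seed are assumptions): randomly drop BOLD timepoints, then recompute the region-by-region correlation matrix.

```python
import numpy as np

def pseudo_fc(bold, keep=0.9, rng=np.random.default_rng(0)):
    """bold: (T, R) BOLD time series over R regions.
    Returns one augmented (R, R) functional-connectivity matrix."""
    T = bold.shape[0]
    idx = np.sort(rng.choice(T, size=int(keep * T), replace=False))  # drop timepoints
    return np.corrcoef(bold[idx].T)   # pseudo-functional connectivity
```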

AU Jang, Se-In Pan, Tinsu Li, Ye Heidari, Pedram Chen, Junyu Li, Quanzheng Gong, Kuang

Spach Transformer: Spatial and Channel-Wise Transformer Based on Local and Global Self-Attentions for PET Image Denoising

Positron emission tomography (PET) is widely used in clinics and research due to its quantitative merits and high sensitivity, but suffers from low signal-to-noise ratio (SNR). Recently convolutional neural networks (CNNs) have been widely used to improve PET image quality. Though successful and efficient in local feature extraction, CNN cannot capture long-range dependencies well due to its limited receptive field. Global multi-head self-attention (MSA) is a popular approach to capture long-range information. However, the calculation of global MSA for 3D images has high computational costs. In this work, we proposed an efficient spatial and channel-wise encoder-decoder transformer, Spach Transformer, that can leverage spatial and channel information based on local and global MSAs. Experiments based on datasets of different PET tracers, i.e., F-18-FDG, F-18-ACBC, F-18-DCFPyL, and Ga-68-DOTATATE, were conducted to evaluate the proposed framework. Quantitative results show that the proposed Spach Transformer framework outperforms state-of-the-art deep learning architectures.
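A hedged toy contrast between spatial and channel-wise MSA (token counts illustrative, not the paper's configuration): spatial attention runs across positions, while transposing the token sequence makes the same operator attend across channels.

```python
import torch
import torch.nn as nn

x = torch.randn(2, 512, 96)                 # (batch, spatial tokens, channels)
spatial_msa = nn.MultiheadAttention(96, num_heads=4, batch_first=True)
spatial_out, _ = spatial_msa(x, x, x)       # attention over the 512 positions

xc = x.transpose(1, 2)                      # (batch, channels, positions)
channel_msa = nn.MultiheadAttention(512, num_heads=4, batch_first=True)
channel_out, _ = channel_msa(xc, xc, xc)    # attention over the 96 channels
```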

AU Penso, Coby Frenkel, Lior Goldberger, Jacob

Confidence Calibration of a Medical Imaging Classification System That is Robust to Label Noise

A classification model is calibrated if its predicted probabilities of outcomes reflect their accuracy. Calibrating neural networks is critical in medical analysis applications where clinical decisions rely upon the predicted probabilities. Most calibration procedures, such as temperature scaling, operate as a post-processing step by using holdout validation data. In practice, it is difficult to collect medical image data with correct labels due to the complexity of the medical data and the considerable variability across experts. This study presents a network calibration procedure that is robust to label noise. We draw on the fact that the confusion matrix of the noisy labels can be expressed as the matrix product between the confusion matrix of the clean labels and the label-noise matrix. The method is based on estimating the noise level as part of a noise-robust training method. The noise level is then used to estimate the network accuracy required by the calibration procedure. We show that despite the unreliable labels, we can still achieve calibration results that are on a par with the results of a calibration procedure using data with reliable labels.
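A hedged sketch of the calibration step under noisy labels; `est_acc` stands in for the clean-accuracy estimate recovered from the noise level, and matching the mean confidence to that estimate is a simplification of the paper's procedure:

```python
import torch

def fit_temperature(logits, est_acc):
    """Find a temperature T so that average top-class confidence on held-out
    data matches the estimated clean accuracy `est_acc` (a float in (0, 1))."""
    T = torch.ones(1, requires_grad=True)
    opt = torch.optim.LBFGS([T], lr=0.1, max_iter=50)

    def closure():
        opt.zero_grad()
        conf = torch.softmax(logits / T, dim=1).max(dim=1).values
        loss = (conf.mean() - est_acc) ** 2   # confidence should track accuracy
        loss.backward()
        return loss

    opt.step(closure)
    return T.detach()
```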

AU Chen, Qianqian Zhang, Jiadong Meng, Runqi Zhou, Lei Li, Zhenhui Feng, Qianjin Shen, Dinggang

Modality-Specific Information Disentanglement From Multi-Parametric MRI for Breast Tumor Segmentation and Computer-Aided Diagnosis

Breast cancer is becoming a significant global health challenge, with millions of fatalities annually. Magnetic Resonance Imaging (MRI) can provide various sequences for characterizing tumor morphology and internal patterns, and becomes an effective tool for detection and diagnosis of breast tumors. However, previous deep-learning based tumor segmentation methods from multi-parametric MRI still have limitations in exploring inter-modality information and focusing task-informative modality/modalities. To address these shortcomings, we propose a Modality-Specific Information Disentanglement (MoSID) framework to extract both inter- and intra-modality attention maps as prior knowledge for guiding tumor segmentation. Specifically, by disentangling modality-specific information, the MoSID framework provides complementary clues for the segmentation task, by generating modality-specific attention maps to guide modality selection and inter-modality evaluation. Our experiments on two 3D breast datasets and one 2D prostate dataset demonstrate that the MoSID framework outperforms other state-of-the-art multi-modality segmentation methods, even in the cases of missing modalities. Based on the segmented lesions, we further train a classifier to predict the patients' response to radiotherapy. The prediction accuracy is comparable to the case of using manually-segmented tumors for treatment outcome prediction, indicating the robustness and effectiveness of the proposed segmentation method. The code is available at https://github.com/Qianqian-Chen/MoSID.

AU Sengupta, Sourya Anastasio, Mark A.

A Test Statistic Estimation-Based Approach for Establishing Self-Interpretable CNN-Based Binary Classifiers

Interpretability is highly desired for deep neural network-based classifiers, especially when addressing high-stake decisions in medical imaging. Commonly used post-hoc interpretability methods have the limitation that they can produce plausible but different interpretations of a given model, leading to ambiguity about which one to choose. To address this problem, a novel decision-theory-inspired approach is investigated to establish a self-interpretable model, given a pre-trained deep binary black-box medical image classifier. This approach involves utilizing a self-interpretable encoder-decoder model in conjunction with a single-layer fully connected network with unity weights. The model is trained to estimate the test statistic of the given trained black-box deep binary classifier to maintain a similar accuracy. The decoder output image, referred to as an equivalency map, is an image that represents a transformed version of the to-be-classified image that, when processed by the fixed fully connected layer, produces the same test statistic value as the original classifier. The equivalency map provides a visualization of the transformed image features that directly contribute to the test statistic value and, moreover, permits quantification of their relative contributions. Unlike the traditional post-hoc interpretability methods, the proposed method is self-interpretable and quantitative. Detailed quantitative and qualitative analyses have been performed with three different medical image binary classification tasks.
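A hedged sketch of the equivalency-map construction (module and argument names hypothetical): the decoder output, summed by a frozen unity-weight fully connected layer, reproduces the black-box classifier's test statistic.

```python
import torch
import torch.nn as nn

class EquivalencyHead(nn.Module):
    def __init__(self, decoder, hw):          # hw = H * W of the decoder output map
        super().__init__()
        self.decoder = decoder
        self.fc = nn.Linear(hw, 1, bias=False)
        nn.init.ones_(self.fc.weight)          # unity weights...
        self.fc.weight.requires_grad_(False)   # ...kept fixed during training

    def forward(self, z):
        emap = self.decoder(z)                 # (B, 1, H, W) equivalency map
        t = self.fc(emap.flatten(1))           # test statistic = sum of map pixels
        return t, emap                         # emap visualises per-pixel contributions
```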

C1 Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801 USA C1 Univ Illinois, Dept Bioengn, Urbana, IL 61801 USA SN 0278-0062 EI 1558-254X DA 2024-05-23 UT WOS:001214547800003 PM 38163307 ER

AU Beuret, Samuel Heriard-Dubreuil, Baptiste Martiartu, Naiara Korta Jaeger, Michael Thiran, Jean-Philippe

Windowed Radon Transform for Robust Speed-of-Sound Imaging With Pulse-Echo Ultrasound

In recent years, methods estimating the spatial distribution of tissue speed of sound with pulse-echo ultrasound are gaining considerable traction. They can address limitations of B-mode imaging, for instance in diagnosing fatty liver diseases. Current state-of-the-art methods relate the tissue speed of sound to local echo shifts computed between images that are beamformed using restricted transmit and receive apertures. However, the aperture limitation affects the robustness of phase-shift estimations and, consequently, the accuracy of reconstructed speed-of-sound maps. Here, we propose a method based on the Radon transform of image patches able to estimate local phase shifts from full-aperture images. We validate our technique on simulated, phantom and in-vivo data acquired on a liver and compare it with a state-of-the-art method. We show that the proposed method enhances the stability to changes of beamforming speed of sound and to a reduction of the number of insonifications. In particular, the deployment of pulse-echo speed-of-sound estimation methods onto portable ultrasound devices can be eased by the reduction of the number of insonifications allowed by the proposed method.
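A small sketch of the windowed-Radon idea (the variance-based peak picking is an assumption): the Radon transform of an image patch concentrates energy at the patch's dominant orientation, giving a robust local slope estimate even from full-aperture images.

```python
import numpy as np
from skimage.transform import radon

patch = np.random.rand(32, 32)                   # stand-in for a local image patch
theta = np.linspace(0.0, 180.0, 90, endpoint=False)
sino = radon(patch, theta=theta, circle=False)   # (detector bins, angles)
dominant_angle = theta[np.argmax(sino.var(axis=0))]  # orientation with peak energy
```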

AU Ortiz-Gonzalez, Antonio Kobler, Erich Simon, Stefan Bischoff, Leon Nowak, Sebastian Isaak, Alexander Block, Wolfgang Sprinkart, Alois M. Attenberger, Ulrike Luetkens, Julian A. Bayro-Corrochano, Eduardo Effland, Alexander

Optical Flow-Guided Cine MRI Segmentation With Learned Corrections

In cardiac cine magnetic resonance imaging (MRI), the heart is repeatedly imaged at numerous time points during the cardiac cycle. Frequently, the temporal evolution of a certain region of interest such as the ventricles or the atria is highly relevant for clinical diagnosis. In this paper, we devise a novel approach that allows for an automatized propagation of an arbitrary region of interest (ROI) along the cardiac cycle from respective annotated ROIs provided by medical experts at two different points in time, most frequently at the end-systolic (ES) and the end-diastolic (ED) cardiac phases. At its core, a 3D TV-$L^{1}$-based optical flow algorithm computes the apparent motion of consecutive MRI images in forward and backward directions. Subsequently, the given terminal annotated masks are propagated by this bidirectional optical flow in 3D, which results, however, in improper initial estimates of the segmentation masks due to numerical inaccuracies. These initially propagated segmentation masks are then refined by a 3D U-Net-based convolutional neural network (CNN), which was trained to enforce consistency with the forward and backward warped masks using a novel loss function. Moreover, a penalization term in the loss function controls large deviations from the initial segmentation masks. This method is benchmarked both on a new dataset with annotated single ventricles containing patients with severe heart diseases and on a publicly available dataset with different annotated ROIs. We emphasize that our novel loss function enables fine-tuning the CNN on a single patient, thereby yielding state-of-the-art results along the complete cardiac cycle.
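A hedged 2D sketch of the propagation step using scikit-image's TV-L1 flow (the paper works in 3D and refines the result with a CNN): estimate flow between consecutive frames, then warp the annotated mask along it.

```python
import numpy as np
from scipy.ndimage import map_coordinates
from skimage.registration import optical_flow_tvl1

def propagate_mask(frame0, mask0, frame1):
    """Warp mask0 (annotated on frame0) onto frame1's grid via TV-L1 optical flow."""
    # Flow maps each pixel of frame1 back to its source location in frame0
    v, u = optical_flow_tvl1(frame1, frame0)
    rows, cols = np.meshgrid(np.arange(frame0.shape[0]),
                             np.arange(frame0.shape[1]), indexing="ij")
    coords = np.stack([rows + v, cols + u])
    return map_coordinates(mask0.astype(float), coords, order=1) > 0.5
```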

AU Qiao, Mengyun Wang, Shuo Qiu, Huaqi de Marvao, Antonio O'Regan, Declan P. Rueckert, Daniel Bai, Wenjia

CHeart: A Conditional Spatio-Temporal Generative Model for Cardiac Anatomy

Two key questions in cardiac image analysis are to assess the anatomy and motion of the heart from images; and to understand how they are associated with non-imaging clinical factors such as gender, age and diseases. While the first question can often be addressed by image segmentation and motion tracking algorithms, our capability to model and answer the second question is still limited. In this work, we propose a novel conditional generative model to describe the 4D spatio-temporal anatomy of the heart and its interaction with non-imaging clinical factors. The clinical factors are integrated as the conditions of the generative modelling, which allows us to investigate how these factors influence the cardiac anatomy. We evaluate the model performance in mainly two tasks, anatomical sequence completion and sequence generation. The model achieves high performance in anatomical sequence completion, comparable to or outperforming other state-of-the-art generative models. In terms of sequence generation, given clinical conditions, the model can generate realistic synthetic 4D sequential anatomies that share similar distributions with the real data.

AU Gao, Qi Li, Zilong Zhang, Junping Zhang, Yi Shan, Hongming

CoreDiff: Contextual Error-Modulated Generalized Diffusion Model for Low-Dose CT Denoising and Generalization

Low-dose computed tomography (CT) images suffer from noise and artifacts due to photon starvation and electronic noise. Recently, some works have attempted to use diffusion models to address the over-smoothness and training instability encountered by previous deep-learning-based denoising models. However, diffusion models suffer from long inference time due to a large number of sampling steps involved. Very recently, the cold diffusion model has generalized classical diffusion models with greater flexibility. Inspired by cold diffusion, this paper presents a novel COntextual eRror-modulated gEneralized Diffusion model for low-dose CT (LDCT) denoising, termed CoreDiff. First, CoreDiff utilizes LDCT images to displace the random Gaussian noise and employs a novel mean-preserving degradation operator to mimic the physical process of CT degradation, significantly reducing sampling steps thanks to the informative LDCT images as the starting point of the sampling process. Second, to alleviate the error accumulation problem caused by the imperfect restoration operator in the sampling process, we propose a novel ContextuaL Error-modulAted Restoration Network (CLEAR-Net), which can leverage contextual information to constrain the sampling process from structural distortion and modulate time step embedding features for better alignment with the input at the next time step. Third, to rapidly generalize the trained model to a new, unseen dose level with as few resources as possible, we devise a one-shot learning framework to make CoreDiff generalize faster and better using only one single LDCT image (un)paired with normal-dose CT (NDCT). Extensive experimental results on four datasets demonstrate that our CoreDiff outperforms competing methods in denoising and generalization performance, with clinically acceptable inference time. Source code is made available at https://github.com/qgao21/CoreDiff.
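A hedged sketch of a cold-diffusion-style degradation operator in this spirit; the linear NDCT-to-LDCT interpolation is only an illustration of "LDCT as the endpoint of degradation", not the paper's mean-preserving operator:

```python
def degrade(x_ndct, x_ldct, t, T):
    """Cold-diffusion-style degradation: instead of adding Gaussian noise, move
    from the clean NDCT image toward its paired LDCT image as t grows, so the
    reverse (sampling) process can start from the informative LDCT input."""
    alpha = t / T                                   # degradation severity in [0, 1]
    return (1.0 - alpha) * x_ndct + alpha * x_ldct  # x_0 = NDCT, x_T = LDCT
```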

AU Zhang, Ruipeng Qin, Binjie Zhao, Jun Zhu, Yueqi Lv, Yisong Ding, Song

Locating X-Ray Coronary Angiogram Keyframes via Long Short-Term Spatiotemporal Attention With Image-to-Patch Contrastive Learning

Locating the start, apex and end keyframes of moving contrast agents for keyframe counting in X-ray coronary angiography (XCA) is very important for the diagnosis and treatment of cardiovascular diseases. To locate these keyframes from the class-imbalanced and boundary-agnostic foreground vessel actions that overlap complex backgrounds, we propose long short-term spatiotemporal attention by integrating a convolutional long short-term memory (CLSTM) network into a multiscale Transformer to learn the segment- and sequence-level dependencies in the consecutive-frame-based deep features. Image-to-patch contrastive learning is further embedded between the CLSTM-based long-term spatiotemporal attention and Transformer-based short-term attention modules. The imagewise contrastive module reuses the long-term attention to contrast image-level foreground/background of XCA sequence, while patchwise contrastive projection selects the random patches of backgrounds as convolution kernels to project foreground/background frames into different latent spaces. A new XCA video dataset is collected to evaluate the proposed method. The experimental results show that the proposed method achieves a mAP (mean average precision) of 72.45% and a F-score of 0.8296, considerably outperforming the state-of-the-art methods. The source code is available at https://github.com/Binjie-Qin/STA-IPCon.

AU Wu, Junde Zhang, Yu Fang, Huihui Duan, Lixin Tan, Mingkui Yang, Weihua Wang, Chunhui Liu, Huiying Jin, Yueming Xu, Yanwu

Calibrate the Inter-Observer Segmentation Uncertainty via Diagnosis-First Principle

Many of the tissues/lesions in the medical images may be ambiguous. Therefore, medical segmentation is typically annotated by a group of clinical experts to mitigate personal bias. A common solution to fuse different annotations is the majority vote, e.g., taking the average of multiple labels. However, such a strategy ignores the difference between the grader expertness. Inspired by the observation that medical image segmentation is usually used to assist the disease diagnosis in clinical practice, we propose the diagnosis-first principle, which is to take disease diagnosis as the criterion to calibrate the inter-observer segmentation uncertainty. Following this idea, a framework named Diagnosis-First segmentation Framework (DiFF) is proposed. Specifically, DiFF will first learn to fuse the multi-rater segmentation labels to a single ground-truth which could maximize the disease diagnosis performance. We dubbed the fused ground-truth as Diagnosis-First Ground-truth (DF-GT). Then, the Take and Give Model (T&G Model) to segment DF-GT from the raw image is proposed. With the T&G Model, DiFF can learn the segmentation with the calibrated uncertainty that facilitate the disease diagnosis. We verify the effectiveness of DiFF on three different medical segmentation tasks: optic-disc/optic-cup (OD/OC) segmentation on fundus images, thyroid nodule segmentation on ultrasound images, and skin lesion segmentation on dermoscopic images. Experimental results show that the proposed DiFF can effectively calibrate the segmentation uncertainty, and thus significantly facilitate the corresponding disease diagnosis, which outperforms previous state-of-the-art multi-rater learning methods.

Technol, Hefei 230037, Peoples R China C1 Pazhou Lab, Guangzhou 510005, Peoples R China C1 Univ Elect Sci & Technol China, Sch Comp Sci & Technol, Chengdu 611731, Sichuan, Peoples R China C1 South China Univ Technol, Sch Software Engn, Guangzhou 518055, Guangdong, Peoples R China C1 Jinan Univ, Shenzhen Eye Hosp, Big Data & Artificial Intelligence Inst, Shenzhen 518040, Peoples R China C1 Harbin Inst Technol, Dept Elect Sci & Technol, Harbin 150001, Peoples R China C1 ASTAR, Inst Infocomm Res, Singapore 138632, Singapore SN 0278-0062 EI 1558-254X DA 2024-09-18 UT WOS:001307429600009 PM 38669168 ER

AU Xiao, Chunlun Zhu, Anqi Xia, Chunmei Qiu, Zifeng Liu, Yuanlin Zhao, Cheng Ren, Weiwei Wang, Lifan Dong, Lei Wang, Tianfu Guo, Lehang Lei, Baiying

Attention-Guided Learning with Feature Reconstruction for Skin Lesion Diagnosis using Clinical and Ultrasound Images.

Skin lesion is one of the most common diseases, and most categories are highly similar in morphology and appearance. Deep learning models effectively reduce the variability between classes and within classes, and improve diagnostic accuracy. However, the existing multi-modal methods are only limited to the surface information of lesions in skin clinical and dermatoscopic modalities, which hinders the further improvement of skin lesion diagnostic accuracy. This requires us to further study the depth information of lesions in skin ultrasound. In this paper, we propose a novel skin lesion diagnosis network, which combines clinical and ultrasound modalities to fuse the surface and depth information of the lesion to improve diagnostic accuracy. Specifically, we propose an attention-guided learning (AL) module that fuses clinical and ultrasound modalities from both local and global perspectives to enhance feature representation. The AL module consists of two parts: attention-guided local learning (ALL), which computes the intra-modality and inter-modality correlations to fuse multi-scale information and makes the network focus on the local information of each modality, and attention-guided global learning (AGL), which fuses global information to further enhance the feature representation. In addition, we propose a feature reconstruction learning (FRL) strategy which encourages the network to extract more discriminative features and corrects the focus of the network to enhance the model's robustness and certainty. We conduct extensive experiments and the results confirm the superiority of our proposed method. Our code is available at: https://github.com/XCL-hub/AGFnet.

EI 1558-254X DA 2024-09-04 UT MEDLINE:39208042 PM 39208042 ER

AU Chaudhary, Muhammad F. A. Gerard, Sarah E. Christensen, Gary E. Cooper, Christopher B. Schroeder, Joyce D. Hoffman, Eric A. Reinhardt, Joseph M.

LungViT: Ensembling Cascade of Texture Sensitive Hierarchical Vision Transformers for Cross-Volume Chest CT Image-to-Image Translation

Chest computed tomography (CT) at inspiration is often complemented by an expiratory CT to identify peripheral airways disease. Additionally, co-registered inspiratory-expiratory volumes can be used to derive various markers of lung function. Expiratory CT scans, however, may not be acquired due to dose or scan time considerations or may be inadequate due to motion or insufficient exhale, leading to a missed opportunity to evaluate underlying small airways disease. Here, we propose LungViT, a generative adversarial learning approach using hierarchical vision transformers for translating inspiratory CT intensities to corresponding expiratory CT intensities. LungViT addresses several limitations of the traditional generative models including slicewise discontinuities, limited size of generated volumes, and their inability to model texture transfer at volumetric level. We propose a shifted-window hierarchical vision transformer architecture with squeeze-and-excitation decoder blocks for modeling dependencies between features. We also propose a multiview texture similarity distance metric for texture and style transfer in 3D. To incorporate global information into the training process and refine the output of our model, we use ensemble cascading. LungViT is able to generate large 3D volumes of size $320\times320\times320$ . We train and validate our model using a diverse cohort of 1500 subjects with varying disease severity. To assess model generalizability beyond the development set biases, we evaluate our model on an out-of-distribution external validation set of 200 subjects. Clinical validation on internal and external testing sets shows that synthetic volumes could be reliably adopted for deriving clinical endpoints of chronic obstructive pulmonary disease.
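A compact sketch of a squeeze-and-excitation block like those in the decoder (the 2D setting and reduction ratio are illustrative): each channel is globally pooled, then gated by a learned scale.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, ch, r=8):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(ch, ch // r), nn.ReLU(inplace=True),
                                nn.Linear(ch // r, ch), nn.Sigmoid())

    def forward(self, x):                          # x: (B, C, H, W)
        s = x.mean(dim=(2, 3))                     # squeeze: global average pool
        return x * self.fc(s)[:, :, None, None]   # excite: channel-wise gating
```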

AU Luo, Yan Tian, Yu Shi, Min Pasquale, Louis R. Shen, Lucy Q. Zebardast, Nazlee Elze, Tobias Wang, Mengyu

Harvard Glaucoma Fairness: A Retinal Nerve Disease Dataset for Fairness Learning and Fair Identity Normalization

Fairness (used interchangeably with equity) in machine learning is important for societal well-being, but limited public datasets hinder its progress. Currently, no dedicated public medical datasets with imaging data for fairness learning are available, even though underrepresented groups suffer from more health issues. To address this gap, we introduce Harvard Glaucoma Fairness (Harvard-GF), a retinal nerve disease dataset including 3,300 subjects with both 2D and 3D imaging data and balanced racial groups for glaucoma detection. Glaucoma is the leading cause of irreversible blindness globally with Blacks having doubled glaucoma prevalence than other races. We also propose a fair identity normalization (FIN) approach to equalize the feature importance between different identity groups. Our FIN approach is compared with various state-of-the-art fairness learning methods with superior performance in the racial, gender, and ethnicity fairness tasks with 2D and 3D imaging data, demonstrating the utilities of our dataset Harvard-GF for fairness learning. To facilitate fairness comparisons between different models, we propose an equity-scaled performance measure, which can be flexibly used to compare all kinds of performance metrics in the context of fairness. The dataset and code are publicly accessible via https://ophai.hms.harvard.edu/datasets/harvard-gf3300/.
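A hedged guess at what group-wise feature equalisation could look like; this per-group standardisation is an assumption for illustration, not the paper's FIN formula:

```python
import torch

def fair_identity_norm(feat, group, eps=1e-5):
    """Standardise features within each identity group so no group's feature
    statistics dominate the classifier. feat: (N, D); group: (N,) long labels."""
    out = feat.clone()
    for g in group.unique():
        m = group == g
        mu, sd = feat[m].mean(0), feat[m].std(0)
        out[m] = (feat[m] - mu) / (sd + eps)
    return out
```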

AU Zhao, Zihao Wang, Sheng Gu, Jinchen Zhu, Yitao Mei, Lanzhuju Zhuang, Zixu Cui, Zhiming Wang, Qian Shen, Dinggang

ChatCAD+: Towards a Universal and Reliable Interactive CAD using LLMs.

The integration of Computer-Aided Diagnosis (CAD) with Large Language Models (LLMs) presents a promising frontier in clinical applications, notably in automating diagnostic processes akin to those performed by radiologists and providing consultations similar to a virtual family doctor. Despite the promising potential of this integration, current works face at least two limitations: (1) From the perspective of a radiologist, existing studies typically cover a restricted scope of applicable imaging domains, failing to meet the diagnostic needs of different patients. Also, the insufficient diagnostic capability of LLMs further undermines the quality and reliability of the generated medical reports. (2) Current LLMs lack the requisite depth in medical expertise, rendering them less effective as virtual family doctors due to the potential unreliability of the advice provided during patient consultations. To address these limitations, we introduce ChatCAD+, designed to be universal and reliable. Specifically, it features two main modules: (1) Reliable Report Generation and (2) Reliable Interaction. The Reliable Report Generation module is capable of interpreting medical images from diverse domains and generating high-quality medical reports via our proposed hierarchical in-context learning. Concurrently, the interaction module leverages up-to-date information from reputable medical websites to provide reliable medical advice. Together, these designed modules synergize to closely align with the expertise of human medical professionals, offering enhanced consistency and reliability for interpretation and advice. The source code is available at GitHub.

AU He, Along Li, Tao Yan, Juncheng Wang, Kai Fu, Huazhu

Bilateral Supervision Network for Semi-Supervised Medical Image Segmentation

Fully-supervised learning requires massive amounts of high-quality annotated data, which are difficult to obtain for image segmentation since pixel-level annotation is expensive, especially for medical image segmentation tasks that require domain knowledge. As an alternative solution, semi-supervised learning (SSL) can effectively alleviate the dependence on annotated samples by leveraging abundant unlabeled samples. Among SSL methods, mean-teacher (MT) is the most popular one. However, in MT, the teacher model's weights are completely determined by the student model's weights, which leads to a training bottleneck in the late training stages. Besides, only pixel-wise consistency is applied to unlabeled data, which ignores category information and is susceptible to noise. In this paper, we propose a bilateral supervision network with a bilateral exponential moving average (bilateral-EMA), named BSNet, to overcome these issues. On the one hand, both the student and teacher models are trained on labeled data, and their weights are then updated with the bilateral-EMA, so the two models can learn from each other. On the other hand, pseudo labels are used to perform bilateral supervision on unlabeled data. Moreover, to strengthen this supervision, we adopt adversarial learning to push the network to generate more reliable pseudo labels for unlabeled data. We conduct extensive experiments on three datasets to evaluate the proposed BSNet, and results show that BSNet can improve semi-supervised segmentation performance by a large margin and surpass other state-of-the-art SSL methods.
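
A minimal sketch of a symmetric (bilateral) EMA update in PyTorch, in which each model's weights absorb a fraction of the other's instead of the teacher being a one-way shadow copy; the function name and momentum value are assumptions for illustration.

import torch

@torch.no_grad()
def bilateral_ema_update(model_a, model_b, alpha=0.99):
    """Blend each model's weights with the other's (symmetric EMA sketch)."""
    for pa, pb in zip(model_a.parameters(), model_b.parameters()):
        new_a = alpha * pa.data + (1.0 - alpha) * pb.data  # compute both updates first
        new_b = alpha * pb.data + (1.0 - alpha) * pa.data  # so the blend stays symmetric
        pa.data.copy_(new_a)
        pb.data.copy_(new_b)

Called once per iteration after both models take their supervised steps; unlike standard MT, neither model is a pure EMA copy of the other, which avoids the late-stage coupling bottleneck described above.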

AU Li, Yinsheng Feng, Juan Xiang, Jun Li, Zixiao Liang, Dong

AIRPORT: A Data Consistency Constrained Deep Temporal Extrapolation Method To Improve Temporal Resolution In Contrast Enhanced CT Imaging

Typical tomographic image reconstruction methods require that the imaged object remain static and stationary during the time window needed to acquire a minimally complete data set. The violation of this requirement leads to temporal-averaging errors in the reconstructed images. For a fixed gantry rotation speed, to reduce these errors, it is desirable to reconstruct images using data acquired over a narrower angular range, i.e., with a higher temporal resolution. However, image reconstruction with a narrower angular range violates the data sufficiency condition, resulting in severe data-insufficiency-induced errors. The purpose of this work is to decouple the trade-off between these two types of errors in contrast-enhanced computed tomography (CT) imaging. We demonstrated that using the developed data consistency constrained deep temporal extrapolation method (AIRPORT), the entire time-varying imaged object can be accurately reconstructed with 40 frames-per-second temporal resolution, the time window needed to acquire a single projection view using a typical C-arm cone-beam CT system. AIRPORT is applicable to general non-sparse imaging tasks using a single short-scan data acquisition.
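
The data-consistency constraint can be pictured as a gradient step that pulls an extrapolated frame back toward agreement with the measured projections. The NumPy sketch below shows this generic step for a linear forward model A; it is a schematic stand-in under that assumption, not the AIRPORT operator itself.

import numpy as np

def data_consistency_step(x, A, y, lam=0.5):
    """One gradient step on ||A x - y||^2 for a linear system model A (sketch)."""
    return x - lam * A.T @ (A @ x - y)  # residual in projection space, mapped back

A step like x = data_consistency_step(x, A, y) would follow each temporal-extrapolation update, keeping the deep prediction consistent with the acquired view.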

AU Elbatel, Marawan Marti, Robert Li, Xiaomeng

FoPro-KD: Fourier Prompted Effective Knowledge Distillation for Long-Tailed Medical Image Recognition

Representational transfer from publicly available models is a promising technique for improving medical image classification, especially in long-tailed datasets with rare diseases. However, existing methods often overlook the frequency-dependent behavior of these models, thereby limiting their effectiveness in transferring representations and generalizations to rare diseases. In this paper, we propose FoPro-KD, a novel framework that leverages the power of frequency patterns learned from frozen pre-trained models to enhance their transferability and compression, presenting a few unique insights: 1) We demonstrate that leveraging representations from publicly available pre-trained models can substantially improve performance, specifically for rare classes, even when utilizing representations from a smaller pre-trained model. 2) We observe that pre-trained models exhibit frequency preferences, which we explore using our proposed Fourier Prompt Generator (FPG), allowing us to manipulate specific frequencies in the input image, enhancing the discriminative representational transfer. 3) By amplifying or diminishing these frequencies in the input image, we enable Effective Knowledge Distillation (EKD). EKD facilitates the transfer of knowledge from pre-trained models to smaller models. Through extensive experiments in long-tailed gastrointestinal image recognition and skin lesion classification, where rare diseases are prevalent, our FoPro-KD framework outperforms existing methods, enabling more accessible medical models for rare disease classification.
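
The frequency manipulation underlying the FPG can be sketched as a learnable per-frequency gain applied in the Fourier domain of the input image; the PyTorch code below illustrates that idea with hypothetical names, not the authors' exact generator.

import torch

def fourier_prompt(img, gain):
    """Scale frequency bands of img (B, C, H, W); gain broadcasts over (H, W)."""
    spec = torch.fft.fftshift(torch.fft.fft2(img), dim=(-2, -1))
    spec = spec * gain  # amplify or diminish selected frequencies
    out = torch.fft.ifft2(torch.fft.ifftshift(spec, dim=(-2, -1)))
    return out.real

A learnable prompt could be gain = torch.nn.Parameter(torch.ones(H, W)), optimized so that the frequency bands the frozen teacher prefers are emphasized before distillation.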

AU Fu, Li-Wei Liu, Chih-Hao Jain, Manu Chen, Chih-Shan Jason Wu, Yu-Hung Huang, Sheng-Lung Chen, Homer H.

Training With Uncertain Annotations for Semantic Segmentation of Basal Cell Carcinoma From Full-Field OCT Images

Semantic segmentation of basal cell carcinoma (BCC) from full-field optical coherence tomography (FF-OCT) images of human skin has received considerable attention in medical imaging. However, it is challenging for dermatopathologists to annotate the training data due to OCT's lack of color specificity. Very often, they are uncertain about the correctness of the annotations they made. In practice, annotations fraught with uncertainty profoundly impact the effectiveness of model training and hence the performance of BCC segmentation. To address this issue, we propose an approach to model training with uncertain annotations. The proposed approach includes a data selection strategy to mitigate the uncertainty of training data, a class expansion to consider sebaceous gland and hair follicle as additional classes to enhance the performance of BCC segmentation, and a self-supervised pre-training procedure to improve the initial weights of the segmentation model parameters. Furthermore, we develop three post-processing techniques to reduce the impact of speckle noise and image discontinuities on BCC segmentation. The mean Dice score of BCC of our model reaches 0.503 +/- 0.003, which, to the best of our knowledge, is the best performance to date for semantic segmentation of BCC from FF-OCT images.

AU Zhang, Jianjia Mao, Haiyang Wang, Xinran Guo, Yuan Wu, Weiwen

Wavelet-Inspired Multi-channel Score-based Model for Limited-angle CT Reconstruction.

Score-based generative model (SGM) has demonstrated great potential in the challenging limited-angle CT (LA-CT) reconstruction. SGM essentially models the probability density of the ground truth data and generates reconstruction results by sampling from it. Nevertheless, direct application of the existing SGM methods to LA-CT suffers multiple limitations. Firstly, the directional distribution of the artifacts attributable to the missing angles is ignored. Secondly, the different distribution properties of the artifacts in different frequency components have not been fully explored. These drawbacks would inevitably degrade the estimation of the probability density and the reconstruction results. After an in-depth analysis of these factors, this paper proposes a Wavelet-Inspired Score-based Model (WISM) for LA-CT reconstruction. Specifically, besides training a typical SGM with the original images, the proposed method additionally performs the wavelet transform and models the probability density in each wavelet component with an extra SGM. The wavelet components preserve the spatial correspondence with the original image while performing frequency decomposition, thereby keeping the directional property of the artifacts for further analysis. On the other hand, different wavelet components possess more specific contents of the original image in different frequency ranges, simplifying the probability density modeling by decomposing the overall density into component-wise ones. The resulting two SGMs in the image-domain and wavelet-domain are integrated into a unified sampling process under the guidance of the observation data, jointly generating high-quality and consistent LA-CT reconstructions. The experimental evaluation on various datasets consistently verifies the superior performance of the proposed method over the competing method.
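
The component-wise modeling rests on a standard 2D discrete wavelet transform, which splits an image into an approximation band and three direction-sensitive detail bands. A minimal PyWavelets sketch, with an arbitrary Haar wavelet and a random stand-in image:

import numpy as np
import pywt

img = np.random.rand(256, 256)  # stand-in for a reconstructed CT slice
LL, (LH, HL, HH) = pywt.dwt2(img, 'haar')  # approximation + 3 detail subbands
# The detail subbands separate horizontal/vertical/diagonal structure, so
# direction-dependent limited-angle artifacts concentrate in predictable
# subbands, each of which can then be modeled by its own score-based network.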

AU Zhang, Xiao Sun, Kaicong Wu, Dijia Xiong, Xiaosong Liu, Jiameng Yao, Linlin Li, Shufang Wang, Yining Feng, Jun Shen, Dinggang

An Anatomy- and Topology-Preserving Framework for Coronary Artery Segmentation

Coronary artery segmentation is critical for coronary artery disease diagnosis but challenging due to the artery's tortuous course, numerous small branches, and inter-subject variations. Most existing studies ignore important anatomical information and vascular topologies, leading to less desirable segmentation performance that usually cannot satisfy clinical demands. To deal with these challenges, in this paper we propose an anatomy- and topology-preserving two-stage framework for coronary artery segmentation. The proposed framework consists of an anatomical dependency encoding (ADE) module and a hierarchical topology learning (HTL) module for coarse-to-fine segmentation, respectively. Specifically, the ADE module segments the four heart chambers and the aorta, and five distance field maps are thus obtained to encode the distance between the chamber surfaces and the coarsely segmented coronary artery. Meanwhile, ADE also performs coronary artery detection to crop the region of interest and eliminate foreground-background imbalance. The follow-up HTL module performs fine segmentation by exploiting three hierarchical vascular topologies, i.e., key points, centerlines, and neighbor connectivity, using a multi-task learning scheme. In addition, we adopt a bottom-up attention interaction (BAI) module to integrate the feature representations extracted across hierarchical topologies. Extensive experiments on public and in-house datasets show that the proposed framework achieves state-of-the-art performance for coronary artery segmentation.
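
The distance field maps in the ADE module can be computed with a Euclidean distance transform of each segmented structure mask; a short SciPy sketch under that assumption (the function name is hypothetical):

import numpy as np
from scipy import ndimage

def chamber_distance_field(chamber_mask, voxel_spacing=(1.0, 1.0, 1.0)):
    """Distance from every voxel to the structure surface (one map per structure)."""
    outside = ~chamber_mask.astype(bool)  # distance is zero inside the mask
    return ndimage.distance_transform_edt(outside, sampling=voxel_spacing)

Stacking the five maps (four chambers plus the aorta) yields the geometric prior channels that accompany the coarse coronary segmentation.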

AU Tang, Yuqi Wang, Nanchao Dong, Zhijie Lowerison, Matthew Del Aguila, Angela Johnston, Natalie Vu, Tri Ma, Chenshuo Xu, Yirui Yang, Wei Song, Pengfei Yao, Junjie

Non-invasive Deep-Brain Imaging with 3D Integrated Photoacoustic Tomography and Ultrasound Localization Microscopy (3D-PAULM).

Photoacoustic computed tomography (PACT) is a proven technology for imaging hemodynamics in the deep brain of small animal models. PACT is inherently compatible with ultrasound (US) imaging, providing complementary contrast mechanisms. While PACT can quantify the brain's oxygen saturation of hemoglobin (sO2), US imaging can probe the blood flow based on the Doppler effect. Further, by tracking gas-filled microbubbles, ultrasound localization microscopy (ULM) can map the blood flow velocity with sub-diffraction spatial resolution. In this work, we present a 3D deep-brain imaging system that seamlessly integrates PACT and ULM into a single device, 3D-PAULM. Using a low ultrasound frequency of 4 MHz, 3D-PAULM is capable of imaging the brain's hemodynamic functions through intact scalp and skull in a totally non-invasive manner. Using 3D-PAULM, we studied brain function in a mouse model of ischemic stroke. Multi-spectral PACT, US B-mode imaging, microbubble-enhanced power Doppler (PD), and ULM were performed on the same mouse brain with intrinsic image co-registration. From the multi-modality measurements, we further quantified blood perfusion, sO2, vessel density, and flow velocity of the mouse brain, showing stroke-induced ischemia, hypoxia, and reduced blood flow. We expect that 3D-PAULM can find broad applications in studying deep brain functions on small animal models.

EI 1558-254X DA 2024-10-11 UT MEDLINE:39383084 PM 39383084 ER

AU Park, Jinil Shin, Taehoon Park, Jang-Yeon

Three-Dimensional Variable Slab-Selective Projection Acquisition Imaging.

Three-dimensional (3D) projection acquisition (PA) imaging has recently gained attention because of its advantages, such as the achievability of very short echo times, lower sensitivity to motion, and undersampled acquisition of projections without sacrificing spatial resolution. However, larger subjects impose a stricter Nyquist criterion and are more likely to be affected by outer-volume signals outside the field of view (FOV), which significantly degrade image quality. Here, we propose a variable slab-selective projection acquisition (VSS-PA) method to relax the Nyquist criterion and effectively suppress aliasing streak artifacts in 3D PA imaging. The proposed method maintains a perpendicular orientation between the slab-selective gradient used for frequency-selective spin excitation and the readout gradient used for data acquisition. As VSS-PA can selectively excite spins only within the width of the desired FOV in the projection direction during data acquisition, the effective size of the scanned object, which determines the Nyquist criterion, can be reduced. Additionally, unwanted signals originating from outside the FOV (e.g., aliasing streak artifacts) can be effectively avoided. The relaxation of the Nyquist criterion owing to VSS-PA was theoretically described and confirmed through numerical simulations and phantom and human lung experiments. These experiments further showed that the aliasing streak artifacts were nearly fully suppressed.
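
The gain can be read off the textbook Nyquist count for radial (projection) sampling, which scales linearly with the effective object size; the paper's exact derivation may differ, so the following is only the standard relation. For in-plane resolution $\Delta$ and effective object diameter $D$,

$N_{\mathrm{views}} \ge \frac{\pi}{2}\cdot\frac{D}{\Delta},$

so restricting excitation to the desired FOV width in the projection direction shrinks the effective $D$ and proportionally reduces the number of projections required.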

EI 1558-254X DA 2024-10-02 UT MEDLINE:39348262 PM 39348262 ER

AU Purma, Vishnuvardhan Srinath, Suhas Srirangarajan, Seshan Kakkar, Aanchal Prathosh, A P

GenSelfDiff-HIS: Generative Self-Supervision Using Diffusion for Histopathological Image Segmentation.

Histopathological image segmentation is a laborious and time-intensive task, often requiring analysis from experienced pathologists for accurate examinations. To reduce this burden, supervised machine-learning approaches have been adopted using large-scale annotated datasets for histopathological image analysis. However, in several scenarios, the availability of large-scale annotated data is a bottleneck while training such models. Self-supervised learning (SSL) is an alternative paradigm that provides some respite by constructing models utilizing only the unannotated data, which is often abundant. The basic idea of SSL is to train a network to perform one or many pseudo or pretext tasks on unannotated data and use it subsequently as the basis for a variety of downstream tasks. The success of SSL depends critically on the considered pretext task. While there have been many efforts in designing pretext tasks for classification problems, there have been few attempts at SSL for histopathological image segmentation. Motivated by this, we propose an SSL approach for segmenting histopathological images via generative diffusion models. Our method is based on the observation that diffusion models effectively solve an image-to-image translation task akin to a segmentation task. Hence, we propose generative diffusion as the pretext task for histopathological image segmentation. We also utilize a multi-loss function-based fine-tuning for the downstream task. We validate our method using several metrics on two publicly available datasets along with a newly proposed head and neck (HN) cancer dataset containing Hematoxylin and Eosin (H&E) stained images along with annotations.
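
At the core of the diffusion pretext task is the closed-form forward noising step, with the network trained to recover the injected noise. A self-contained PyTorch sketch with a conventional linear beta schedule (the schedule values are assumptions):

import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)            # assumed linear schedule
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

def ddpm_forward(x0, t):
    """Sample x_t ~ q(x_t | x_0) and return the noise target eps."""
    eps = torch.randn_like(x0)
    a = alphas_cumprod[t].sqrt().view(-1, 1, 1, 1)
    s = (1.0 - alphas_cumprod[t]).sqrt().view(-1, 1, 1, 1)
    return a * x0 + s * eps, eps

The pretext objective regresses a denoiser applied to (x_t, t) onto eps; the trained encoder is then fine-tuned for segmentation with the multi-loss objective described above.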

EI 1558-254X DA 2024-09-04 UT MEDLINE:39222449 PM 39222449 ER

AU Liu, Qiang Chao, Weian Wen, Ruyi Gong, Yubin Xi, Lei

Optimized Excitation in Microwave-induced Thermoacoustic Imaging for Artifact Suppression.

Microwave-induced thermoacoustic imaging (M-TAI) allows the visualization of macroscopic and microscopic structures of bio-tissues. However, it suffers from severe inherent artifacts that might misguide the subsequent diagnostics and treatments of diseases. To overcome this limitation, we propose an optimized excitation strategy. In detail, the strategy integrates a dynamically compounded specific absorption rate (SAR) with a co-planar configuration of the polarization state, incident wave vector and imaging plane. Starting from a theoretical analysis, we interpret the underlying mechanism supporting the superiority of the optimized excitation strategy, which achieves an effect equivalent to homogenizing the electromagnetic energy deposited in bio-tissues. Subsequent numerical simulations demonstrate that the strategy enables better preservation of the conductivity weighting of samples while increasing the Pearson correlation coefficient. Furthermore, in vitro and in vivo M-TAI experiments validate the effectiveness and robustness of this optimized excitation strategy in artifact suppression, allowing the simultaneous identification of both boundary and inside fine structures within bio-tissues. All the results suggest that the optimized excitation strategy can be extended to diverse scenarios, inspiring more suitable strategies that remarkably suppress the inherent artifacts in M-TAI.

AU Sogancioglu, Ecem van Ginneken, Bram Behrendt, Finn Bengs, Marcel Schlaefer, Alexander Radu, Miron Xu, Di Sheng, Ke Scalzo, Fabien Marcus, Eric Papa, Samuele Teuwen, Jonas Scholten, Ernst Th. Schalekamp, Steven Hendrix, Nils Jacobs, Colin Hendrix, Ward Sanchez, Clara I. Murphy, Keelin

Nodule Detection and Generation on Chest X-Rays: NODE21 Challenge

Pulmonary nodules may be an early manifestation of lung cancer, the leading cause of cancer-related deaths among both men and women. Numerous studies have established that deep learning methods can yield high-performance levels in the detection of lung nodules in chest X-rays. However, the lack of gold-standard public datasets slows down the progression of the research and prevents benchmarking of methods for this task. To address this, we organized a public research challenge, NODE21, aimed at the detection and generation of lung nodules in chest X-rays. While the detection track assesses state-of-the-art nodule detection systems, the generation track determines the utility of nodule generation algorithms to augment training data and hence improve the performance of the detection systems. This paper summarizes the results of the NODE21 challenge and performs extensive additional experiments to examine the impact of the synthetically generated nodule training images on the detection algorithm performance.

AU Wu, Ruoyou Li, Cheng Zou, Juan Liu, Xinfeng Zheng, Hairong Wang, Shanshan

Generalizable Reconstruction for Accelerating MR Imaging via Federated Learning with Neural Architecture Search.

Heterogeneous data captured by different scanning devices and imaging protocols can affect the generalization performance of the deep learning magnetic resonance (MR) reconstruction model. While a centralized training model is effective in mitigating this problem, it raises concerns about privacy protection. Federated learning is a distributed training paradigm that can utilize multi-institutional data for collaborative training without sharing data. However, existing federated learning MR image reconstruction methods rely on models designed manually by experts, which are complex and computationally expensive, suffering from performance degradation when facing heterogeneous data distributions. In addition, these methods give inadequate consideration to fairness issues, namely ensuring that the model's training does not introduce bias towards any specific dataset's distribution. To this end, this paper proposes a generalizable federated neural architecture search framework for accelerating MR imaging (GAutoMRI). Specifically, automatic neural architecture search is investigated for effective and efficient neural network representation learning of MR images from different centers. Furthermore, we design a fairness adjustment approach that can enable the model to learn features fairly from inconsistent distributions of different devices and centers, and thus facilitate the model to generalize well to the unseen center. Extensive experiments show that our proposed GAutoMRI has better performances and generalization ability compared with seven state-of-the-art federated learning methods. Moreover, the GAutoMRI model is significantly more lightweight, making it an efficient choice for MR image reconstruction tasks. The code will be made available at https://github.com/ternencewu123/GAutoMRI.

AU Zhou, Chengfeng Wang, Jun Xiang, Suncheng Liu, Feng Huang, Hefeng Qian, Dahong

A Simple Normalization Technique Using Window Statistics to Improve the Out-of-Distribution Generalization on Medical Images

Since data scarcity and data heterogeneity are prevalent for medical images, well-trained Convolutional Neural Networks (CNNs) using previous normalization methods may perform poorly when deployed to a new site. However, a reliable model for real-world clinical applications should generalize well both on in-distribution (IND) and out-of-distribution (OOD) data (e.g., the new site data). In this study, we present a novel normalization technique called window normalization (WIN) to improve the model generalization on heterogeneous medical images, which offers a simple yet effective alternative to existing normalization methods. Specifically, WIN perturbs the normalizing statistics with the local statistics computed within a window. This feature-level augmentation technique regularizes the models well and improves their OOD generalization significantly. Leveraging its advantage, we propose a novel self-distillation method called WIN-WIN. WIN-WIN can be easily implemented with two forward passes and a consistency constraint, serving as a simple extension to existing methods. Extensive experimental results on various tasks (6 tasks) and datasets (24 datasets) demonstrate the generality and effectiveness of our methods.
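
The core of WIN can be sketched in a few lines of PyTorch: normalize the whole feature map with mean and standard deviation computed inside a randomly placed window. The window size and placement below are illustrative assumptions.

import torch

def window_normalize(x, frac=0.5, eps=1e-5):
    """Normalize (B, C, H, W) features with statistics from a random window."""
    b, c, h, w = x.shape
    wh, ww = max(1, int(h * frac)), max(1, int(w * frac))
    top = torch.randint(0, h - wh + 1, (1,)).item()
    left = torch.randint(0, w - ww + 1, (1,)).item()
    win = x[:, :, top:top + wh, left:left + ww]
    mu = win.mean(dim=(2, 3), keepdim=True)   # perturbed statistics
    sd = win.std(dim=(2, 3), keepdim=True)
    return (x - mu) / (sd + eps)

Because the window moves every forward pass, the statistics jitter acts as a feature-level augmentation that discourages reliance on site-specific intensity cues.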

AU Martinez-Sanchez, Antonio Lamm, Lorenz Jasnin, Marion Phelippeau, Harold

Simulating the cellular context in synthetic datasets for cryo-electron tomography.

Cryo-electron tomography (cryo-ET) allows the visualization of the cellular context at the macromolecular level. To date, the impossibility of obtaining a reliable ground truth has limited the application of deep learning-based image processing algorithms in this field. As a consequence, there is a growing demand for realistic synthetic datasets for training deep learning algorithms. In addition, besides assisting the acquisition and interpretation of experimental data, synthetic tomograms are used as reference models for cellular organization analysis from cellular tomograms. Current simulators in cryo-ET focus on reproducing distortions from image acquisition and tomogram reconstruction; however, they cannot generate many of the low-order features present in cellular tomograms. Here we propose several geometric and organization models to simulate low-order cellular structures imaged by cryo-ET: specifically, clusters of any known cytosolic or membrane-bound macromolecules, membranes with different geometries, and different filamentous structures such as microtubules or actin-like networks. Moreover, we use parametrizable stochastic models to generate a high diversity of geometries and organizations to simulate representative and generalized datasets, including very crowded environments like those observed in native cells. These models have been implemented in a multiplatform open-source Python package, including scripts to generate cryo-tomograms with adjustable sizes and resolutions. In addition, these scripts also provide distortion-free density maps, besides the ground truth, in different file formats for efficient access and advanced visualization. We show that such a realistic synthetic dataset can be readily used to train generalizable deep learning algorithms.

AU Li, Zilong Gao, Qi Wu, Yaping Niu, Chuang Zhang, Junping Wang, Meiyun Wang, Ge Shan, Hongming

Quad-Net: Quad-Domain Network for CT Metal Artifact Reduction

Metal implants and other high-density objects in patients introduce severe streaking artifacts in CT images, compromising image quality and diagnostic performance. Although various methods have been developed for CT metal artifact reduction (MAR) over the past decades, including the latest dual-domain deep networks, remaining metal artifacts are still clinically challenging in many cases. Here we extend the state-of-the-art dual-domain deep network approach into a quad-domain counterpart, so that all the features in the sinogram, image, and their corresponding Fourier domains are synergized to eliminate metal artifacts optimally without compromising structural subtleties. Our proposed quad-domain network for MAR, referred to as Quad-Net, incurs little additional computational cost since the Fourier transform is highly efficient, and works across the four receptive fields to learn both global and local features as well as their relations. Specifically, we first design a Sinogram-Fourier Restoration Network (SFR-Net) in the sinogram domain and its Fourier space to faithfully inpaint metal-corrupted traces. Then, we couple SFR-Net with an Image-Fourier Refinement Network (IFR-Net), which takes both an image and its Fourier spectrum to improve a CT image reconstructed from the SFR-Net output using cross-domain contextual information. Quad-Net is trained on clinical datasets to minimize a composite loss function. Quad-Net does not require precise metal masks, which is of great importance in clinical practice. Our experimental results demonstrate the superiority of Quad-Net over the state-of-the-art MAR methods quantitatively, visually, and statistically. The Quad-Net code is publicly available at https://github.com/longzilicart/Quad-Net.

AU Vray, Guillaume Tomar, Devavrat Bozorgtabar, Behzad Thiran, Jean-Philippe

Distill-SODA: Distilling Self-Supervised Vision Transformer for Source-Free Open-Set Domain Adaptation in Computational Pathology

Developing computational pathology models is essential for reducing manual tissue typing from whole slide images, transferring knowledge from the source domain to an unlabeled, shifted target domain, and identifying unseen categories. We propose a practical setting by addressing the above-mentioned challenges in one fell swoop, i.e., source-free open-set domain adaptation. Our methodology focuses on adapting a pre-trained source model to an unlabeled target dataset and encompasses both closed-set and open-set classes. Beyond addressing the semantic shift of unknown classes, our framework also deals with a covariate shift, which manifests as variations in color appearance between source and target tissue samples. Our method hinges on distilling knowledge from a self-supervised vision transformer (ViT), drawing guidance from either robustly pre-trained transformer models or histopathology datasets, including those from the target domain. In pursuit of this, we introduce a novel style-based adversarial data augmentation, serving as hard positives for self-training a ViT, resulting in highly contextualized embeddings. Following this, we cluster semantically akin target images, with the source model offering weak pseudo-labels, albeit with uncertain confidence. To enhance this process, we present the closed-set affinity score (CSAS), aiming to correct the confidence levels of these pseudo-labels and to calculate weighted class prototypes within the contextualized embedding space. Our approach establishes itself as state-of-the-art across three public histopathological datasets for colorectal cancer assessment. Notably, our self-training method seamlessly integrates with open-set detection methods, resulting in enhanced performance in both closed-set and open-set recognition tasks.

AU Gui, Shuangchun Wang, Zhenkun Chen, Jixiang Zhou, Xun Zhang, Chen Cao, Yi

MT4MTL-KD: A Multi-Teacher Knowledge Distillation Framework for Triplet Recognition

The recognition of surgical triplets plays a critical role in the practical application of surgical videos. It involves the sub-tasks of recognizing instruments, verbs, and targets, while establishing precise associations between them. Existing methods face two significant challenges in triplet recognition: 1) the imbalanced class distribution of surgical triplets may lead to spurious task association learning, and 2) the feature extractors cannot reconcile local and global context modeling. To overcome these challenges, this paper presents a novel multi-teacher knowledge distillation framework for multi-task triplet learning, known as MT4MTL-KD. MT4MTL-KD leverages teacher models trained on less imbalanced sub-tasks to assist multi-task student learning for triplet recognition. Moreover, we adopt different categories of backbones for the teacher and student models, facilitating the integration of local and global context modeling. To further align the semantic knowledge between the triplet task and its sub-tasks, we propose a novel feature attention module (FAM). This module utilizes attention mechanisms to assign multi-task features to specific sub-tasks. We evaluate the performance of MT4MTL-KD on both the 5-fold cross-validation and the CholecTriplet challenge splits of the CholecT45 dataset. The experimental results consistently demonstrate the superiority of our framework over state-of-the-art methods, achieving significant improvements of up to 6.4% on the cross-validation split.
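
The distillation term can be sketched as a weighted KL divergence between the student's logits and each sub-task teacher's logits, with the usual temperature scaling; the weights, temperature, and function name are illustrative assumptions.

import torch
import torch.nn.functional as F

def multi_teacher_kd_loss(student_logits, teacher_logits_list, weights=None, T=2.0):
    """Weighted sum of KL(teacher_i || student) with temperature T (sketch)."""
    if weights is None:
        weights = [1.0 / len(teacher_logits_list)] * len(teacher_logits_list)
    log_p = F.log_softmax(student_logits / T, dim=1)
    loss = student_logits.new_zeros(())
    for w, t_logits in zip(weights, teacher_logits_list):
        q = F.softmax(t_logits.detach() / T, dim=1)  # teachers are frozen
        loss = loss + w * F.kl_div(log_p, q, reduction='batchmean') * (T * T)
    return loss

In the multi-task setting, one such term per sub-task (instrument, verb, target) would be added to the supervised triplet loss.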

AU Ma, Kai Wen, Xuyun Zhu, Qi Zhang, Daoqiang

Ordinal Pattern Tree: A New Representation Method for Brain Network Analysis

Brain networks, which describe the functional or structural interactions of the brain with graph theory, have been widely used for brain imaging analysis. Currently, several network representation methods have been developed for describing and analyzing brain networks. However, most of these methods ignore the valuable weighted information of the edges in brain networks. In this paper, we propose a new representation method (i.e., the ordinal pattern tree) for brain network analysis. Compared with existing network representation methods, the proposed ordinal pattern tree (OPT) can not only leverage the weighted information of the edges but also express the hierarchical relationships of nodes in brain networks. On an OPT, nodes are connected by ordinal edges, which are constructed by using the ordinal pattern relationships of weighted edges. We represent brain networks as OPTs and further develop a new graph kernel called the optimal transport (OT) based ordinal pattern tree (OT-OPT) kernel to measure the similarity between paired brain networks. In the OT-OPT kernel, the OT distances are used to calculate the transport costs between the nodes on the OPTs. Based on these OT distances, we use an exponential function to calculate the OT-OPT kernel, which is proved to be positive definite. To evaluate the effectiveness of the proposed method, we perform classification and regression experiments on the ADHD-200, ABIDE and ADNI datasets. The experimental results demonstrate that our proposed method outperforms state-of-the-art graph methods in the classification and regression tasks.

AU Li, Zihan Li, Yunxiang Li, Qingde Wang, Puyang Guo, Dazhou Lu, Le Jin, Dakai Zhang, You Hong, Qingqi

LViT: Language Meets Vision Transformer in Medical Image Segmentation

Deep learning has been widely used in medical image segmentation and beyond. However, the performance of existing medical image segmentation models has been limited by the challenge of obtaining sufficient high-quality labeled data due to the prohibitive data annotation cost. To alleviate this limitation, we propose a new text-augmented medical image segmentation model LViT (Language meets Vision Transformer). In our LViT model, medical text annotation is incorporated to compensate for the quality deficiency in image data. In addition, the text information can guide the generation of pseudo labels of improved quality in the semi-supervised learning setting. We also propose an Exponential Pseudo label Iteration mechanism (EPI) to help the Pixel-Level Attention Module (PLAM) preserve local image features in the semi-supervised LViT setting. In our model, the LV (Language-Vision) loss is designed to supervise the training of unlabeled images using text information directly. For evaluation, we construct three multimodal medical segmentation datasets (image + text) containing X-rays and CT images. Experimental results show that our proposed LViT has superior segmentation performance in both fully-supervised and semi-supervised settings. The code and datasets are available at https://github.com/HUANGLIZI/LViT.

AU Zhang, Yinglin Xi, Ruiling Zeng, Lingxi Towey, Dave Bai, Ruibin Higashita, Risa Liu, Jiang

Structural Priors Guided Network for the Corneal Endothelial Cell Segmentation

The segmentation of blurred cell boundaries in corneal endothelium microscope images is challenging, which affects clinical parameter estimation accuracy. Existing deep learning methods only consider pixel-wise classification accuracy and fail to utilize cell structure knowledge. As a result, their segmentation of blurred cell boundaries is discontinuous. This paper proposes a structural prior guided network (SPG-Net) for corneal endothelium cell segmentation. We first employ a hybrid transformer-convolution backbone to capture more global context. Then, we use a Feature Enhancement (FE) module to improve the representation ability of features and a Local Affinity-based Feature Fusion (LAFF) module to propagate structural information among hierarchical features. Finally, we introduce a joint loss based on cross entropy and the structural similarity index measure (SSIM) to supervise the training process at the pixel and structure levels. We compare SPG-Net with various state-of-the-art methods on four corneal endothelial datasets. The experimental results suggest that SPG-Net can alleviate the problem of discontinuous cell boundary segmentation and balance pixel-wise accuracy and structure preservation. We also evaluate the agreement of parameter estimation between the ground truth and the prediction of SPG-Net. The statistical analysis results show a good agreement and correlation.
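
A compact way to realize such a joint objective is cross entropy plus an SSIM penalty between predicted class probabilities and the one-hot ground truth. The sketch below uses torchmetrics' functional SSIM; the weighting and names are assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F
from torchmetrics.functional import structural_similarity_index_measure as ssim

def joint_ce_ssim_loss(logits, target, w=0.5):
    """Pixel-level CE plus a structure-level (1 - SSIM) term (sketch)."""
    ce = F.cross_entropy(logits, target)
    prob = logits.softmax(dim=1)
    onehot = F.one_hot(target, num_classes=logits.shape[1]).permute(0, 3, 1, 2).float()
    return ce + w * (1.0 - ssim(prob, onehot))  # SSIM term rewards coherent boundaries

The CE term drives per-pixel accuracy while the SSIM term penalizes structurally implausible, discontinuous boundary predictions.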

AU Liu, Zhentao Fang, Yu Li, Changjian Wu, Han Liu, Yuan Shen, Dinggang Cui, Zhiming

Geometry-Aware Attenuation Learning for Sparse-View CBCT Reconstruction.

Cone Beam Computed Tomography (CBCT) plays a vital role in clinical imaging. Traditional methods typically require hundreds of 2D X-ray projections to reconstruct a high-quality 3D CBCT image, leading to considerable radiation exposure. This has led to a growing interest in sparse-view CBCT reconstruction to reduce radiation doses. While recent advances, including deep learning and neural rendering algorithms, have made strides in this area, these methods either produce unsatisfactory results or suffer from the time inefficiency of per-case optimization. In this paper, we introduce a novel geometry-aware encoder-decoder framework to solve this problem. Our framework starts by encoding multi-view 2D features from various 2D X-ray projections with a 2D CNN encoder. Leveraging the geometry of CBCT scanning, it then back-projects the multi-view 2D features into the 3D space to formulate a comprehensive volumetric feature map, followed by a 3D CNN decoder to recover the 3D CBCT image. Importantly, our approach respects the geometric relationship between the 3D CBCT image and its 2D X-ray projections during the feature back-projection stage, and benefits from the prior knowledge learned from the data population. This ensures its adaptability in dealing with extremely sparse view inputs without per-case training, such as scenarios with only 5 or 10 X-ray projections. Extensive evaluations on two simulated datasets and one real-world dataset demonstrate the exceptional reconstruction quality and time efficiency of our method.

AU Xu, Shicheng Li, Wei Li, Zuoyong Zhao, Tiesong Zhang, Bob

Facing Differences of Similarity: Intra- and Inter-Correlation Unsupervised Learning for Chest X-Ray Anomaly Detection.

Anomaly detection can significantly aid doctors in interpreting chest X-rays. The commonly used strategy involves utilizing a pre-trained network to extract features from normal data to establish feature representations. However, when a pre-trained network is applied to more detailed X-rays, differences of similarity can limit the robustness of these feature representations. Therefore, we propose an intra- and inter-correlation learning framework for chest X-ray anomaly detection. Firstly, to better leverage the similar anatomical structure information in chest X-rays, we introduce the Anatomical-Feature Pyramid Fusion Module for feature fusion. This module aims to obtain fusion features with both local details and global contextual information. These fusion features are initialized by a trainable feature mapper and stored in a feature bank to serve as centers for learning. Furthermore, to handle the Facing Differences of Similarity (FDS) problem introduced by the pre-trained network, we propose an intra- and inter-correlation learning strategy: (1) We use intra-correlation learning to establish intra-correlation between mapped features of individual images and semantic centers, thereby initially discovering lesions; (2) We employ inter-correlation learning to establish inter-correlation between mapped features of different images, further mitigating the differences of similarity introduced by the pre-trained network and achieving effective detection results even in diverse chest disease environments. Finally, a comparison with 18 state-of-the-art methods on three datasets demonstrates the superiority and effectiveness of the proposed method across various scenarios.

AU Song, Haofei Mao, Xintian Yu, Jing Li, Qingli Wang, Yan

I3Net: Inter-Intra-Slice Interpolation Network for Medical Slice Synthesis

Medical imaging is limited by acquisition time and scanning equipment. CT and MR volumes, reconstructed with thicker slices, are anisotropic, with high in-plane resolution and low through-plane resolution. We reveal an intriguing phenomenon that, due to this nature of the data, performing slice-wise interpolation from the axial view can yield greater benefits than performing super-resolution from other views. Based on this observation, we propose an Inter-Intra-slice Interpolation Network (I3Net), which fully explores information from the high in-plane resolution and compensates for the low through-plane resolution. The through-plane branch supplements the limited information contained in the low through-plane resolution from the high in-plane resolution and enables continual and diverse feature learning. The in-plane branch transforms features to the frequency domain and enforces an equal learning opportunity for all frequency bands in a global context learning paradigm. We further propose a cross-view block to take advantage of the information from all three views online. Extensive experiments on two public datasets demonstrate the effectiveness of I3Net, which noticeably outperforms state-of-the-art super-resolution, video frame interpolation and slice interpolation methods by a large margin. We achieve 43.90 dB in PSNR, with at least a 1.14 dB improvement under an upscale factor of x2 on the MSD dataset, with faster inference. Code is available at https://github.com/DeepMedLab-ECNU/Medical-Image-Reconstruction.

AU Miller, David A. Grannonico, Marta Liu, Mingna Savier, Elise McHaney, Kara Erisir, Alev Netland, Peter A. Cang, Jianhua Liu, Xiaorong Zhang, Hao F.

Visible-Light Optical Coherence Tomography Fibergraphy of the Tree Shrew Retinal Ganglion Cell Axon Bundles

We seek to develop techniques for high-resolution imaging of the tree shrew retina for visualizing and parameterizing retinal ganglion cell (RGC) axon bundles in vivo. We applied visible-light optical coherence tomography fibergraphy (vis-OCTF) and temporal speckle averaging (TSA) to visualize individual RGC axon bundles in the tree shrew retina. For the first time, we quantified individual RGC bundle width, height, and cross-sectional area and applied vis-OCT angiography (vis-OCTA) to visualize the retinal microvasculature in tree shrews. Throughout the retina, as the distance from the optic nerve head (ONH) increased from 0.5 mm to 2.5 mm, bundle width increased by 30%, height decreased by 67%, and cross-sectional area decreased by 36%. We also showed that axon bundles become vertically elongated as they converge toward the ONH. Ex vivo confocal microscopy of retinal flat-mounts immunostained with Tuj1 confirmed our in vivo vis-OCTF findings.

AU Zhang, Yuhan Ma, Xiao Huang, Kun Li, Mingchao Heng, Pheng-Ann

Semantic-Oriented Visual Prompt Learning for Diabetic Retinopathy Grading on Fundus Images

Diabetic retinopathy (DR) is a serious ocular condition that requires effective monitoring and treatment by ophthalmologists. However, constructing a reliable DR grading model remains a challenging and costly task, heavily reliant on high-quality training sets and adequate hardware resources. In this paper, we investigate the knowledge transferability of large-scale pre-trained models (LPMs) to fundus images based on prompt learning to construct a DR grading model efficiently. Unlike full-tuning which fine-tunes all parameters of LPMs, prompt learning only involves a minimal number of additional learnable parameters while achieving a competitive effect as full-tuning. Inspired by visual prompt tuning, we propose Semantic-oriented Visual Prompt Learning (SVPL) to enhance the semantic perception ability for better extracting task-specific knowledge from LPMs, without any additional annotations. Specifically, SVPL assigns a group of learnable prompts for each DR level to fit the complex pathological manifestations and then aligns each prompt group to task-specific semantic space via a contrastive group alignment (CGA) module. We also propose a plug-and-play adapter module, Hierarchical Semantic Delivery (HSD), which allows the semantic transition of prompt groups from shallow to deep layers to facilitate efficient knowledge mining and model convergence. Our extensive experiments on three public DR grading datasets demonstrate that SVPL achieves superior results compared to other transfer tuning and DR grading methods. Further analysis suggests that the generalized knowledge from LPMs is advantageous for constructing the DR grading model on fundus images.

AU Liu, Aohan Guo, Yuchen Yong, Jun-Hai Xu, Feng

Multi-Grained Radiology Report Generation With Sentence-Level Image-Language Contrastive Learning

The automatic generation of accurate radiology reports is of great clinical importance and has drawn growing research interest. However, it is still a challenging task due to the imbalance between normal and abnormal descriptions and the multi-sentence and multi-topic nature of radiology reports. These features result in significant challenges to generating accurate descriptions for medical images, especially the important abnormal findings. Previous methods to tackle these problems rely heavily on extra manual annotations, which are expensive to acquire. We propose a multi-grained report generation framework incorporating sentence-level image-sentence contrastive learning, which does not require any extra labeling but effectively learns knowledge from the image-report pairs. We first introduce contrastive learning as an auxiliary task for image feature learning. Different from previous contrastive methods, we exploit the multi-topic nature of imaging reports and perform fine-grained contrastive learning by extracting sentence topics and contents and contrasting between sentence contents and refined image contents guided by sentence topics. This forces the model to learn distinct abnormal image features for each specific topic. During generation, we use two decoders to first generate coarse sentence topics and then the fine-grained text of each sentence. We directly supervise the intermediate topics using sentence topics learned by our contrastive objective. This strengthens the generation constraint and enables independent fine-tuning of the decoders using reinforcement learning, which further boosts model performance. Experiments on two large-scale datasets MIMIC-CXR and IU-Xray demonstrate that our approach outperforms existing state-of-the-art methods, evaluated by both language generation metrics and clinical accuracy.
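
As an illustration of the sentence-level contrastive objective, the sketch below applies a symmetric InfoNCE loss between matched sentence-content embeddings and topic-refined image-content embeddings across a batch; the symmetric form, temperature, and shapes are assumptions for exposition rather than the authors' exact loss.

```python
import torch
import torch.nn.functional as F

def sentence_image_contrastive(sent_emb, img_emb, tau=0.1):
    """Symmetric InfoNCE: row i of sent_emb and img_emb form a matched
    pair (a sentence's content and the image content refined under that
    sentence's topic); all other rows in the batch act as negatives."""
    s = F.normalize(sent_emb, dim=-1)            # (N, D)
    v = F.normalize(img_emb, dim=-1)             # (N, D)
    logits = s @ v.t() / tau                     # (N, N) similarity matrix
    targets = torch.arange(s.size(0), device=s.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

loss = sentence_image_contrastive(torch.randn(8, 512), torch.randn(8, 512))
```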

AU Shi, Yongyi Xia, Wenjun Wang, Ge Mou, Xuanqin

Blind CT Image Quality Assessment Using DDPM-derived Content and Transformer-based Evaluator.

Lowering radiation dose per view and utilizing sparse views per scan are two common CT scan modes, albeit often leading to distorted images characterized by noise and streak artifacts. Blind image quality assessment (BIQA) strives to evaluate perceptual quality in alignment with what radiologists perceive, which plays an important role in advancing low-dose CT reconstruction techniques. An intriguing direction involves developing BIQA methods that mimic the operational characteristic of the human visual system (HVS). The internal generative mechanism (IGM) theory reveals that the HVS actively deduces primary content to enhance comprehension. In this study, we introduce an innovative BIQA metric that emulates the active inference process of IGM. Initially, an active inference module, implemented as a denoising diffusion probabilistic model (DDPM), is constructed to anticipate the primary content. Then, the dissimilarity map is derived by assessing the interrelation between the distorted image and its primary content. Subsequently, the distorted image and dissimilarity map are combined into a multi-channel image, which is inputted into a transformer-based image quality evaluator. By leveraging the DDPM-derived primary content, our approach achieves competitive performance on a low-dose CT dataset.
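
A minimal sketch of how the evaluator input can be assembled: given a distorted slice and the DDPM-anticipated primary content, derive a dissimilarity map and stack it with the distorted image as channels. Using the absolute difference as the dissimilarity measure is an assumption; the paper's interrelation measure may differ.

```python
import torch

def build_evaluator_input(distorted, primary):
    """Stack a distorted CT slice with a simple dissimilarity map computed
    against its DDPM-predicted primary content.

    distorted, primary: (B, 1, H, W) tensors. Returns (B, 2, H, W), ready
    for a transformer-based quality evaluator."""
    dissimilarity = (distorted - primary).abs()
    return torch.cat([distorted, dissimilarity], dim=1)

x = build_evaluator_input(torch.rand(2, 1, 256, 256), torch.rand(2, 1, 256, 256))
```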

AU Xia, Jinqiu Zhou, Yiwen Deng, Wenxin Kang, Jing Wu, Wangjiang Qi, Mengke Zhou, Linghong Ma, Jianhui Xu, Yuan

PND-Net: Physics-Inspired Non-Local Dual-Domain Network for Metal Artifact Reduction

Metal artifacts caused by the presence of metallic implants tremendously degrade the quality of reconstructed computed tomography (CT) images and therefore affect the clinical diagnosis or reduce the accuracy of organ delineation and dose calculation in radiotherapy. Although various deep learning methods have been proposed for metal artifact reduction (MAR), most of them aim to restore the corrupted sinogram within the metal trace, which removes beam hardening artifacts but ignores other components of metal artifacts. In this paper, based on the physical property of metal artifacts which is verified via Monte Carlo (MC) simulation, we propose a novel physics-inspired non-local dual-domain network (PND-Net) for MAR in CT imaging. Specifically, we design a novel non-local sinogram decomposition network (NSD-Net) to acquire the weighted artifact component and develop an image restoration network (IR-Net) to reduce the residual and secondary artifacts in the image domain. To facilitate the generalization and robustness of our method on clinical CT images, we employ a trainable fusion network (F-Net) in the artifact synthesis path to achieve unpaired learning. Furthermore, we design an internal consistency loss to ensure the data fidelity of anatomical structures in the image domain and introduce the linear interpolation sinogram as prior knowledge to guide sinogram decomposition. NSD-Net, IR-Net, and F-Net are jointly trained so that they can benefit from one another. Extensive experiments on simulation and clinical data demonstrate that our method outperforms state-of-the-art MAR methods.

AU Lin, Jingyin Xie, Wende Kang, Li Wu, Huisi

Dynamic-guided Spatiotemporal Attention for Echocardiography Video Segmentation.

Left ventricle (LV) endocardium segmentation in echocardiography video has received much attention as an important step in quantifying LV ejection fraction. Most existing methods are dedicated to exploiting temporal information on top of 2D convolutional networks. In addition to single appearance semantic learning, some research attempted to introduce motion cues through the optical flow estimation (OFE) task to enhance temporal consistency modeling. However, OFE in these methods is tightly coupled to LV endocardium segmentation, resulting in noisy inter-frame flow prediction, and post-optimization based on these flows accumulates errors. To address these drawbacks, we propose dynamic-guided spatiotemporal attention (DSA) for semi-supervised echocardiography video segmentation. We first fine-tune the off-the-shelf OFE network RAFT on echocardiography data to provide dynamic information. Taking inter-frame flows as additional input, we use a dual-encoder structure to extract motion and appearance features separately. Based on the connection between dynamic continuity and semantic consistency, we propose a bilateral feature calibration module to enhance both features. For temporal consistency modeling, DSA aggregates neighboring-frame context using deformable attention realized by offset-grid attention. Dynamic information is introduced into DSA through a bilateral offset estimation module to combine effectively with appearance semantics and predict attention offsets, thereby guiding semantic-based spatiotemporal attention. We evaluated our method on two popular echocardiography datasets, CAMUS and EchoNet-Dynamic, and achieved state-of-the-art performance.
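
The deformable-attention step can be sketched generically: per-location sampling offsets are predicted from the query frame, neighboring-frame features are read at the offset positions with grid_sample, and the samples are merged with learned weights. This is a generic offset-based deformable attention under assumed shapes, not the paper's exact DSA module, which additionally conditions the offsets on inter-frame flow.

```python
import torch
import torch.nn.functional as F

class DeformableFrameAttention(torch.nn.Module):
    """Each spatial location samples K offset points from a neighboring
    frame's feature map and aggregates them with learned weights."""
    def __init__(self, dim, k=4):
        super().__init__()
        self.k = k
        self.offset = torch.nn.Conv2d(dim, 2 * k, 1)  # (dx, dy) per point
        self.weight = torch.nn.Conv2d(dim, k, 1)      # aggregation weights

    def forward(self, query, neighbor):               # both (B, C, H, W)
        B, C, H, W = query.shape
        off = self.offset(query).view(B, self.k, 2, H, W)
        w = self.weight(query).softmax(dim=1)         # (B, K, H, W)
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, H, device=query.device),
            torch.linspace(-1, 1, W, device=query.device), indexing="ij")
        base = torch.stack((xs, ys), dim=-1)          # (H, W, 2), [-1, 1] coords
        out = 0
        for i in range(self.k):
            grid = base + off[:, i].permute(0, 2, 3, 1)   # (B, H, W, 2)
            sampled = F.grid_sample(neighbor, grid, align_corners=True)
            out = out + w[:, i:i + 1] * sampled
        return out
```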

AU Mandot, Shubham Zannoni, Elena M. Cai, Ling Nie, Xingchen La Riviere, Patrick J. Wilson, Matthew D. Meng, Ling Jian

A High-Sensitivity Benchtop X-Ray Fluorescence Emission Tomography (XFET) System With a Full-Ring of X-Ray Imaging-Spectrometers and a Compound-Eye Collimation Aperture

The advent of metal-based drugs and metal nanoparticles as therapeutic agents in anti-tumor treatment has motivated the advancement of X-ray fluorescence computed tomography (XFCT) techniques. An XFCT imaging modality can detect, quantify, and image the biodistribution of metal elements using the X-ray fluorescence signal emitted upon X-ray irradiation. However, the majority of XFCT imaging systems and instrumentation developed so far rely on a single or a small number of detectors. This work introduces the first full-ring benchtop X-ray fluorescence emission tomography (XFET) system equipped with 24 solid-state detectors arranged in a hexagonal geometry and a 96-pinhole compound-eye collimator. We experimentally demonstrate the system's sensitivity and its capability of multi-element detection and quantification by performing imaging studies on an animal-sized phantom. In our preliminary studies, the phantom was irradiated with a pencil beam of X-rays produced using a low-powered polychromatic X-ray source (90 kVp and 60 W max power). This investigation shows a significant enhancement in the detection limit of gadolinium, down to a concentration as low as 0.1 mg/mL. The results also illustrate the unique capabilities of the XFET system to simultaneously determine the spatial distribution and accurately quantify the concentrations of multiple metal elements.

AU Pang, Yan Liang, Jiaming Huang, Teng Chen, Hao Li, Yunhao Li, Dan Huang, Lin Wang, Qiong

Slim UNETR: Scale Hybrid Transformers to Efficient 3D Medical Image Segmentation Under Limited Computational Resources

Hybrid transformer-based segmentation approaches have shown great promise in medical image analysis. However, they typically require considerable computational power and resources during both training and inference stages, posing a challenge for resource-limited medical applications common in the field. To address this issue, we present an innovative framework called Slim UNETR, designed to achieve a balance between accuracy and efficiency by leveraging the advantages of both convolutional neural networks and transformers. Our method features the Slim UNETR Block as a core component, which effectively enables information exchange through self-attention mechanism decomposition and cost-effective representation aggregation. Additionally, we utilize the throughput metric as an efficiency indicator to provide feedback on model resource consumption. Our experiments demonstrate that Slim UNETR outperforms state-of-the-art models in terms of accuracy, model size, and efficiency when deployed on resource-constrained devices. Remarkably, Slim UNETR achieves 92.44% dice accuracy on BraTS2021 while being 34.6x smaller and 13.4x faster during inference compared to Swin UNETR.
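
Since throughput serves as the efficiency indicator, it can be measured with a loop like the one below; the BraTS-like input shape, warm-up count, and iteration count are illustrative choices.

```python
import time
import torch

@torch.no_grad()
def throughput(model, input_shape=(1, 4, 128, 128, 128), n_iters=30, device="cuda"):
    """Inference throughput in volumes per second."""
    model = model.eval().to(device)
    x = torch.randn(*input_shape, device=device)
    for _ in range(5):                        # warm-up passes
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.time()
    for _ in range(n_iters):
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()
    return n_iters * input_shape[0] / (time.time() - start)
```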

AU Park, Mi-Ae Zaha, Vlad G. Badawi, Ramsey D. Bowen, Spencer L.

Supplemental Transmission Aided Attenuation Correction for Quantitative Cardiac PET

Quantitative PET attenuation correction (AC) for cardiac PET/CT and PET/MR is a challenging problem. We propose and evaluate an AC approach that uses coincidences from a relatively weak and physically fixed sparse external source, in combination with those from the patient, to reconstruct $\mu$-maps based on physics principles alone. The low 30 cm$^3$ volume of the source makes it easy to fill and place, and the method does not use prior image data or attenuation map assumptions. Our supplemental transmission aided maximum likelihood reconstruction of attenuation and activity (sTX-MLAA) algorithm contains an attenuation map update that maximizes the likelihood of terms representing coincidences originating from tracer in the patient and a weighted expression of counts segmented from the external source alone. Both external source and patient scatter and randoms are fully corrected. We evaluated the performance of sTX-MLAA compared to reference standard CT-based AC with FDG PET/CT phantom studies, including modeling a patient with myocardial inflammation. Through an ROI analysis we measured <= 5% bias in activity concentrations for PET images generated with sTX-MLAA and a TX source strength >= 12.7 MBq, relative to CT-AC. PET background variability (from noise and sparse sampling) was substantially reduced with sTX-MLAA compared to using counts segmented from the transmission source alone for AC. Results suggest that sTX-MLAA will enable quantitative PET during cardiac PET/CT and PET/MR of human patients.

AU Zhang, Yi Li, Jiayue Li, Xinyang Xie, Min Islam, Md. Tauhidul Zhang, Haixian

FAOT-Net: A 1.5-Stage Framework for 3D Pelvic Lymph Node Detection With Online Candidate Tuning

Accurate and automatic detection of pelvic lymph nodes in computed tomography (CT) scans is critical for diagnosing lymph node metastasis in colorectal cancer, which in turn plays a crucial role in staging, treatment planning, surgical guidance, and postoperative follow-up. However, achieving high detection sensitivity and specificity poses a challenge due to the small and variable sizes of these nodes, as well as the presence of numerous similar signals within the complex pelvic CT image. To tackle these issues, we propose a 3D feature-aware online-tuning network (FAOT-Net) that introduces a novel 1.5-stage structure to seamlessly integrate detection and refinement via our online candidate tuning process and takes advantage of multi-level information through the tailored feature flow. Furthermore, we redesign the anchor fitting and anchor matching strategies to further improve detection performance in a nearly hyperparameter-free manner. Our framework achieves a FROC score of 52.8 and a sensitivity of 91.7% with 16 false positives per scan on the PLNDataset.

AU Bian, Chenyuan Xia, Nan Xie, Anmu Cong, Shan Dong, Qian

Adversarially Trained Persistent Homology Based Graph Convolutional Network for Disease Identification Using Brain Connectivity

Brain disease propagation is associated with characteristic alterations in the structural and functional connectivity networks of the brain. To identify disease-specific network representations, graph convolutional networks (GCNs) have been used because of their powerful graph embedding ability to characterize the non-Euclidean structure of brain networks. However, existing GCNs generally focus on learning discriminative region of interest (ROI) features, often ignoring important topological information that enables the integration of connectome patterns of brain activity. In addition, most methods fail to consider the vulnerability of GCNs to perturbations in network properties of the brain, which considerably degrades the reliability of diagnosis results. In this study, we propose an adversarially trained persistent homology-based graph convolutional network (ATPGCN) to capture disease-specific brain connectome patterns and classify brain diseases. First, the brain functional/structural connectivity is constructed using different neuroimaging modalities. Then, we develop a novel strategy that concatenates the persistent homology features from a brain algebraic topology analysis with readout features of the global pooling layer of a GCN model to collaboratively learn the individual-level representation. Finally, we simulate adversarial perturbations by targeting risk ROIs identified from clinical priors, and incorporate them into the training loop to evaluate the robustness of the model. The experimental results on three independent datasets demonstrate that ATPGCN outperforms existing classification methods in disease identification and is robust to minor perturbations in network architecture. Our code is available at https://github.com/CYB08/ATPGCN.
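
The concatenation strategy itself is simple to sketch: vectorized persistent-homology descriptors, computed offline from the brain network, are joined with the GCN's global-pooling readout before classification. The shapes and the linear classifier below are assumptions.

```python
import torch

def fuse_ph_with_readout(gcn_readout, ph_features, classifier):
    """Concatenate graph-level GCN readout features (B, D1) with
    persistent-homology descriptors (B, D2) and classify the result."""
    joint = torch.cat([gcn_readout, ph_features], dim=1)
    return classifier(joint)

clf = torch.nn.Linear(64 + 16, 2)                     # hypothetical sizes
logits = fuse_ph_with_readout(torch.randn(8, 64), torch.randn(8, 16), clf)
```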

AU Ju, Lie Yu, Zhen Wang, Lin Zhao, Xin Wang, Xin Bonnington, Paul Ge, Zongyuan

Hierarchical Knowledge Guided Learning for Real-World Retinal Disease Recognition

In the real world, medical datasets often exhibit a long-tailed data distribution (i.e., a few classes occupy the majority of the data, while most classes have only a limited number of samples), which results in a challenging long-tailed learning scenario. Some recently published datasets in ophthalmology AI consist of more than 40 kinds of retinal diseases with complex abnormalities and variable morbidity. Nevertheless, more than 30 conditions are rarely seen in global patient cohorts. From a modeling perspective, most deep learning models trained on these datasets may lack the ability to generalize to rare diseases where only a few available samples are presented for training. In addition, a single retina may present more than one disease, resulting in a challenging label co-occurrence scenario, also known as multi-label, which can cause problems when some re-sampling strategies are applied during training. To address the above two major challenges, this paper presents a novel method that enables the deep neural network to learn from a long-tailed fundus database for various retinal disease recognition. Firstly, we exploit the prior knowledge in ophthalmology to improve the feature representation using a hierarchy-aware pre-training. Secondly, we adopt an instance-wise class-balanced sampling strategy to address the label co-occurrence issue under the long-tailed medical dataset scenario. Thirdly, we introduce a novel hybrid knowledge distillation to train a less biased representation and classifier. We conducted extensive experiments on four databases, including two public datasets and two in-house databases with more than one million fundus images. The experimental results demonstrate the superiority of our proposed methods, with recognition accuracy outperforming that of state-of-the-art competitors, especially for these rare diseases.
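
One plausible form of instance-wise class-balanced sampling in the multi-label setting is sketched below: each image is weighted by the mean inverse frequency of its positive classes, so images carrying rare diseases are drawn more often without the conflicts of naive per-label re-sampling. This is an illustrative scheme, not necessarily the paper's exact formulation.

```python
import torch
from torch.utils.data import WeightedRandomSampler

def instance_weights(labels):
    """labels: (N, C) multi-hot matrix. Returns one sampling weight per
    instance: the mean inverse frequency of its positive classes."""
    freq = labels.sum(dim=0).clamp(min=1)     # (C,) per-class counts
    inv = 1.0 / freq                          # rare classes get large weight
    pos = labels.sum(dim=1).clamp(min=1)      # positives per instance
    return (labels * inv).sum(dim=1) / pos    # (N,)

labels = torch.randint(0, 2, (1000, 40)).float()
sampler = WeightedRandomSampler(instance_weights(labels), num_samples=1000)
```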

AU Lagogiannis, Ioannis Meissen, Felix Kaissis, Georgios Rueckert, Daniel

Unsupervised Pathology Detection: A Deep Dive Into the State of the Art

Deep unsupervised approaches are gathering increased attention for applications such as pathology detection and segmentation in medical images since they promise to alleviate the need for large labeled datasets and are more generalizable than their supervised counterparts in detecting any kind of rare pathology. As the Unsupervised Anomaly Detection (UAD) literature continuously grows and new paradigms emerge, it is vital to continuously evaluate and benchmark new methods in a common framework, in order to reassess the state-of-the-art (SOTA) and identify promising research directions. To this end, we evaluate a diverse selection of cutting-edge UAD methods on multiple medical datasets, comparing them against the established SOTA in UAD for brain MRI. Our experiments demonstrate that newly developed feature-modeling methods from the industrial and medical literature achieve increased performance compared to previous work and set the new SOTA in a variety of modalities and datasets. Additionally, we show that such methods are capable of benefiting from recently developed self-supervised pre-training algorithms, further increasing their performance. Finally, we perform a series of experiments in order to gain further insights into some unique characteristics of selected models and datasets. Our code can be found under https://github.com/iolag/UPD_study/.

AU Li, Jiajia Zhang, Pingping Wang, Teng Zhu, Lei Liu, Ruhan Yang, Xia Wang, Kaixuan Shen, Dinggang Sheng, Bin

DSMT-Net: Dual Self-Supervised Multi-Operator Transformation for Multi-Source Endoscopic Ultrasound Diagnosis

Pancreatic cancer has the worst prognosis of all cancers. The clinical application of endoscopic ultrasound (EUS) for the assessment of pancreatic cancer risk, and of deep learning for the classification of EUS images, has been hindered by inter-grader variability and limited labeling capability. One of the key reasons for these difficulties is that EUS images are obtained from multiple sources with varying resolutions, effective regions, and interference signals, making the distribution of the data highly variable and negatively impacting the performance of deep learning models. Additionally, manual labeling of images is time-consuming and requires significant effort, leading to the desire to effectively utilize a large amount of unlabeled data for network training. To address these challenges, this study proposes the Dual Self-supervised Multi-Operator Transformation Network (DSMT-Net) for multi-source EUS diagnosis. The DSMT-Net includes a multi-operator transformation approach to standardize the extraction of regions of interest in EUS images and eliminate irrelevant pixels. Furthermore, a transformer-based dual self-supervised network is designed to integrate unlabeled EUS images for pre-training the representation model, which can be transferred to supervised tasks such as classification, detection, and segmentation. A large-scale EUS-based pancreas image dataset (LEPset) has been collected, including 3,500 pathologically proven labeled EUS images (from pancreatic and non-pancreatic cancers) and 8,000 unlabeled EUS images for model development. The self-supervised method has also been applied to breast cancer diagnosis and was compared to state-of-the-art deep learning models on both datasets. The results demonstrate that the DSMT-Net significantly improves the accuracy of pancreatic and breast cancer diagnosis.

AU Wang, Chong Chen, Yuanhong Liu, Fengbei Elliott, Michael Kwok, Chun Fung Pena-Solorzano, Carlos Frazer, Helen McCarthy, Davis James Carneiro, Gustavo

An Interpretable and Accurate Deep-Learning Diagnosis Framework Modeled With Fully and Semi-Supervised Reciprocal Learning

The deployment of automated deep-learning classifiers in clinical practice has the potential to streamline the diagnosis process and improve the diagnosis accuracy, but the acceptance of those classifiers relies on both their accuracy and interpretability. In general, accurate deep-learning classifiers provide little model interpretability, while interpretable models do not have competitive classification accuracy. In this paper, we introduce a new deep-learning diagnosis framework, called InterNRL, that is designed to be highly accurate and interpretable. InterNRL consists of a student-teacher framework, where the student model is an interpretable prototype-based classifier (ProtoPNet) and the teacher is an accurate global image classifier (GlobalNet). The two classifiers are mutually optimised with a novel reciprocal learning paradigm in which the student ProtoPNet learns from optimal pseudo labels produced by the teacher GlobalNet, while GlobalNet learns from ProtoPNet's classification performance and pseudo labels. This reciprocal learning paradigm enables InterNRL to be flexibly optimised under both fully- and semi-supervised learning scenarios, reaching state-of-the-art classification performance in both scenarios for the tasks of breast cancer and retinal disease diagnosis. Moreover, relying on weakly-labelled training images, InterNRL also achieves superior breast cancer localisation and brain tumour segmentation results than other competing methods.
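
At its core, the reciprocal paradigm can be caricatured as two cross-distillation updates, as in the hedged sketch below; proto_net and global_net stand in for ProtoPNet and GlobalNet, and the paper's pseudo-label selection and weighting are considerably more elaborate.

```python
import torch.nn.functional as F

def reciprocal_step(proto_net, global_net, images, labels=None):
    """One sketched reciprocal update: each model is softly supervised by
    the other's detached predictions; ground-truth labels are added when
    available (fully-supervised case)."""
    s_logits = proto_net(images)              # student: interpretable model
    t_logits = global_net(images)             # teacher: global classifier
    s_loss = F.kl_div(F.log_softmax(s_logits, -1),
                      F.softmax(t_logits.detach(), -1), reduction="batchmean")
    t_loss = F.kl_div(F.log_softmax(t_logits, -1),
                      F.softmax(s_logits.detach(), -1), reduction="batchmean")
    if labels is not None:
        s_loss = s_loss + F.cross_entropy(s_logits, labels)
        t_loss = t_loss + F.cross_entropy(t_logits, labels)
    return s_loss, t_loss
```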

AU Zhi, Shaohua Wang, Yinghui Xiao, Haonan Bai, Ti Li, Bing Tang, Yunsong Liu, Chenyang Li, Wen Li, Tian Ge, Hong Cai, Jing

Coarse-Super-Resolution-Fine Network (CoSF-Net): A Unified End-to-End Neural Network for 4D-MRI With Simultaneous Motion Estimation and Super-Resolution

Four-dimensional magnetic resonance imaging (4D-MRI) is an emerging technique for tumor motion management in image-guided radiation therapy (IGRT). However, current 4D-MRI suffers from low spatial resolution and strong motion artifacts owing to the long acquisition time and patients' respiratory variations. If not managed properly, these limitations can adversely affect treatment planning and delivery in IGRT. In this study, we developed a novel deep learning framework called the coarse-super-resolution-fine network (CoSF-Net) to achieve simultaneous motion estimation and super-resolution within a unified model. We designed CoSF-Net by fully excavating the inherent properties of 4D-MRI, with consideration of limited and imperfectly matched training datasets. We conducted extensive experiments on multiple real patient datasets to assess the feasibility and robustness of the developed network. Compared with existing networks and three state-of-the-art conventional algorithms, CoSF-Net not only accurately estimated the deformable vector fields between the respiratory phases of 4D-MRI but also simultaneously improved the spatial resolution of 4D-MRI, enhancing anatomical features and producing 4D-MR images with high spatiotemporal resolution.

AU Zhang, Qixiang Li, Yi Xue, Cheng Wang, Haonan Li, Xiaomeng

GlandSAM: Injecting Morphology Knowledge into Segment Anything Model for Label-free Gland Segmentation.

This paper presents a label-free gland segmentation method, GlandSAM, which achieves comparable performance with supervised methods while requiring no labels during its training or inference phase. We observe that the Segment Anything model produces sub-optimal results on gland datasets: due to the complex morphology of glands and the lack of sufficient labels, it either over-segments a gland into many fragments or under-segments gland regions by confusing many of them with the background. To address this challenge, our GlandSAM innovatively injects two clues about gland morphology into SAM to guide the segmentation process: (1) heterogeneity within glands and (2) similarity with the background. Initially, we leverage the clues to decompose the intricate glands by selectively extracting a proposal for each gland sub-region of heterogeneous appearance. Then, we inject the morphology clues into SAM in a fine-tuning manner with a novel morphology-aware semantic grouping module that explicitly groups the high-level semantics of gland sub-regions. In this way, our GlandSAM can capture comprehensive knowledge about gland morphology and produce well-delineated and complete segmentation results. Extensive experiments conducted on the GlaS dataset and the CRAG dataset reveal that GlandSAM outperforms state-of-the-art label-free methods by a significant margin. Notably, our GlandSAM even surpasses several fully-supervised methods that require pixel-wise labels for training, which highlights the remarkable performance and potential of GlandSAM in the realm of gland segmentation.

AU Huang, Yanyan Zhao, Weiqin Fu, Yu Zhu, Lingting Yu, Lequan

Unleash the Power of State Space Model for Whole Slide Image with Local Aware Scanning and Importance Resampling.

Whole slide image (WSI) analysis is gaining prominence within the medical imaging field. However, previous methods often fall short of efficiently processing entire WSIs due to their gigapixel size. Inspired by recent developments in state space models, this paper introduces a new Pathology Mamba (PAM) for more accurate and robust WSI analysis. PAM includes three carefully designed components to tackle the challenges of enormous image size, the utilization of local and hierarchical information, and the mismatch between the feature distributions of training and testing during WSI analysis. Specifically, we design a Bi-directional Mamba Encoder to process the extensive patches present in WSIs effectively and efficiently, which can handle large-scale pathological images while achieving high performance and accuracy. To further harness the local information and inherent hierarchical structure of WSI, we introduce a novel Local-aware Scanning module, which employs a local-aware mechanism alongside hierarchical scanning to adeptly capture both the local information and the overarching structure within WSIs. Moreover, to alleviate the patch feature distribution misalignment between training and testing, we propose a Test-time Importance Resampling module to conduct testing patch resampling to ensure consistency of feature distribution between the training and testing phases, and thus enhance model prediction. Extensive evaluation on nine WSI datasets with cancer subtyping and survival prediction tasks demonstrates that PAM outperforms current state-of-the-art methods and highlights its enhanced capability in modeling discriminative areas within WSIs. The source code is available at https://github.com/HKU-MedAI/PAM.

AU Thies, Mareike Wagner, Fabian Maul, Noah Yu, Haijun Meier, Manuela Goldmann Schneider, Linda-Sophie Gu, Mingxuan Mei, Siyuan Folle, Lukas Preuhs, Alexander Manhart, Michael Maier, Andreas

A gradient-based approach to fast and accurate head motion compensation in cone-beam CT.

Cone-beam computed tomography (CBCT) systems, with their flexibility, present a promising avenue for direct point-of-care medical imaging, particularly in critical scenarios such as acute stroke assessment. However, the integration of CBCT into clinical workflows faces challenges, primarily linked to long scan duration resulting in patient motion during scanning and leading to image quality degradation in the reconstructed volumes. This paper introduces a novel approach to CBCT motion estimation using a gradient-based optimization algorithm, which leverages generalized derivatives of the backprojection operator for cone-beam CT geometries. Building on that, a fully differentiable target function is formulated which grades the quality of the current motion estimate in reconstruction space. We drastically accelerate motion estimation yielding a 19-fold speed-up compared to existing methods. Additionally, we investigate the architecture of networks used for quality metric regression and propose predicting voxel-wise quality maps, favoring autoencoder-like architectures over contracting ones. This modification improves gradient flow, leading to more accurate motion estimation. The presented method is evaluated through realistic experiments on head anatomy. It achieves a reduction in reprojection error from an initial average of 3 mm to 0.61 mm after motion compensation and consistently demonstrates superior performance compared to existing approaches. The analytic Jacobian for the backprojection operation, which is at the core of the proposed method, is made publicly available. In summary, this paper contributes to the advancement of CBCT integration into clinical workflows by proposing a robust motion estimation approach that enhances efficiency and accuracy, addressing critical challenges in time-sensitive scenarios.
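
The central mechanism, pushing image-quality gradients back to per-view motion parameters through a differentiable reconstruction, can be sketched as follows; reconstruct and quality_net are stand-ins for the paper's differentiable backprojection and voxel-wise quality regressor, and the optimizer settings are illustrative.

```python
import torch

def estimate_motion(motion_init, quality_net, reconstruct, projections, steps=100):
    """Gradient-based motion estimation: minimize the predicted voxel-wise
    artifact strength of the reconstruction w.r.t. motion parameters."""
    motion = motion_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([motion], lr=1e-2)
    for _ in range(steps):
        vol = reconstruct(projections, motion)   # differentiable in motion
        loss = quality_net(vol).mean()           # mean predicted artifact level
        opt.zero_grad()
        loss.backward()
        opt.step()
    return motion.detach()
```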

AU Weng, Li Zhu, Zhoule Dai, Kaixin Zheng, Zhe Zhu, Junming Wu, Hemmings

Reduced-Reference Learning for Target Localization in Deep Brain Stimulation

This work proposes a supervised machine learning method for target localization in deep brain stimulation (DBS). DBS is a recognized treatment for essential tremor. The effects of DBS significantly depend on the precise implantation of electrodes. Recent research on diffusion tensor imaging shows that the optimal target for essential tremor is related to the dentato-rubro-thalamic tract (DRTT), thus DRTT targeting has become a promising direction. The tractography-based targeting is more accurate than conventional ones, but still too complicated for clinical scenarios, where only structural magnetic resonance imaging (sMRI) data is available. In order to improve efficiency and utility, we consider target localization as a non-linear regression problem in a reduced-reference learning framework, and solve it with convolutional neural networks (CNNs). The proposed method is an efficient two-step framework, and consists of two image-based networks: one for classification and the other for localization. We model the basic workflow as an image retrieval process and define relevant performance metrics. Using DRTT as pseudo groundtruths, we show that individualized tractography-based optimal targets can be inferred from sMRI data with high accuracy. For two datasets of $280 \times 220$ / $272 \times 227$ (0.7/0.8 mm slice thickness) sMRI input, our model achieves an average posterior localization error of 2.3/1.2 mm, and a median of 1.7/1.02 mm. The proposed framework is a novel application of reduced-reference learning, and a first attempt to localize DRTT from sMRI. It significantly outperforms existing methods using 3D-CNN, anatomical and DRTT atlas, and may serve as a new baseline for general target localization problems.

AU Xu, Chi Xu, Haozheng Giannarou, Stamatia

Distance Regression Enhanced with Temporal Information Fusion and Adversarial Training for Robot-Assisted Endomicroscopy.

Probe-based confocal laser endomicroscopy (pCLE) has a role in characterising tissue intraoperatively to guide tumour resection during surgery. To capture good quality pCLE data, which is important for diagnosis, the probe-tissue contact needs to be maintained within a working range of micrometre scale. This can be achieved through micro-surgical robotic manipulation, which requires the automatic estimation of the probe-tissue distance. In this paper, we propose a novel deep regression framework composed of the Deep Regression Generative Adversarial Network (DR-GAN) and a Sequence Attention (SA) module. The aim of DR-GAN is to train the network using an enhanced image-based supervision approach. It extends the standard generator by using a well-defined function for image generation, instead of a learnable decoder. Also, DR-GAN uses a novel learnable neural perceptual loss which, for the first time, combines spatial- and frequency-domain features. This effectively suppresses the adverse effects of noise in the pCLE data. To incorporate temporal information, we designed the SA module, a cross-attention module enhanced with Radial Basis Function based encoding (SA-RBF). Furthermore, to train the regression framework, we designed a multi-step training mechanism. During inference, the trained network is used to generate data representations which are fused along time in the SA-RBF module to boost the regression stability. Our proposed network advances SOTA networks by addressing the challenge of excessive noise in the pCLE data and enhancing regression stability. It outperforms SOTA networks applied on the pCLE Regression dataset (PRD) in terms of accuracy, data quality and stability.
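
The radial-basis-function encoding behind SA-RBF can be sketched in a few lines: scalar positions (e.g., normalized frame times) are mapped to soft one-hot codes by Gaussian kernels centered on a fixed grid. The centers and bandwidth below are illustrative choices, not the paper's exact design.

```python
import torch

def rbf_encode(t, centers, gamma=10.0):
    """RBF encoding of scalar positions t (N,) against centers (M,).
    Returns (N, M) codes; gamma controls the kernel width."""
    d = t[:, None] - centers[None, :]
    return torch.exp(-gamma * d ** 2)

codes = rbf_encode(torch.linspace(0, 1, 8), torch.linspace(0, 1, 16))
```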

AU Tan, Zuopeng Zhang, Lihe Lv, Yanan Ma, Yili Lu, Huchuan

GroupMorph: Medical Image Registration via Grouping Network with Contextual Fusion.

Pyramid-based deformation decomposition is a promising registration framework, which gradually decomposes the deformation field into multi-resolution subfields for precise registration. However, most pyramid-based methods directly produce one subfield per resolution level, which does not fully depict the spatial deformation. In this paper, we propose a novel registration model, called GroupMorph. Different from typical pyramid-based methods, we adopt the grouping-combination strategy to predict deformation field at each resolution. Specifically, we perform group-wise correlation calculation to measure the similarities of grouped features. After that, n groups of deformation subfields with different receptive fields are predicted in parallel. By composing these subfields, a deformation field with multi-receptive field ranges is formed, which can effectively identify both large and small deformations. Meanwhile, a contextual fusion module is designed to fuse the contextual features and provide the inter-group information for the field estimator of the next level. By leveraging the inter-group correspondence, the synergy among deformation subfields is enhanced. Extensive experiments on four public datasets demonstrate the effectiveness of GroupMorph. Code is available at https://github.com/TVayne/GroupMorph.

EI 1558-254X DA 2024-05-16 UT MEDLINE:38739510 PM 38739510 ER

AU Ambrosanio, Michele Bevacqua, Martina Teresa Lovetri, Joe Pascazio, Vito Isernia, Tommaso

In-Vivo Electrical Properties Estimation of Biological Tissues by Means of a Multi-Step Microwave Tomography Approach

The accurate quantitative estimation of the electromagnetic properties of tissues can serve important diagnostic and therapeutic medical purposes. Quantitative microwave tomography is an imaging modality that can provide maps of the in-vivo electromagnetic properties of the imaged tissues, i.e. both the permittivity and the electric conductivity. A multi-step microwave tomography approach is proposed for the accurate retrieval of such spatial maps of biological tissues. The underlying idea behind the new imaging approach is to progressively add details to the maps in a step-wise fashion starting from single-frequency qualitative reconstructions. Multi-frequency microwave data is utilized strategically in the final stage. The approach results in improved accuracy of the reconstructions compared to inversion of the data in a single step. As a case study, the proposed workflow was tested on an experimental microwave data set collected for the imaging of the human forearm. The human forearm is a good test case as it contains several soft tissues as well as bone, exhibiting a wide range of values for the electrical properties.

AU Guo, Rui Lin, Zhichao Xin, Jingyu Li, Maokun Yang, Fan Xu, Shenheng Abubakar, Aria

Three Dimensional Microwave Data Inversion in Feature Space for Stroke Imaging

Microwave imaging is a promising method for early diagnosing and monitoring brain strokes. It is portable, non-invasive, and safe to the human body. Conventional techniques solve for unknown electrical properties represented as pixels or voxels, but often result in inadequate structural information and high computational costs. We propose to reconstruct the three dimensional (3D) electrical properties of the human brain in a feature space, where the unknowns are latent codes of a variational autoencoder (VAE). The decoder of the VAE, with prior knowledge of the brain, acts as a module of data inversion. The codes in the feature space are optimized by minimizing the misfit between measured and simulated data. A dataset of 3D heads characterized by permittivity and conductivity is constructed to train the VAE. Numerical examples show that our method increases structural similarity by 14% and speeds up the solution process by over 3 orders of magnitude using only 4.8% number of the unknowns compared to the voxel-based method. This high-resolution imaging of electrical properties leads to more accurate stroke diagnosis and offers new insights into the study of the human brain.
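
Operationally, the feature-space inversion amounts to optimizing the latent codes through the frozen decoder so that simulated scattering data matches the measurements; the sketch below assumes a differentiable forward solver and illustrative optimizer settings.

```python
import torch

def invert_in_feature_space(decoder, simulate, measured, z_dim=128, steps=200):
    """Optimize VAE latent codes z so that the simulated data for the
    decoded 3D electrical properties matches the measurements. `decoder`
    (z -> permittivity/conductivity volume) and `simulate` (differentiable
    forward EM solver) are stand-ins."""
    z = torch.zeros(1, z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=1e-2)
    for _ in range(steps):
        props = decoder(z)
        misfit = (simulate(props) - measured).abs().pow(2).mean()
        opt.zero_grad()
        misfit.backward()
        opt.step()
    return decoder(z).detach()
```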

AU Wu, Weiwen Pan, Jiayi Wang, Yanyang Wang, Shaoyu Zhang, Jianjia

Multi-channel Optimization Generative Model for Stable Ultra-Sparse-View CT Reconstruction.

The score-based generative model (SGM) has risen to prominence in sparse-view CT reconstruction due to its impressive generation capability. The consistency of data is crucial in guiding the reconstruction process in SGM-based reconstruction methods. However, the existing data consistency policy exhibits certain limitations. First, it employs partial data from the reconstructed image of the iterative process for image updates, which introduces secondary artifacts and compromises image quality. Moreover, the updates to the SGM and data consistency are treated as distinct stages, disregarding their interdependent relationship. Additionally, the reference image used to compute gradients in the reconstruction process is derived from intermediate results rather than ground truth. Motivated by the fact that a typical SGM yields distinct outcomes with different random noise inputs, we propose a Multi-channel Optimization Generative Model (MOGM) for stable ultra-sparse-view CT reconstruction by integrating a novel data consistency term into the stochastic differential equation model. Notably, the unique aspect of this data consistency component is its exclusive reliance on original data for effectively confining generation outcomes. Furthermore, we pioneer an inference strategy that traces back from the current iteration result to ground truth, enhancing reconstruction stability through foundational theoretical support. We also establish a multi-channel optimization reconstruction framework, where conventional iterative techniques are employed to seek the reconstruction solution. Quantitative and qualitative assessments on 23-view datasets from numerical simulation, clinical cardiac and sheep's lung studies underscore the superiority of MOGM over alternative methods. Reconstructing from just 10 and 7 views, our method consistently demonstrates exceptional performance.

AU Liu, Jingyu Cui, Weigang Chen, Yipeng Ma, Yulan Dong, Qunxi Cai, Ran Li, Yang Hu, Bin

Deep Fusion of Multi-Template Using Spatio-Temporal Weighted Multi-Hypergraph Convolutional Networks for Brain Disease Analysis

Conventional functional connectivity networks (FCNs) based on resting-state fMRI (rs-fMRI) can only reflect the relationship between pairwise brain regions. Thus, the hyper-connectivity network (HCN) has been widely used to reveal high-order interactions among multiple brain regions. However, existing HCN models are essentially spatial HCNs, which reflect the spatial relevance of multiple brain regions but ignore the temporal correlation among multiple time points. Furthermore, the majority of HCN construction and learning frameworks are limited to using a single template, while multi-template data carries richer information. To address these issues, we first employ multiple templates to parcellate the rs-fMRI into different brain regions. Then, based on the multi-template data, we propose a spatio-temporal weighted HCN (STW-HCN) to capture more comprehensive high-order temporal and spatial properties of brain activity. Next, a novel deep fusion model of multi-template called spatio-temporal weighted multi-hypergraph convolutional network (STW-MHGCN) is proposed to fuse the STW-HCN of multiple templates, extracting the deep interrelation information between different templates. Finally, we evaluate our method on the ADNI-2 and ABIDE-I datasets for mild cognitive impairment (MCI) and autism spectrum disorder (ASD) analysis. Experimental results demonstrate that the proposed method is superior to state-of-the-art approaches in MCI and ASD classification, and that the abnormal spatio-temporal hyper-edges discovered by our method are of significant value for the brain abnormality analysis of MCI and ASD.

AU Razavi, Raha Plonka, Gerlind Rabbani, Hossein

X-Let's Atom Combinations for Modeling and Denoising of OCT Images by Modified Morphological Component Analysis

An improved analysis of Optical Coherence Tomography (OCT) images of the retina is of essential importance for the correct diagnosis of retinal abnormalities. Unfortunately, OCT images suffer from noise arising from different sources. In particular, speckle noise caused by the scattering of light waves strongly degrades the quality of OCT image acquisitions. In this paper, we employ a Modified Morphological Component Analysis (MMCA) to provide a new method that separates the image into components that contain different features such as texture, piecewise smooth parts, and singularities along curves. Each image component is computed as a sparse representation in a suitable dictionary. To create these dictionaries, we use non-data-adaptive multi-scale (X-let) transforms, which have been shown to be well suited to extracting the special OCT image features. In this way, we reach two goals at once. On the one hand, we achieve strongly improved denoising results by applying adaptive local thresholding techniques separately to each image component. The denoising performance outperforms other state-of-the-art denoising algorithms regarding the PSNR as well as no-reference image quality assessments. On the other hand, we obtain a decomposition of the OCT images into well-interpretable image components that can be exploited for further image processing tasks, such as classification.
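
As a single-dictionary stand-in for the component-wise adaptive thresholding used here, the sketch below soft-thresholds the detail bands of one wavelet decomposition with the universal threshold; the actual pipeline combines several X-let dictionaries and locally adaptive thresholds.

```python
import numpy as np
import pywt

def wavelet_soft_denoise(img, wavelet="db4", level=3, sigma=None):
    """Soft-threshold the detail coefficients of a 2D wavelet decomposition
    (one sparse component in one dictionary)."""
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    if sigma is None:   # robust noise estimate from the finest diagonal band
        sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    thr = sigma * np.sqrt(2 * np.log(img.size))       # universal threshold
    denoised = [coeffs[0]] + [
        tuple(pywt.threshold(c, thr, mode="soft") for c in detail)
        for detail in coeffs[1:]
    ]
    return pywt.waverec2(denoised, wavelet)

out = wavelet_soft_denoise(np.random.rand(256, 256))
```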
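A minimal 1D toy of the morphological component separation idea: alternating hard thresholding in two dictionaries (DCT for the smooth part, identity for singularities) with a decreasing threshold. The paper's MMCA operates on 2D OCT images with X-let dictionaries; this sketch only illustrates the alternating-thresholding principle, and all names and constants are ours.

```python
# Toy 1D Morphological Component Analysis: separate a signal into a smooth
# (DCT-sparse) part and a spiky (identity-sparse) part. Schematic only.
import numpy as np
from scipy.fft import dct, idct

rng = np.random.default_rng(1)
n = 256
smooth = np.cos(2 * np.pi * 3 * np.arange(n) / n)      # texture-like component
spikes = np.zeros(n); spikes[rng.choice(n, 5)] = 3.0   # singularities
y = smooth + spikes + 0.05 * rng.standard_normal(n)

parts = np.zeros((2, n))
for thresh in np.linspace(3.0, 0.1, 30):               # decreasing threshold
    for k in range(2):
        resid = y - parts.sum(axis=0) + parts[k]       # residual w.r.t. other part
        if k == 0:                                     # DCT dictionary
            c = dct(resid, norm='ortho')
            c[np.abs(c) < thresh] = 0.0
            parts[k] = idct(c, norm='ortho')
        else:                                          # identity (spike) dictionary
            parts[k] = resid * (np.abs(resid) >= thresh)

print(np.linalg.norm(parts[0] - smooth), np.linalg.norm(parts[1] - spikes))
```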

AU Li, Ziyu Miller, Karla L Chen, Xi Chiew, Mark Wu, Wenchuan

Self-navigated 3D diffusion MRI using an optimized CAIPI sampling and structured low-rank reconstruction estimated navigator.

3D multi-slab acquisitions are an appealing approach for diffusion MRI because they are compatible with the imaging regime delivering optimal SNR efficiency. In conventional 3D multi-slab imaging, shot-to-shot phase variations caused by motion pose challenges due to the use of multi-shot k-space acquisition. Navigator acquisition after each imaging echo is typically employed to correct phase variations, which prolongs scan time and increases the specific absorption rate (SAR). The aim of this study is to develop a highly efficient, self-navigated method to correct for phase variations in 3D multi-slab diffusion MRI without explicitly acquiring navigators. The sampling of each shot is carefully designed to intersect with the central kz=0 plane of each slab, and the multi-shot sampling is optimized for self-navigation performance while retaining decent reconstruction quality. The kz=0 intersections from all shots are jointly used to reconstruct a 2D phase map for each shot using a structured low-rank constrained reconstruction that leverages the redundancy in shot and coil dimensions. The phase maps are used to eliminate the shot-to-shot phase inconsistency in the final 3D multi-shot reconstruction. We demonstrate the method's efficacy using retrospective simulations and prospectively acquired in-vivo experiments at 1.22 mm and 1.09 mm isotropic resolutions. Compared to conventional navigated 3D multi-slab imaging, the proposed self-navigated method achieves comparable image quality while shortening the scan time by 31.7% and improving the SNR efficiency by 15.5%. The proposed method produces comparable quality of DTI and white matter tractography to conventional navigated 3D multi-slab acquisition with a much shorter scan time.
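The shot-to-shot phase problem and its navigator-based correction can be illustrated with a small 2D toy: two interleaved "shots" of the same object pick up different global phases, the shared k-space centre serves as a self-navigator, and each shot is phase-corrected before recombination. This is schematic only (a constant per-shot phase, no structured low-rank reconstruction); names and sizes are ours.

```python
# Toy self-navigated phase correction for multi-shot acquisition. Schematic.
import numpy as np

n = 64
x = np.zeros((n, n)); x[16:48, 20:44] = 1.0                # toy object
F2 = lambda im: np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(im)))
iF2 = lambda ks: np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(ks)))
band = slice(n // 2 - 2, n // 2 + 2)                       # shared k-space centre

shots = []
for s, phi in enumerate([0.0, 1.3]):                       # per-shot phase (rad)
    mask = np.zeros((n, n)); mask[s::2, :] = 1; mask[band, :] = 1
    shots.append((F2(x * np.exp(1j * phi)) * mask, mask))  # corrupted acquisition

combined = np.zeros((n, n), complex); weight = np.zeros((n, n))
for ks, mask in shots:
    nav = np.zeros_like(ks); nav[band, :] = ks[band, :]    # "navigator" lines
    phase = np.angle(iF2(nav))                             # low-res phase estimate
    combined += F2(iF2(ks) * np.exp(-1j * phase)) * mask   # correct, resample
    weight += mask
recon = iF2(combined / np.maximum(weight, 1))
print(np.abs(np.abs(recon) - x).mean())                    # residual of the toy correction
```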

AU Emre, Taha Chakravarty, Arunava Rivail, Antoine Lachinov, Dmitrii Leingang, Oliver Riedl, Sophie Mai, Julia Scholl, Hendrik P. N. Sivaprasad, Sobha Rueckert, Daniel Lotery, Andrew Schmidt-Erfurth, Ursula Bogunovic, Hrvoje CA PINNACLE Consortium

3DTINC: Time-Equivariant Non-Contrastive Learning for Predicting Disease Progression From Longitudinal OCTs

Self-supervised learning (SSL) has emerged as a powerful technique for improving the efficiency and effectiveness of deep learning models. Contrastive methods are a prominent family of SSL that extract similar representations of two augmented views of an image while pushing away others in the representation space as negatives. However, the state-of-the-art contrastive methods require large batch sizes and augmentations designed for natural images that are impractical for 3D medical images. To address these limitations, we propose a new longitudinal SSL method, 3DTINC, based on non-contrastive learning. It is designed to learn perturbation-invariant features for 3D optical coherence tomography (OCT) volumes, using augmentations specifically designed for OCT. We introduce a new non-contrastive similarity loss term that learns temporal information implicitly from intra-patient scans acquired at different times. Our experiments show that this temporal information is crucial for predicting progression of retinal diseases, such as age-related macular degeneration (AMD). After pretraining with 3DTINC, we evaluated the learned representations and the prognostic models on two large-scale longitudinal datasets of retinal OCTs where we predict the conversion to wet-AMD within a six-month interval. Our results demonstrate that each component of our contributions is crucial for learning meaningful representations useful in predicting disease progression from longitudinal volumetric scans.
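A rough sketch of the two loss ingredients the abstract describes: a BYOL-style non-contrastive term between augmented views, plus a toy time-aware term that weights intra-patient scan pairs by their acquisition gap. Both are simplified stand-ins, not the paper's exact losses; weights and names are ours.

```python
# Sketch of a non-contrastive loss plus an implicit temporal term. Illustrative.
import torch
import torch.nn.functional as F

def noncontrastive_loss(p, z):
    """Negative cosine similarity; p = predictor output of view 1,
    z = (stop-gradient) projection of view 2."""
    return -F.cosine_similarity(p, z.detach(), dim=-1).mean()

def temporal_loss(z_t0, z_t1, delta):
    """Toy time-aware term: same-patient scans taken closer in time
    (small delta) are pulled together more strongly."""
    w = torch.exp(-delta)                        # weight decays with time gap
    return (w * (1 - F.cosine_similarity(z_t0, z_t1, dim=-1))).mean()

B, D = 8, 128
p1, z2 = torch.randn(B, D), torch.randn(B, D)
z_t0, z_t1 = torch.randn(B, D), torch.randn(B, D)
delta = torch.rand(B)                            # normalized time gaps
loss = noncontrastive_loss(p1, z2) + 0.5 * temporal_loss(z_t0, z_t1, delta)
print(loss.item())
```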

AU Kong, Youyong Zhang, Xiaotong Wang, Wenhan Zhou, Yue Li, Yueying Yuan, Yonggui

Multi-Scale Spatial-Temporal Attention Networks for Functional Connectome Classification.

Many neuropsychiatric disorders are considered to be associated with abnormalities in the functional connectivity networks of the brain. Research on the classification of functional connectivity can therefore provide new perspectives for understanding the pathology of disorders and contribute to early diagnosis and treatment. Functional connectivity changes dynamically over time; however, the majority of existing methods are unable to jointly reveal its spatial topology and time-varying characteristics. Furthermore, although a few spatial-temporal studies have attempted to capture rich information across different spatial scales, they have not delved into the temporal characteristics among different scales. To address the above issues, we propose novel Multi-Scale Spatial-Temporal Attention Networks (MSSTAN) to exploit the multi-scale spatial-temporal information provided by the functional connectome for classification. To fully extract spatial features of brain regions, we propose a Topology Enhanced Graph Transformer module to guide the attention calculations in the learning of spatial features by incorporating topology priors. A Multi-Scale Pooling Strategy is introduced to obtain representations of the brain connectome at various scales. Considering the temporal dynamics of the dynamic functional connectome, we employ Locality Sensitive Hashing attention to further capture long-term dependencies in the time dynamics across multiple scales and to reduce the computational complexity of the original attention mechanism. Experiments on three brain fMRI datasets of MDD and ASD demonstrate the superiority of our proposed approach. In addition, benefiting from the attention mechanism in the Transformer, our results are interpretable, which can contribute to the discovery of biomarkers. The code is available at https://github.com/LIST-KONG/MSSTAN.
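To make the Locality Sensitive Hashing attention concrete, here is a simplified version: tokens are hashed by the signs of random projections and full attention runs only within each hash bucket, cutting the quadratic cost. This is the generic idea, not the authors' implementation; all sizes are illustrative.

```python
# Simplified LSH attention: random-projection sign hashing, then attention
# computed only inside each bucket. Illustrative sketch.
import torch
import torch.nn.functional as F

def lsh_attention(x, n_bits=2, seed=0):
    n, d = x.shape
    torch.manual_seed(seed)
    proj = torch.randn(d, n_bits)
    bits = (x @ proj > 0).long()                             # (n, n_bits) sign hash
    codes = (bits * (2 ** torch.arange(n_bits))).sum(dim=1)  # bucket id per token
    out = torch.zeros_like(x)
    for c in codes.unique():
        idx = (codes == c).nonzero(as_tuple=True)[0]
        xb = x[idx]                                          # tokens in this bucket
        attn = F.softmax(xb @ xb.T / d ** 0.5, dim=-1)       # intra-bucket attention
        out[idx] = attn @ xb
    return out

x = torch.randn(64, 32)        # e.g., 64 time-window tokens of a dynamic connectome
print(lsh_attention(x).shape)  # torch.Size([64, 32])
```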

EI 1558-254X DA 2024-08-24 UT MEDLINE:39172603 PM 39172603 ER

AU Chen, Ruifeng Zhang, Zhongliang Quan, Guotao Du, Yanfeng Chen, Yang Li, Yinsheng

PRECISION: A Physics-Constrained and Noise-Controlled Diffusion Model for Photon Counting Computed Tomography.

Recently, the use of photon counting detectors in computed tomography (PCCT) has attracted extensive attention. It is highly desirable to improve the quality of material basis images and the quantitative accuracy of elemental composition, particularly when PCCT data are acquired at lower radiation dose levels. In this work, we develop a physics-constrained and noise-controlled diffusion model, PRECISION in short, to address the degraded quality of material basis images and the inaccurate quantification of elemental composition caused mainly by the imperfect noise models and/or hand-crafted regularizers of material basis images (such as local smoothness and/or sparsity) leveraged in existing direct material basis image reconstruction approaches. In stark contrast, PRECISION learns distribution-level regularization to describe the features of ideal material basis images by training a noise-controlled spatial-spectral diffusion model. The optimal material basis images of each individual subject are sampled from this learned distribution under the constraint of the physical model of a given PCCT system and the measured data obtained from the subject. PRECISION exhibits the potential to improve the quality of material basis images and the quantitative accuracy of elemental composition for PCCT.
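A minimal sketch of physics-guided diffusion sampling in this spirit: at each reverse DDPM step, the current estimate is nudged by the gradient of a data-consistency term for a toy linear measurement model y = Ax. The score network below is an untrained placeholder, and all schedules and weights are illustrative, not the PRECISION model itself.

```python
# Toy physics-constrained reverse diffusion: DDPM update + data-consistency
# gradient guidance. The "score" network is an untrained placeholder.
import torch

torch.manual_seed(0)
d, m, T = 32, 16, 50
A = torch.randn(m, d) / d ** 0.5                  # toy forward (physics) model
y = A @ torch.randn(d)                            # measured data
eps_net = torch.nn.Sequential(torch.nn.Linear(d + 1, 64), torch.nn.SiLU(),
                              torch.nn.Linear(64, d))   # placeholder score model

betas = torch.linspace(1e-4, 0.02, T)
alphas = 1 - betas
abar = torch.cumprod(alphas, dim=0)

x = torch.randn(d)
for t in reversed(range(T)):
    x = x.detach().requires_grad_(True)
    eps = eps_net(torch.cat([x, torch.full((1,), t / T)]))  # predicted noise
    x0_hat = (x - (1 - abar[t]).sqrt() * eps) / abar[t].sqrt()
    data_loss = ((y - A @ x0_hat) ** 2).sum()     # physics-consistency term
    grad = torch.autograd.grad(data_loss, x)[0]
    with torch.no_grad():                         # DDPM mean update + guidance
        mean = (x - betas[t] / (1 - abar[t]).sqrt() * eps) / alphas[t].sqrt()
        noise = torch.randn(d) if t > 0 else torch.zeros(d)
        x = mean + betas[t].sqrt() * noise - 0.5 * grad
print(x.shape)
```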

AU Majumder, Sharmin Islam, Md. Tauhidul Taraballi, Francesca Righetti, Raffaella

Non-Invasive Imaging of Mechanical Properties of Cancers In Vivo Based on Transformations of the Eshelby's Tensor Using Compression Elastography

Knowledge of mechanical properties is of great clinical significance for the diagnosis, prognosis and treatment of cancers. Recently, a new method based on Eshelby's theory to simultaneously assess Young's modulus (YM) and Poisson's ratio (PR) in tissues has been proposed. A significant limitation of this method is that the accuracy of the reconstructed YM and PR is affected by the orientation/alignment of the tumor with the applied stress. In this paper, we propose a new method to reconstruct YM and PR in cancers that is invariant to the 3D orientation of the tumor with respect to the axis of applied stress. The novelty of the proposed method resides in the use of a tensor transformation to improve the robustness of Eshelby's theory and reconstruct the YM and PR of tumors with high accuracy and in realistic experimental conditions. The method is validated using finite element simulations and controlled experiments on phantoms with known mechanical properties. The in vivo feasibility of the developed method is demonstrated in an orthotopic mouse model of breast cancer. Our results show that the proposed technique can estimate the YM and PR with an overall accuracy of (97.06 +/- 2.42)% under all tested tumor orientations. Animal experimental data demonstrate the potential of the proposed methodology in vivo. The proposed method can significantly expand the range of applicability of Eshelby's theory to tumors and provide new means to accurately image and quantify mechanical parameters of cancers in clinical conditions.
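The orientation-invariance argument rests on how tensors transform under rotation. The numpy sketch below shows the rank-2 case (sigma' = R sigma R^T), whose rotation invariants are preserved; the paper works with transformations of the rank-4 Eshelby tensor, which follow the same principle. Values and names are illustrative.

```python
# Rank-2 tensor rotation: invariants (eigenvalues, trace) are orientation-free.
import numpy as np

def rotation_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

sigma = np.diag([1.0, 2.0, 3.0])          # principal stresses in the tumor frame
R = rotation_z(np.deg2rad(35))            # tumor tilted w.r.t. applied-stress axis
sigma_lab = R @ sigma @ R.T               # same tensor expressed in the lab frame

print(np.allclose(np.sort(np.linalg.eigvalsh(sigma_lab)),
                  np.sort(np.diag(sigma))))              # True: invariants preserved
print(np.isclose(np.trace(sigma_lab), np.trace(sigma)))  # True
```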

AU Xie, Zhiying Zeinstra, Nicole Kirby, Mitchell A. Le, Nhan Minh Murry, Charles E. Zheng, Ying Wang, Ruikang K.

Quantifying Microvascular Structure in Healthy and Infarcted Rat Hearts Using Optical Coherence Tomography Angiography

Myocardial infarction (MI) is a life-threatening medical emergency resulting in coronary microvascular dysregulation and heart muscle damage. One of the primary characteristics of MI is capillary loss, which plays a significant role in the progression of this cardiovascular condition. In this study, we utilized optical coherence tomography angiography (OCTA) to image coronary microcirculation in fixed rat hearts, aiming to analyze coronary microvascular impairment post-infarction. Various angiographic metrics are presented to quantify vascular features, including the vessel area density, vessel complexity index, vessel tortuosity index, and flow impairment. Pathological differences identified from OCTA analysis are corroborated with histological analysis. The quantitative assessments reveal a significant decrease in microvascular density in the capillary-sized vessels and an enlargement for the arteriole/venule-sized vessels. Further, microvascular tortuosity and complexity exhibit an increase after myocardial infarction. The results underscore the feasibility of using OCTA to offer qualitative microvascular details and quantitative metrics, providing insights into coronary vascular network remodeling during disease progression and response to therapy.
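Two of the quoted metrics are easy to state precisely. The toy code below computes vessel area density from a binary mask and a tortuosity index (path length over chord length) from a centerline; names and inputs are illustrative, not the paper's exact definitions, and real pipelines operate on segmented and skeletonized OCTA volumes.

```python
# Toy angiographic metrics: vessel area density and a tortuosity index.
import numpy as np

def vessel_area_density(mask):
    return mask.sum() / mask.size          # fraction of vessel pixels

def tortuosity_index(centerline_xy):
    diffs = np.diff(centerline_xy, axis=0)
    path = np.linalg.norm(diffs, axis=1).sum()      # arc length along the vessel
    chord = np.linalg.norm(centerline_xy[-1] - centerline_xy[0])
    return path / max(chord, 1e-8)

mask = np.zeros((128, 128), bool)
mask[60:68, :] = True                      # a straight toy vessel
print(vessel_area_density(mask))           # 0.0625

t = np.linspace(0, 4 * np.pi, 200)
wiggly = np.stack([t * 4, 10 * np.sin(t)], axis=1)  # tortuous centerline
print(tortuosity_index(wiggly))            # > 1 for curved vessels
```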

AU Deng, Shiyu Chen, Yinda Huang, Wei Zhang, Ruobing Xiong, Zhiwei

Unsupervised Domain Adaptation for EM Image Denoising with Invertible Networks.

Electron microscopy (EM) image denoising is critical for visualization and subsequent analysis. Despite the remarkable achievements of deep learning-based non-blind denoising methods, their performance drops significantly when domain shifts exist between the training and testing data. To address this issue, unpaired blind denoising methods have been proposed. However, these methods heavily rely on image-to-image translation and neglect the inherent characteristics of EM images, limiting their overall denoising performance. In this paper, we propose the first unsupervised domain adaptive EM image denoising method, which is grounded in the observation that EM images from similar samples share common content characteristics. Specifically, we first disentangle the content representations and the noise components from noisy images and establish a shared domain-agnostic content space via domain alignment to bridge the synthetic images (source domain) and the real images (target domain). To ensure precise domain alignment, we further incorporate domain regularization by enforcing that: the pseudo-noisy images, reconstructed using both content representations and noise components, accurately capture the characteristics of the noisy images from which the noise components originate, all while maintaining semantic consistency with the noisy images from which the content representations originate. To guarantee lossless representation decomposition and image reconstruction, we introduce disentanglement-reconstruction invertible networks. Finally, the reconstructed pseudo-noisy images, paired with their corresponding clean counterparts, serve as valuable training data for the denoising network. Extensive experiments on synthetic and real EM datasets demonstrate the superiority of our method in terms of image restoration quality and downstream neuron segmentation accuracy. Our code is publicly available at https://github.com/sydeng99/DADn.
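The "lossless" property of invertible networks comes from coupling layers whose forward map can be inverted exactly, so representation decomposition discards no information. Below is a generic NICE-style additive coupling layer as a minimal sketch; the paper's disentanglement-reconstruction networks are more elaborate.

```python
# Additive coupling layer: exactly invertible, hence information-preserving.
import torch
import torch.nn as nn

class AdditiveCoupling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim // 2, 64), nn.ReLU(),
                                 nn.Linear(64, dim // 2))

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=-1)
        return torch.cat([x1, x2 + self.net(x1)], dim=-1)   # y2 = x2 + f(x1)

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=-1)
        return torch.cat([y1, y2 - self.net(y1)], dim=-1)   # exact inversion

layer = AdditiveCoupling(dim=16)
x = torch.randn(4, 16)
print(torch.allclose(layer.inverse(layer(x)), x, atol=1e-6))  # True
```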

AU Wang, Zhenguo Wu, Yaping Xia, Zeheng Chen, Xinyi Li, Xiaochen Bai, Yan Zhou, Yun Liang, Dong Zheng, Hairong Yang, Yongfeng Wang, Shanshan Wang, Meiyun Sun, Tao

Non-Invasive Quantification of the Brain [18F]FDG-PET Using Inferred Blood Input Function Learned From Total-Body Data With Physical Constraint

Full quantification of brain PET requires the blood input function (IF), which is traditionally achieved through an invasive and time-consuming arterial catheter procedure, making it unfeasible for clinical routine. This study presents a deep learning based method to estimate the input function (DLIF) for a dynamic brain FDG scan. A long short-term memory network combined with a fully connected network was used. The dataset for training was generated from 85 total-body dynamic scans obtained on a uEXPLORER scanner. Time-activity curves from 8 brain regions and the carotid served as the input of the model, and the labelled IF was generated from the ascending aorta defined on the CT image. We emphasize the goodness of fit of kinetic modeling as an additional physical loss to reduce bias and the need for large training samples. DLIF was evaluated together with existing methods in terms of RMSE, area under the curve, and regional and parametric image quantifications. The results revealed that the proposed model can generate IFs that are closer to the reference ones in shape and amplitude than the IFs generated using existing methods. All regional kinetic parameters calculated using DLIF agreed with reference values, with the correlation coefficient being 0.961 (0.913) and the relative bias being 1.68 +/- 8.74% (0.37 +/- 4.93%) for Ki (K1). In terms of visual appearance and quantification, parametric images were also highly consistent with the reference images. In conclusion, our experiments indicate that a trained model can infer an image-derived IF from dynamic brain PET data, which enables subsequent reliable kinetic modeling.
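A rough sketch of the architecture the abstract describes: an LSTM over regional time-activity curves followed by a fully connected head that regresses the input function. Layer sizes and names are invented for illustration, and the kinetic goodness-of-fit loss is only indicated by a placeholder comment.

```python
# Sketch of an LSTM + fully connected input-function regressor. Illustrative.
import torch
import torch.nn as nn

class DLIFNet(nn.Module):
    def __init__(self, n_regions=9, hidden=64, n_frames=30):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_regions, hidden_size=hidden,
                            batch_first=True)
        self.head = nn.Sequential(nn.Linear(hidden, 128), nn.ReLU(),
                                  nn.Linear(128, n_frames))

    def forward(self, tacs):                 # tacs: (batch, frames, regions)
        _, (h, _) = self.lstm(tacs)
        return self.head(h[-1])              # predicted IF: (batch, frames)

model = DLIFNet()
tacs = torch.randn(2, 30, 9)                 # 8 brain regions + carotid, 30 frames
pred_if = model(tacs)
# Training would combine an MSE term with a kinetic-model goodness-of-fit loss.
mse = nn.functional.mse_loss(pred_if, torch.randn(2, 30))
print(pred_if.shape, mse.item())
```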

C1 Chinese Acad Sci, Paul C Lauterbur Res Ctr Biomed Imaging, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China C1 Zhengzhou Univ, Henan Prov Peoples Hosp, Zhengzhou Peoples Hosp, Zhengzhou 450001, Peoples R China C1 United Imaging Healthcare Grp Co Ltd, Cent Res Inst, Shanghai 201815, Peoples R China C3 United Imaging Healthcare Grp Co Ltd SN 0278-0062 EI 1558-254X DA 2024-07-22 UT WOS:001263692100016 PM 38386580 ER

AU Fan, Jiansong Lv, Tianxu Wang, Pei Hong, Xiaoyan Liu, Yuan Jiang, Chunjuan Ni, Jianming Li, Lihua Pan, Xiang

DCDiff: Dual-Granularity Cooperative Diffusion Models for Pathology Image Analysis.

Whole Slide Images (WSIs) are paramount in the medical field, with extensive applications in disease diagnosis and treatment. Recently, many deep-learning methods have been used to classify WSIs. However, these methods are inadequate for accurately analyzing WSIs as they treat regions in WSIs as isolated entities and ignore contextual information. To address this challenge, we propose a novel Dual-Granularity Cooperative Diffusion Model (DCDiff) for the precise classification of WSIs. Specifically, we first design a cooperative forward and reverse diffusion strategy, utilizing fine-granularity and coarse-granularity to regulate each diffusion step and gradually improve context awareness. To exchange information between granularities, we propose a coupled U-Net for dual-granularity denoising, which efficiently integrates dual-granularity consistency information using the designed Fine- and Coarse-granularity Cooperative Aware (FCCA) model. Ultimately, the cooperative diffusion features extracted by DCDiff can achieve cross-sample perception from the reconstructed distribution of training samples. Experiments on three public WSI datasets show that the proposed method can achieve superior performance over state-of-the-art methods. The code is available at https://github.com/hemo0826/DCDiff.

AU Azampour, Mohammad Farid Mach, Kristina Fatemizadeh, Emad Demiray, Beatrice Westenfelder, Kay Steiger, Katja Eiber, Matthias Wendler, Thomas Kainz, Bernhard Navab, Nassir

Multitask Weakly Supervised Generative Network for MR-US Registration.

Registering pre-operative modalities, such as magnetic resonance imaging or computed tomography, to ultrasound images is crucial for guiding clinicians during surgeries and biopsies. Recently, deep-learning approaches have been proposed to increase the speed and accuracy of this registration problem. However, all of these approaches need expensive supervision from the ultrasound domain. In this work, we propose a multitask generative framework that needs weak supervision only from the pre-operative imaging domain during training. To perform a deformable registration, the proposed framework translates a magnetic resonance image to the ultrasound domain while preserving the structural content. To demonstrate the efficacy of the proposed method, we tackle the registration problem of pre-operative 3D MR to transrectal ultrasonography images as necessary for targeted prostate biopsies. We use an in-house dataset of 600 patients, divided into 540 for training, 30 for validation, and the remaining for testing. An expert manually segmented the prostate in both modalities for validation and test sets to assess the performance of our framework. The proposed framework achieves a 3.58 mm target registration error on the expert-selected landmarks, 89.2% in the Dice score, and 1.81 mm 95th percentile Hausdorff distance on the prostate masks in the test set. Our experiments demonstrate that the proposed generative model successfully translates magnetic resonance images into the ultrasound domain. The translated image contains the structural content and fine details due to an ultrasound-specific two-path design of the generative model. The proposed framework enables training learning-based registration methods while only weak supervision from the pre-operative domain is available.
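The weak supervision described here can be summarized as: warp the moving segmentation with the predicted displacement field and score its overlap with the fixed segmentation. The sketch below uses a random displacement as a stand-in for the registration network's output; shapes and names are ours.

```python
# Weakly supervised registration loss: warp moving mask, compare via soft Dice.
import torch
import torch.nn.functional as F

def warp(seg, disp):
    """seg: (N,1,H,W); disp: (N,2,H,W) displacement in normalized [-1,1] units."""
    N = seg.shape[0]
    theta = torch.eye(2, 3).unsqueeze(0).repeat(N, 1, 1)
    grid = F.affine_grid(theta, seg.shape, align_corners=False)  # identity grid
    grid = grid + disp.permute(0, 2, 3, 1)                       # add flow field
    return F.grid_sample(seg, grid, align_corners=False)

def soft_dice_loss(a, b, eps=1e-6):
    inter = (a * b).sum()
    return 1 - (2 * inter + eps) / (a.sum() + b.sum() + eps)

moving = torch.zeros(1, 1, 64, 64); moving[:, :, 20:40, 20:40] = 1.0
fixed = torch.zeros(1, 1, 64, 64); fixed[:, :, 24:44, 22:42] = 1.0
disp = 0.05 * torch.randn(1, 2, 64, 64, requires_grad=True)  # stand-in for a network
loss = soft_dice_loss(warp(moving, disp), fixed)
loss.backward()                              # gradients flow to the "network"
print(loss.item())
```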

AU Xu, Mengya Islam, Mobarakol Bai, Long Ren, Hongliang

Privacy-Preserving Synthetic Continual Semantic Segmentation for Robotic Surgery

Deep Neural Network (DNN) based semantic segmentation of robotic instruments and tissues can enhance the precision of surgical activities in robot-assisted surgery. However, unlike biological learning, DNNs cannot learn incremental tasks over time: they exhibit catastrophic forgetting, i.e., a sharp decline in performance on previously learned tasks after learning a new one. Specifically, when data scarcity is the issue, the model shows a rapid drop in performance on previously learned instruments after learning new data with new instruments. The problem becomes worse when privacy concerns prevent releasing the old instruments' dataset used by the old model, and when data for new or updated versions of the instruments are unavailable to the continual learning model. For this purpose, we develop a privacy-preserving synthetic continual semantic segmentation framework by blending and harmonizing (i) open-source old-instrument foregrounds with synthesized backgrounds, without revealing real patient data in public, and (ii) new-instrument foregrounds with extensively augmented real backgrounds. To boost balanced logit distillation from the old model to the continual learning model, we design overlapping class-aware temperature normalization (CAT) by controlling the model learning utility. We also introduce multi-scale shifted-feature distillation (SD) to maintain long- and short-range spatial relationships among the semantic objects, where conventional short-range spatial features with limited information reduce the power of feature distillation. We demonstrate the effectiveness of our framework on the EndoVis 2017 and 2018 instrument segmentation datasets with a generalized continual learning setting. Code is available at https://github.com/XuMengyaAmy/Synthetic_CAT_SD.
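A simplified rendition of distillation with class-aware temperatures: each class gets its own softening temperature in an otherwise standard knowledge-distillation loss. The temperatures and their selection below are illustrative, not the paper's CAT schedule.

```python
# Knowledge distillation with per-class temperatures. Simplified sketch.
import torch
import torch.nn.functional as F

def cat_distill_loss(student_logits, teacher_logits, class_temps):
    """class_temps: (C,) temperature per class, broadcast over the batch."""
    s = F.log_softmax(student_logits / class_temps, dim=-1)
    t = F.softmax(teacher_logits / class_temps, dim=-1)
    return F.kl_div(s, t, reduction='batchmean')

C = 5                                          # e.g., instrument classes
student = torch.randn(8, C, requires_grad=True)
teacher = torch.randn(8, C)
temps = torch.tensor([1.0, 2.0, 2.0, 4.0, 1.0])  # softer for overlapping classes
loss = cat_distill_loss(student, teacher.detach(), temps)
loss.backward()
print(loss.item())
```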

AU Liu, Pei Ji, Luping Zhang, Xinyu Ye, Feng

Pseudo-Bag Mixup Augmentation for Multiple Instance Learning-Based Whole Slide Image Classification

Given the special situation of modeling gigapixel images, multiple instance learning (MIL) has become one of the most important frameworks for Whole Slide Image (WSI) classification. In current practice, most MIL networks face two unavoidable problems in training: i) insufficient WSI data and ii) the sample memorization inclination inherent in neural networks. These problems may hinder MIL models from adequate and efficient training, suppressing continued performance improvements of classification models on WSIs. Inspired by the basic idea of Mixup, this paper proposes a new Pseudo-bag Mixup (PseMix) data augmentation scheme to improve the training of MIL models. This scheme generalizes the Mixup strategy from general images to special WSIs via pseudo-bags so as to be applied in MIL-based WSI classification. With the cooperation of pseudo-bags, our PseMix fulfills the critical size alignment and semantic alignment of the Mixup strategy. Moreover, it is designed as an efficient and decoupled method, neither involving time-consuming operations nor relying on MIL model predictions. Comparative experiments and ablation studies are specially designed to evaluate the effectiveness and advantages of our PseMix. Experimental results show that PseMix can often assist state-of-the-art MIL networks in improving their classification performance on WSIs. Besides, it can also boost the generalization performance of MIL models in special test scenarios and promote their robustness to patch occlusion and label noise. Our source code is available at https://github.com/liupei101/PseMix.
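A sketch of the pseudo-bag Mixup idea: split each bag of instance features into pseudo-bags, exchange pseudo-bags between two bags according to a mixing ratio, and mix the bag labels by the same ratio. The random partitioning and all names are ours, not the exact PseMix procedure.

```python
# Pseudo-bag level Mixup for MIL bags. Illustrative sketch.
import torch

def psemix(bag_a, bag_b, y_a, y_b, n_pseudo=4):
    """bag_*: (n_instances, d) instance features; y_*: scalar bag labels."""
    lam = torch.rand(())                                   # mixing ratio
    k = int(round(lam.item() * n_pseudo))                  # pseudo-bags taken from A
    parts_a = list(torch.chunk(bag_a[torch.randperm(len(bag_a))], n_pseudo))
    parts_b = list(torch.chunk(bag_b[torch.randperm(len(bag_b))], n_pseudo))
    mixed = torch.cat(parts_a[:k] + parts_b[k:], dim=0)    # size-aligned mixture
    y = (k / n_pseudo) * y_a + (1 - k / n_pseudo) * y_b    # label mixed by same ratio
    return mixed, y

bag_a, bag_b = torch.randn(100, 256), torch.randn(120, 256)
mixed, y = psemix(bag_a, bag_b, y_a=1.0, y_b=0.0)
print(mixed.shape, y)
```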

AU Ma, Yuxi Wang, Jiacheng Yang, Jing Wang, Liansheng
区马、王雨曦、杨家成、王静、连胜

Model-Heterogeneous Semi-Supervised Federated Learning for Medical Image Segmentation

Medical image segmentation is crucial in clinical diagnosis, helping physicians identify and analyze medical conditions. However, this task is often accompanied by challenges such as sensitive data, privacy concerns, and expensive annotations. Current research focuses on personalized collaborative training of medical segmentation systems but ignores the fact that obtaining segmentation annotations is time-consuming and laborious. Achieving a good balance between annotation cost and segmentation performance while ensuring local model personalization has therefore become a valuable direction. This study introduces a novel Model-Heterogeneous Semi-Supervised Federated (HSSF) learning framework. It proposes Regularity Condensation and Regularity Fusion to transfer autonomously selected knowledge and ensure personalization across sites. In addition, to efficiently utilize unlabeled data and reduce the annotation burden, it proposes a Self-Assessment (SA) module and a Reliable Pseudo-Label Generation (RPG) module. The SA module generates self-assessment confidence in real time based on model performance, and the RPG module generates reliable pseudo-labels based on the SA confidence. We evaluate our model separately on the Skin Lesion and Polyp Lesion datasets. The results show that our model performs better than other heterogeneity-oriented methods. Moreover, it also performs strongly in homogeneous designs, most notably on region-based metrics. The full set of resources is available in the designated HSSF repository on GitHub.

AU Bontempo, Gianpaolo Bolelli, Federico Porrello, Angelo Calderara, Simone Ficarra, Elisa

A Graph-Based Multi-Scale Approach With Knowledge Distillation for WSI Classification

The usage of Multi Instance Learning (MIL) for classifying Whole Slide Images (WSIs) has recently increased. Due to their gigapixel size, the pixel-level annotation of such data is extremely expensive and time-consuming, and practically unfeasible. For this reason, multiple automatic approaches have been proposed in recent years to support clinical practice and diagnosis. Unfortunately, most state-of-the-art proposals apply attention mechanisms without considering the spatial instance correlation and usually work on a single-scale resolution. To leverage the full potential of pyramidal structured WSIs, we propose a graph-based multi-scale MIL approach, DAS-MIL. Our model comprises three modules: i) a self-supervised feature extractor; ii) a graph-based architecture that precedes the MIL mechanism and aims at creating a more contextualized representation of the WSI structure by considering the mutual (spatial) instance correlation both inter- and intra-scale; and iii) a (self-)distillation loss between resolutions, introduced to compensate for their informative gap and significantly improve the final prediction. The effectiveness of the proposed framework is demonstrated on two well-known datasets, where we outperform SOTA on WSI classification, gaining +2.7% AUC and +3.7% accuracy on the popular Camelyon16 benchmark.

AU Li, Jiawen Cheng, Junru Meng, Lingqin Yan, Hui He, Yonghong Shi, Huijuan Guan, Tian Han, Anjia

DeepTree: Pathological Image Classification Through Imitating Tree-Like Strategies of Pathologists

Digitization of pathological slides has promoted research on computer-aided diagnosis, in which artificial intelligence analysis of pathological images deserves attention. Deep learning techniques that work well on natural images have been extended to computational pathology; still, they seldom take into account prior knowledge in pathology, especially the way pathologists analyze lesion morphology. Inspired by the diagnostic decisions of pathologists, we design a novel deep learning architecture based on tree-like strategies called DeepTree. It imitates pathological diagnosis methods, designed as a binary tree structure, to conditionally learn the correlations between tissue morphologies, and optimizes its branches to further finetune performance. To validate and benchmark DeepTree, we build a dataset of frozen lung cancer tissues and design experiments on a public dataset of breast tumor subtypes and our dataset. Results show that the deep learning architecture based on tree-like strategies makes pathological image classification more accurate, transparent, and convincing. Simultaneously, prior knowledge based on diagnostic strategies yields superior representation ability compared to alternative methods. Our proposed methodology helps improve the trust of pathologists in artificial intelligence analysis and promotes the practical clinical application of pathology-assisted diagnosis.
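The tree-like strategy can be made concrete with a toy two-level classifier: a root makes a coarse call and each branch refines it, with leaf probabilities obtained by the chain rule. Structure and class semantics below are invented for illustration, not the DeepTree architecture itself.

```python
# Toy binary-tree classifier: coarse decision at the root, refined at branches.
import torch
import torch.nn as nn

class TreeClassifier(nn.Module):
    def __init__(self, d=64):
        super().__init__()
        self.root = nn.Linear(d, 2)        # e.g., tumor vs. non-tumor
        self.left = nn.Linear(d, 2)        # subtypes under branch 0
        self.right = nn.Linear(d, 2)       # subtypes under branch 1

    def forward(self, x):
        p_root = torch.softmax(self.root(x), -1)   # (B, 2)
        p_left = torch.softmax(self.left(x), -1)   # conditional probabilities
        p_right = torch.softmax(self.right(x), -1)
        # Chain rule: P(leaf) = P(branch) * P(leaf | branch)
        return torch.cat([p_root[:, :1] * p_left,
                          p_root[:, 1:] * p_right], dim=-1)

model = TreeClassifier()
probs = model(torch.randn(3, 64))
print(probs.shape, probs.sum(dim=-1))      # (3, 4), rows sum to 1
```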

AU Zhang, Huimin Ren, Mingyang Wang, Yu Jin, Zhiyuan Zhang, Shanxiang Liu, Jiaqian Fu, Jia Qin, Huan

In Vivo Microwave-Induced Thermoacoustic Endoscopy for Colorectal Tumor Detection in Deep Tissue

Optical endoscopy, as one of the common clinical diagnostic modalities, provides irreplaceable advantages in the diagnosis and treatment of internal organs. However, the approach is limited to the characterization of superficial tissues due to the strong optical scattering properties of tissue. In this work, a microwave-induced thermoacoustic (TA) endoscope (MTAE) was developed and evaluated. The MTAE system integrated a homemade monopole sleeve antenna (diameter = 7 mm) that provides homogenized pulsed microwave irradiation to induce a TA signal in the colorectal cavity, and a side-viewing focused ultrasonic transducer (diameter = 3 mm) that detects the TA signal in the ultrasonic spectrum to construct the image. Our MTAE system combined microwave excitation and acoustic detection; it produced images with dielectric contrast and high spatial resolution several centimeters deep in soft tissue, overcame the current limitations of the imaging depth of optical endoscopy and the mechanical-wave-based imaging contrast of ultrasound endoscopy, and was able to extract complete features of deep-seated tumors that could be infiltrating and invading adjacent structures. The practical feasibility of the MTAE system was evaluated in vivo in rabbits with colorectal tumors. The results demonstrated that colorectal tumor progression could be visualized from the changes in the electromagnetic parameters of the tissue via MTAE, showing its potential clinical application.

C1 South China Normal Univ, Coll Biophoton, MOE Key Lab Laser Life Sci, Guangzhou Key Lab Spectral Anal & Funct Probes,Gua, Guangzhou 510631, Peoples R China C1 South China Normal Univ, Inst Laser Life Sci, Guangzhou 510631, Peoples R China SN 0278-0062 EI 1558-254X DA 2024-07-02 UT WOS:001196733400008 PM 38113149 ER

AU Wu, Huisi Zhang, Baiming Chen, Cheng Qin, Jing

Federated Semi-Supervised Medical Image Segmentation via Prototype-Based Pseudo-Labeling and Contrastive Learning

Existing federated learning works mainly focus on the fully supervised training setting. In realistic scenarios, however, most clinical sites can only provide data without annotations due to the lack of resources or expertise. In this work, we are concerned with the practical yet challenging federated semi-supervised segmentation (FSSS) setting, where labeled data reside with only a few clients and the other clients can provide only unlabeled data. We make an early attempt to tackle this problem and propose a novel FSSS method with prototype-based pseudo-labeling and contrastive learning. First, we transmit a labeled-aggregated model, obtained based on prototype similarity, to each unlabeled client, to work together with the global model for debiased pseudo-label generation via a consistency- and entropy-aware selection strategy. Second, we transfer image-level prototypes from labeled datasets to unlabeled clients and conduct prototypical contrastive learning on unlabeled models to enhance their discriminative power. Finally, we perform dynamic model aggregation with a designed consistency-aware aggregation strategy to dynamically adjust the aggregation weights of each local model. We evaluate our method on COVID-19 X-ray infected region segmentation, COVID-19 CT infected region segmentation and colorectal polyp segmentation, and experimental results consistently demonstrate the effectiveness of our proposed method. Codes are available at https://github.com/zhangbaiming/FedSemiSeg.

AU Park, Jungkyu Chledowski, Jakub Jastrzebski, Stanislaw Witowski, Jan Xu, Yanqi Du, Linda Gaddam, Sushma Kim, Eric Lewin, Alana Parikh, Ujas Plaunova, Anastasia Chen, Sardius Millet, Alexandra Park, James Pysarenko, Kristine Patel, Shalin Goldberg, Julia Wegener, Melanie Moy, Linda Heacock, Laura Reig, Beatriu Geras, Krzysztof J.

An Efficient Deep Neural Network to Classify Large 3D Images With Small Objects

3D imaging enables accurate diagnosis by providing spatial information about organ anatomy. However, using 3D images to train AI models is computationally challenging because they consist of 10x or 100x more pixels than their 2D counterparts. To be trained with high-resolution 3D images, convolutional neural networks resort to downsampling them or projecting them to 2D. We propose an effective alternative, a neural network that enables efficient classification of full-resolution 3D medical images. Compared to off-the-shelf convolutional neural networks, our network, 3D Globally-Aware Multiple Instance Classifier (3D-GMIC), uses 77.98%-90.05% less GPU memory and 91.23%-96.02% less computation. While it is trained only with image-level labels, without segmentation labels, it explains its predictions by providing pixel-level saliency maps. On a dataset collected at NYU Langone Health, including 85,526 patients with full-field 2D mammography (FFDM), synthetic 2D mammography, and 3D mammography, 3D-GMIC achieves an AUC of 0.831 (95% CI: 0.769-0.887) in classifying breasts with malignant findings using 3D mammography. This is comparable to the performance of GMIC on FFDM (0.816, 95% CI: 0.737-0.878) and synthetic 2D (0.826, 95% CI: 0.754-0.884), which demonstrates that 3D-GMIC successfully classified large 3D images despite focusing computation on a smaller percentage of its input compared to GMIC. Therefore, 3D-GMIC identifies and utilizes extremely small regions of interest from 3D images consisting of hundreds of millions of pixels, dramatically reducing associated computational challenges. 3D-GMIC generalizes well to BCS-DBT, an external dataset from Duke University Hospital, achieving an AUC of 0.848 (95% CI: 0.798-0.896).
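The computational trick is to let a coarse saliency map choose which few regions receive full-resolution processing, so only a tiny fraction of the input is examined closely. A toy sketch follows, with random saliency standing in for the global module's output; names and sizes are ours.

```python
# Saliency-guided top-k ROI selection from a large 3D volume. Illustrative.
import torch

def topk_patches(volume, saliency, patch=16, k=4):
    """volume: (D,H,W) tensor; saliency: coarse (D//patch, H//patch, W//patch) map."""
    Ds, Hs, Ws = saliency.shape
    flat = saliency.flatten().topk(k).indices          # k most salient grid cells
    z = flat // (Hs * Ws)
    y = (flat % (Hs * Ws)) // Ws
    xc = flat % Ws
    coords = torch.stack([z, y, xc], dim=1) * patch    # voxel coordinates
    return [volume[a:a + patch, b:b + patch, c:c + patch]
            for a, b, c in coords.tolist()]

volume = torch.randn(64, 256, 256)                     # "large" 3D image
saliency = torch.rand(4, 16, 16)                       # coarse saliency grid
rois = topk_patches(volume, saliency)
print(len(rois), rois[0].shape)                        # 4 patches of 16x16x16 voxels
```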

AU Zhu, Meilu Liao, Jing Liu, Jun Yuan, Yixuan

FedOSS: Federated Open Set Recognition via Inter-Client Discrepancy and Collaboration

Open set recognition (OSR) aims to accurately classify known diseases and recognize unseen diseases as the unknown class in medical scenarios. However, in existing OSR approaches, gathering data from distributed sites to construct large-scale centralized training datasets usually leads to high privacy and security risk, which could be alleviated elegantly via the popular cross-site training paradigm, federated learning (FL). To this end, we present the first effort to formulate federated open set recognition (FedOSR), and meanwhile propose a novel Federated Open Set Synthesis (FedOSS) framework to address the core challenge of FedOSR: the unavailability of unknown samples for all anticipated clients during the training phase. The proposed FedOSS framework mainly leverages two modules, i.e., Discrete Unknown Sample Synthesis (DUSS) and Federated Open Space Sampling (FOSS), to generate virtual unknown samples for learning decision boundaries between known and unknown classes. Specifically, DUSS exploits inter-client knowledge inconsistency to recognize known samples near decision boundaries and then pushes them beyond decision boundaries to synthesize discrete virtual unknown samples. FOSS unites these generated unknown samples from different clients to estimate the class-conditional distributions of open data space near decision boundaries and further samples open data, thereby improving the diversity of virtual unknown samples. Additionally, we conduct comprehensive ablation experiments to verify the effectiveness of DUSS and FOSS. FedOSS shows superior performance on public medical datasets in comparison with state-of-the-art approaches.
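A generic sketch of the "push known samples beyond the decision boundary" idea: starting from a known-class feature, ascend the gradient of the classifier's predictive entropy until the classifier becomes maximally uncertain, yielding a virtual unknown. This is a simplified stand-in for DUSS, not the authors' procedure; step size and iteration count are arbitrary.

```python
# Synthesize a "virtual unknown" by entropy-gradient ascent on a known feature.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
clf = torch.nn.Linear(32, 4)                       # known-class classification head
feat = torch.randn(1, 32)                          # a known-class feature

virtual_unknown = feat.clone()
for _ in range(20):
    virtual_unknown = virtual_unknown.detach().requires_grad_(True)
    probs = F.softmax(clf(virtual_unknown), dim=-1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum()
    entropy.backward()
    with torch.no_grad():                          # ascend: increase uncertainty
        virtual_unknown = virtual_unknown + 0.5 * virtual_unknown.grad

print(F.softmax(clf(feat), -1).max().item(),
      F.softmax(clf(virtual_unknown), -1).max().item())  # confidence drops
```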

AU Xu, Chenchu Zhang, Tong Zhang, Dong Zhang, Dingwen Han, Junwei

Deep Generative Adversarial Reinforcement Learning for Semi-Supervised Segmentation of Low-Contrast and Small Objects in Medical Images

Deep reinforcement learning (DRL) has demonstrated impressive performance in medical image segmentation, particularly for low-contrast and small medical objects. However, current DRL-based segmentation methods face limitations due to the optimization of error propagation in two separate stages and the need for a significant amount of labeled data. In this paper, we propose a novel deep generative adversarial reinforcement learning (DGARL) approach that, for the first time, enables end-to-end semi-supervised medical image segmentation in the DRL domain. DGARL ingeniously establishes a pipeline that integrates DRL and generative adversarial networks (GANs) to optimize both detection and segmentation tasks holistically while mutually enhancing each other. Specifically, DGARL introduces two innovative components to facilitate this integration in semi-supervised settings. First, a task-joint GAN with two discriminators links the detection results to the GAN's segmentation performance evaluation, allowing simultaneous joint evaluation and feedback. This ensures that DRL and GAN can be directly optimized based on each other's results. Second, a bidirectional exploration DRL integrates backward exploration and forward exploration to ensure the DRL agent explores the correct direction when forward exploration is disabled due to lack of explicit rewards. This mitigates the issue of unlabeled data being unable to provide rewards and rendering DRL unexplorable. Comprehensive experiments on three generalization datasets, comprising a total of 640 patients, demonstrate that our novel DGARL achieves 85.02% Dice (an improvement of at least 1.91%) for brain tumors, 73.18% Dice (at least 4.28% improvement) for liver tumors, and 70.85% Dice (at least 2.73% improvement) for the pancreas compared with the ten most recent advanced methods; these results attest to the superiority of DGARL. Code is available at GitHub.

AU Liu, Mingxin Liu, Yunzan Xu, Pengbo Cui, Hui Ke, Jing Ma, Jiquan

Exploiting Geometric Features via Hierarchical Graph Pyramid Transformer for Cancer Diagnosis Using Histopathological Images

Cancer is widely recognized as the primary cause of mortality worldwide, and pathology analysis plays a pivotal role in achieving accurate cancer diagnosis. The intricate representation of features in histopathological images encompasses abundant information crucial for disease diagnosis, regarding cell appearance, tumor microenvironment, and geometric characteristics. However, recent deep learning methods have not adequately exploited geometric features for pathological image classification due to the absence of effective descriptors that can capture both cell distribution and gathering patterns, which often serve as potent indicators. In this paper, inspired by clinical practice, a Hierarchical Graph Pyramid Transformer (HGPT) is proposed to guide pathological image classification by effectively exploiting a geometric representation of tissue distribution, which was ignored by existing state-of-the-art methods. First, a graph representation is constructed according to the morphological features of the input pathological image, and a geometric representation is learned through the proposed multi-head graph aggregator. Then, the image and its graph representation are fed into the transformer encoder layer to model long-range dependency. Finally, a locality feature enhancement block is designed to enhance the 2D local representation of the feature embedding, which is not well explored in existing vision transformers. An extensive experimental study is conducted on Kather-5K, MHIST, NCT-CRC-HE, and GasHisSDB for binary or multi-category classification of multiple cancer types. Results demonstrate that our method is capable of consistently reaching superior classification outcomes for histopathological images, providing an effective diagnostic tool for malignant tumors in clinical practice.

AU Sun, Jiarui Li, Qiuxuan Liu, Yuhao Liu, Yichuan Coatrieux, Gouenou Coatrieux, Jean-Louis Chen, Yang Lu, Jie

Pathological Asymmetry-Guided Progressive Learning for Acute Ischemic Stroke Infarct Segmentation.

Quantitative infarct estimation is crucial for diagnosis, treatment and prognosis in acute ischemic stroke (AIS) patients. As the early changes of ischemic tissue are subtle and easily confounded with normal brain tissue, it remains a very challenging task. However, existing methods often ignore or confuse the contributions of different types of anatomical asymmetry, caused by intrinsic versus pathological changes, to segmentation. Further, inefficient domain knowledge utilization leads to mis-segmentation of AIS infarcts. Motivated by these observations, we propose a pathological asymmetry-guided progressive learning (PAPL) method for AIS infarct segmentation. PAPL mimics the step-by-step learning patterns observed in humans, including three progressive stages: a knowledge preparation stage, a formal learning stage, and an examination improvement stage. First, the knowledge preparation stage accumulates preparatory domain knowledge for the infarct segmentation task, helping to learn domain-specific knowledge representations that enhance the discriminative ability for pathological asymmetries via a constructed contrastive learning task. Then, the formal learning stage efficiently performs end-to-end training guided by the learned knowledge representations, in which the designed feature compensation module (FCM) can leverage the anatomical similarity between adjacent slices of the volumetric medical image to help aggregate rich anatomical context information. Finally, the examination improvement stage refines the infarct prediction from the previous stage, where the proposed regional perception refinement strategy (RPRS) further exploits bilateral difference comparison to correct mis-segmented infarct regions by adaptive regional shrinking and expansion. Extensive experiments on public and in-house NCCT datasets demonstrated the superiority of the proposed PAPL, which is promising for better stroke evaluation and treatment.
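The bilateral-difference cue that such asymmetry-based methods exploit is simple to demonstrate: mirror the image across the midline and subtract, so intrinsic symmetric anatomy cancels while pathological asymmetry (and its mirror location) stands out. The toy below assumes an axis-aligned midline; real pipelines first align the midsagittal plane, and all values here are invented.

```python
# Toy bilateral-difference map: symmetric anatomy cancels, lesions stand out.
import numpy as np

brain = np.zeros((64, 64))
brain[16:48, 8:56] = 1.0                  # symmetric "anatomy"
brain[24:32, 40:50] += 0.6                # unilateral lesion-like change

mirrored = brain[:, ::-1]                 # flip left-right across the midline
asymmetry = brain - mirrored              # symmetric structures cancel out
candidates = np.abs(asymmetry) > 0.3      # flags the lesion and its mirror site
print(candidates.sum(), "asymmetric pixels flagged")
```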

AU Wang, Hongqiu Chen, Jian Zhang, Shichen He, Yuan Xu, Jinfeng Wu, Mengwan He, Jinlan Liao, Wenjun Luo, Xiangde

Dual-Reference Source-Free Active Domain Adaptation for Nasopharyngeal Carcinoma Tumor Segmentation across Multiple Hospitals.

Nasopharyngeal carcinoma (NPC) is a prevalent and clinically significant malignancy that predominantly impacts the head and neck area. Precise delineation of the Gross Tumor Volume (GTV) plays a pivotal role in ensuring effective radiotherapy for NPC. Although recent methods have achieved promising results on GTV segmentation, they are still limited by the scarcity of carefully annotated data and by the difficulty of accessing data from multiple hospitals in clinical practice. Although some unsupervised domain adaptation (UDA) methods have been proposed to alleviate this problem, unconditionally mapping the distribution distorts the underlying structural information, leading to inferior performance. To address this challenge, we devise a novel Source-Free Active Domain Adaptation framework to facilitate domain adaptation for the GTV segmentation task. Specifically, we design a dual-reference strategy to select domain-invariant and domain-specific representative samples from a specific target domain for annotation and model fine-tuning, without relying on source-domain data. Our approach not only ensures data privacy but also reduces the workload for oncologists, as it requires annotating only a few representative samples from the target domain and does not need access to the source data. We collect a large-scale clinical dataset comprising 1057 NPC patients from five hospitals to validate our approach. Experimental results show that our method outperforms previous active learning (e.g., AADA and MHPL) and UDA (e.g., Tent and CPR) methods, and achieves results comparable to the fully supervised upper bound even with few annotations, highlighting the significant medical utility of our approach. In addition, as there is no public dataset for multi-center NPC segmentation, we will release our code and dataset for future research (Git).

AU Yang, Yan Yu, Jun Fu, Zhenqi Zhang, Ke Yu, Ting Wang, Xianyun Jiang, Hanliang Lv, Junhui Huang, Qingming Han, Weidong

Token-Mixer: Bind Image and Text in One Embedding Space for Medical Image Reporting.

Medical image reporting focused on automatically generating the diagnostic reports from medical images has garnered growing research attention. In this task, learning cross-modal alignment between images and reports is crucial. However, the exposure bias problem in autoregressive text generation poses a notable challenge, as the model is optimized by a word-level loss function using the teacher-forcing strategy. To this end, we propose a novel Token-Mixer framework that learns to bind image and text in one embedding space for medical image reporting. Concretely, Token-Mixer enhances the cross-modal alignment by matching image-to-text generation with text-to-text generation that suffers less from exposure bias. The framework contains an image encoder, a text encoder and a text decoder. In training, images and paired reports are first encoded into image tokens and text tokens, and these tokens are randomly mixed to form the mixed tokens. Then, the text decoder accepts image tokens, text tokens or mixed tokens as prompt tokens and conducts text generation for network optimization. Furthermore, we introduce a tailored text decoder and an alternative training strategy that well integrate with our Token-Mixer framework. Extensive experiments across three publicly available datasets demonstrate Token-Mixer successfully enhances the image-text alignment and thereby attains a state-of-the-art performance. Related codes are available at https://github.com/yangyan22/Token-Mixer.
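
To make the token-mixing step concrete, a minimal PyTorch sketch is given below; it assumes both streams have already been projected to the same length and width, and the per-position Bernoulli mixing policy is an illustrative assumption rather than the paper's exact scheme.

```python
import torch

def mix_tokens(image_tokens: torch.Tensor, text_tokens: torch.Tensor,
               mix_ratio: float = 0.5) -> torch.Tensor:
    """Randomly interleave image and text tokens into one prompt sequence.
    image_tokens, text_tokens: (B, L, D) embeddings in the shared space; each
    position is drawn from the image stream with probability mix_ratio."""
    B, L, _ = image_tokens.shape
    take_image = torch.rand(B, L, 1, device=image_tokens.device) < mix_ratio
    return torch.where(take_image, image_tokens, text_tokens)
```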

AU Zhang, Yumin Li, Hongliu Gao, Yajun Duan, Haoran Huang, Yawen Zheng, Yefeng

Prototype Correlation Matching and Class-Relation Reasoning for Few-Shot Medical Image Segmentation.

Few-shot medical image segmentation has achieved great progress in improving the accuracy and efficiency of medical analysis in the biomedical imaging field. However, most existing methods cannot explore inter-class relations among base and novel medical classes to reason about unseen novel classes. Moreover, the same kind of medical class can exhibit large intra-class variations brought by diverse appearances, shapes and scales, causing ambiguous visual characterization that degrades the generalization performance of these existing methods on unseen novel classes. To address the above challenges, in this paper, we propose a Prototype Correlation Matching and Class-Relation Reasoning (PMCR) model. The proposed model can effectively mitigate false pixel correlation matches caused by large intra-class variations while reasoning about inter-class relations among different medical classes. Specifically, to address false pixel correlation matches brought by large intra-class variations, we propose a prototype correlation matching module to mine representative prototypes that can characterize the diverse visual information of different appearances well. We aim to explore prototype-level rather than pixel-level correlation matching between support and query features via an optimal transport algorithm to tackle false matches caused by intra-class variations. Meanwhile, to explore inter-class relations, we design a class-relation reasoning module to segment unseen novel medical objects via reasoning about inter-class relations between base and novel classes. Such inter-class relations can be well propagated to the semantic encoding of local query features to improve few-shot segmentation performance. Quantitative comparisons illustrate the large performance improvement of our model over other baseline methods.
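
The prototype-level matching via optimal transport can be pictured with a plain Sinkhorn iteration; the sketch below uses uniform marginals and illustrative hyperparameters, and is not the paper's exact solver.

```python
import torch

def sinkhorn_plan(cost: torch.Tensor, eps: float = 0.05, n_iters: int = 50) -> torch.Tensor:
    """Entropic optimal-transport plan between M query locations and K prototypes.
    cost: (M, K) feature distances; the returned plan acts as a soft
    prototype-level correspondence instead of raw pixel-level matching."""
    gibbs = torch.exp(-cost / eps)  # Gibbs kernel of the cost matrix
    u = torch.ones(cost.shape[0], device=cost.device)
    v = torch.ones(cost.shape[1], device=cost.device)
    for _ in range(n_iters):
        u = 1.0 / (gibbs @ v)       # rescale rows toward uniform marginals
        v = 1.0 / (gibbs.t() @ u)   # rescale columns toward uniform marginals
    return u[:, None] * gibbs * v[None, :]
```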

AU Li, Kang Zhu, Yu Yu, Lequan Heng, Pheng-Ann

A Dual Enrichment Synergistic Strategy to Handle Data Heterogeneity for Domain Incremental Cardiac Segmentation

Upon remarkable progress in cardiac image segmentation, contemporary studies are dedicated to further upgrading model functionality toward perfection by progressively exploring sequentially delivered datasets over time through domain incremental learning. Existing works mainly concentrate on addressing heterogeneous style variations, but overlook the critical shape variations across domains hidden behind the sub-disease composition discrepancy. Because an updated model may catastrophically forget the sub-diseases that were learned in past domains but are no longer present in subsequent domains, we propose a dual enrichment synergistic strategy to incrementally broaden model competence for a growing number of sub-diseases. The data-enriched scheme aims to diversify the shape composition of the current training data via displacement-aware shape encoding and decoding, to gradually build up robustness against cross-domain shape variations. Meanwhile, the model-enriched scheme intends to strengthen model capabilities by progressively appending and consolidating the latest expertise into a dynamically-expanded multi-expert network, to gradually cultivate generalization ability over style-variated domains. The above two schemes work in synergy to collaboratively upgrade model capabilities in two-pronged manners. We have extensively evaluated our network with the ACDC and M&Ms datasets in single-domain and compound-domain incremental learning settings. Our approach outperformed other competing methods and achieved comparable results to the upper bound.

AU Ren, Zhimei Sidky, Emil Y. Barber, Rina Foygel Kao, Chien-Min Pan, Xiaochuan

Simultaneous Activity and Attenuation Estimation in TOF-PET With TV-Constrained Nonconvex Optimization

An alternating direction method of multipliers (ADMM) framework is developed for nonsmooth biconvex optimization for inverse problems in imaging. In particular, the simultaneous estimation of activity and attenuation (SAA) problem in time-of-flight positron emission tomography (TOF-PET) has such a structure when maximum likelihood estimation (MLE) is employed. The ADMM framework is applied to MLE for SAA in TOF-PET, resulting in the ADMM-SAA algorithm. This algorithm is extended by imposing total variation (TV) constraints on both the activity and attenuation map, resulting in the ADMM-TVSAA algorithm. The performance of this algorithm is illustrated using the penalized maximum likelihood activity and attenuation estimation (P-MLAA) algorithm as a reference.
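
For orientation, the scaled-form ADMM skeleton that such algorithms instantiate is sketched below; the two proximal sub-problems are left as user-supplied callables, since the actual ADMM-SAA/ADMM-TVSAA updates involve the TOF-PET likelihood and TV constraints not reproduced here. All names are illustrative.

```python
import numpy as np

def admm(prox_f, prox_g, A: np.ndarray, n: int, n_iters: int = 100) -> np.ndarray:
    """Scaled-form ADMM for min_x f(x) + g(z) subject to z = A x.
    prox_f(x, target): data-fidelity update (e.g., an MLE step for activity/attenuation).
    prox_g(v): constraint update (e.g., projection onto a TV ball)."""
    x = np.zeros(n)
    z = np.zeros(A.shape[0])
    u = np.zeros_like(z)              # scaled dual variable
    for _ in range(n_iters):
        x = prox_f(x, A.T @ (z - u))  # pull x toward satisfying z = A x
        z = prox_g(A @ x + u)         # enforce the constraint on the split variable
        u += A @ x - z                # dual ascent on the splitting residual
    return x
```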

AU Tuccio, Giulia Afrakhteh, Sajjad Iacca, Giovanni Demi, Libertario

Time Efficient Ultrasound Localization Microscopy Based on A Novel Radial Basis Function 2D Interpolation

Ultrasound localization microscopy (ULM) allows for the generation of super-resolved (SR) images of the vasculature by precisely localizing intravenously injected microbubbles. Although SR images may be useful for diagnosing and treating patients, their use in the clinical context is limited by the need for prolonged acquisition times and high frame rates. The primary goal of our study is to relax the requirement of high frame rates to obtain SR images. To this end, we propose a new time-efficient ULM (TEULM) pipeline built on a cutting-edge interpolation method. More specifically, we suggest employing Radial Basis Functions (RBFs) as interpolators to estimate the missing values in the 2-dimensional (2D) spatio-temporal structures. To evaluate this strategy, we first mimic the data acquisition at a reduced frame rate by applying a down-sampling (DS = 2, 4, 8, and 10) factor to high frame rate ULM data. Then, we up-sample the data to the original frame rate using the suggested interpolation to reconstruct the missing frames. Finally, using both the original high frame rate data and the interpolated one, we reconstruct SR images using the ULM framework steps. We evaluate the proposed TEULM using four in vivo datasets, a Rat brain (dataset A), a Rat kidney (dataset B), a Rat tumor (dataset C) and a Rat brain bolus (dataset D), interpolating at the in-phase and quadrature (IQ) level. Results demonstrate the effectiveness of TEULM in recovering vascular structures, even at a DS rate of 10 (corresponding to a frame rate of sub-100Hz). In conclusion, the proposed technique is successful in reconstructing accurate SR images while requiring frame rates of one order of magnitude lower than standard ULM.
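
The core interpolation step can be sketched with SciPy's RBFInterpolator on one real-valued space-time plane (the I and Q channels would each be processed this way); the kernel choice, neighbor count, and function names are assumptions.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def upsample_plane(plane: np.ndarray, ds: int) -> np.ndarray:
    """Recover skipped frames of a (space x time) plane acquired at 1/ds of the
    original frame rate by querying a 2-D RBF fitted on the sparse samples."""
    X, T = plane.shape
    xs, ts = np.meshgrid(np.arange(X), np.arange(T) * ds, indexing="ij")
    pts = np.stack([xs.ravel(), ts.ravel()], axis=1).astype(float)
    rbf = RBFInterpolator(pts, plane.ravel(), kernel="thin_plate_spline", neighbors=64)
    T_full = (T - 1) * ds + 1  # query only inside the sampled time range
    xq, tq = np.meshgrid(np.arange(X), np.arange(T_full), indexing="ij")
    query = np.stack([xq.ravel(), tq.ravel()], axis=1).astype(float)
    return rbf(query).reshape(X, T_full)
```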

AU Wu, Weiwen Wang, Yanyang Liu, Qiegen Wang, Ge Zhang, Jianjia

Wavelet-Improved Score-Based Generative Model for Medical Imaging

The score-based generative model (SGM) has demonstrated remarkable performance in addressing challenging under-determined inverse problems in medical imaging. However, acquiring high-quality training datasets for these models remains a formidable task, especially in medical image reconstructions. Prevalent noise perturbations or artifacts in low-dose Computed Tomography (CT) or under-sampled Magnetic Resonance Imaging (MRI) hinder the accurate estimation of data distribution gradients, thereby compromising the overall performance of SGMs when trained with these data. To alleviate this issue, we propose a wavelet-improved denoising technique to cooperate with the SGMs, ensuring effective and stable training. Specifically, the proposed method integrates a wavelet sub-network and the standard SGM sub-network into a unified framework, effectively alleviating inaccurate distribution of the data distribution gradient and enhancing the overall stability. The mutual feedback mechanism between the wavelet sub-network and the SGM sub-network empowers the neural network to learn accurate scores even when handling noisy samples. This combination results in a framework that exhibits superior stability during the learning process, leading to the generation of more precise and reliable reconstructed images. During the reconstruction process, we further enhance the robustness and quality of the reconstructed images by incorporating regularization constraint. Our experiments, which encompass various scenarios of low-dose and sparse-view CT, as well as MRI with varying under-sampling rates and masks, demonstrate the effectiveness of the proposed method by significantly enhanced the quality of the reconstructed images. Especially, our method with noisy training samples achieves comparable results to those obtained using clean data.
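
As a conceptual stand-in for the learned wavelet sub-network, a fixed-threshold wavelet shrinkage step written with PyWavelets is shown below; the wavelet and threshold are arbitrary illustrative choices, not the paper's learned procedure.

```python
import numpy as np
import pywt

def wavelet_shrink(img: np.ndarray, thr: float, wavelet: str = "db2") -> np.ndarray:
    """One-level 2-D DWT, soft-threshold the high-frequency detail bands, invert.
    Suppressing noisy detail coefficients stabilises the data from which the
    score model estimates distribution gradients."""
    ll, (lh, hl, hh) = pywt.dwt2(img, wavelet)
    lh, hl, hh = (pywt.threshold(c, thr, mode="soft") for c in (lh, hl, hh))
    return pywt.idwt2((ll, (lh, hl, hh)), wavelet)
```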

AU Yang, Yuxuan Wang, Hao Wang, Jizhou Dong, Kai Ding, Shuai

Semantic-Preserving Surgical Video Retrieval With Phase and Behavior Coordinated Hashing

Medical professionals rely on surgical video retrieval to discover relevant content within large numbers of videos for surgical education and knowledge transfer. However, the existing retrieval techniques often fail to obtain user-expected results since they ignore valuable semantics in surgical videos. The incorporation of rich semantics into video retrieval is challenging in terms of the hierarchical relationship modeling and coordination between coarse- and fine-grained semantics. To address these issues, this paper proposes a novel semantic-preserving surgical video retrieval (SPSVR) framework, which incorporates surgical phase and behavior semantics using a dual-level hashing module to capture their hierarchical relationship. This module preserves the semantics in binary hash codes by transforming the phase and behavior similarities into high- and low-level similarities in a shared Hamming space. The binary codes are optimized by performing a reconstruction task, a high-level similarity preservation task, and a low-level similarity preservation task, using a coordinated optimization strategy for efficient learning. A self-supervised learning scheme is adopted to capture behavior semantics from video clips so that the indexing of behaviors is unencumbered by fine-grained annotation and recognition. Experiments on four surgical video datasets for two different disciplines demonstrate the robust performance of the proposed framework. In addition, the results of the clinical validation experiments indicate the ability of the proposed method to retrieve the results expected by surgeons. The code can be found at https://github.com/trigger26/SPSVR.

AU Lu, Xu Cui, Zengzhen Sun, Yihua Khor, Hee Guan Sun, Ao Ma, Longfei Chen, Fang Gao, Shan Tian, Yun Zhou, Fang Lv, Yang Liao, Hongen

Better Rough Than Scarce: Proximal Femur Fracture Segmentation With Rough Annotations

Proximal femoral fracture segmentation in computed tomography (CT) is essential in the preoperative planning of orthopedic surgeons. Recently, numerous deep learning-based approaches have been proposed for segmenting various structures within CT scans. Nevertheless, distinguishing the various attributes of fracture fragments and soft tissue regions in CT scans frequently poses challenges, which have received comparatively limited research attention. Besides, the cornerstone of contemporary deep learning methodologies is the availability of annotated data, while detailed CT annotations remain scarce. To address this challenge, we propose a novel weakly-supervised framework, namely Rough Turbo Net (RT-Net), for the segmentation of proximal femoral fractures. We emphasize the utilization of human resources to produce rough annotations at a substantial scale, as opposed to relying on limited fine-grained annotations that demand substantial time to create. In RT-Net, rough annotations impose fractured-region constraints, which have demonstrated significant efficacy in enhancing the accuracy of the network. Conversely, the fine annotations can provide more details for recognizing edges and soft tissues. Besides, we design a spatial adaptive attention module (SAAM) that adapts to the spatial distribution of the fracture regions and aligns features in each decoder. Moreover, we propose a fine-edge loss, applied through an edge discrimination network, to penalize absent or imprecise edge features. Extensive quantitative and qualitative experiments demonstrate the superiority of RT-Net over state-of-the-art approaches. Furthermore, additional experiments show that RT-Net can produce pseudo labels for raw CT images that further improve fracture segmentation performance, and has the potential to improve segmentation performance on public datasets.

C1 Tsinghua Univ, Sch Biomed Engn, Beijing 100084, Peoples R China C1 Tsinghua Univ, Grad Sch Shenzhen, Shenzhen 518055, Peoples R China C1 Peking Univ Third Hosp, Dept Orthoped, Beijing 100191, Peoples R China C1 Tsinghua Univ, Sch Biomed Engn, Beijing 100084, Peoples R China C1 Shanghai Jiao Tong Univ, Sch Biomed Engn, Shanghai 200240, Peoples R China C1 Shanghai Jiao Tong Univ, Inst Med Robot, Shanghai 200240, Peoples R China SN 0278-0062 EI 1558-254X DA 2024-09-18 UT WOS:001307429600012 PM 38652607 ER

AU Ruan, Guohui Wang, Zhaonian Liu, Chunyi Xia, Ling Wang, Huafeng Qi, Li Chen, Wufan

Magnetic Resonance Electrical Properties Tomography Based on Modified Physics-Informed Neural Network and Multiconstraints

This paper presents a novel method based on physics-informed neural networks for magnetic resonance electrical properties tomography (MREPT). MREPT is a noninvasive technique that can retrieve the spatial distribution of electrical properties (EPs) of scanned tissues from the transmit radiofrequency (RF) field measured in magnetic resonance imaging (MRI) systems. The reconstruction of EP values in MREPT is achieved by solving a partial differential equation, derived from Maxwell's equations, that lacks a direct solution. Most conventional MREPT methods suffer from artifacts caused by the invalidation of the assumptions applied to simplify the problem, and from numerical errors caused by numerical differentiation. Existing deep learning-based (DL-based) MREPT methods comprise data-driven methods that need to collect massive datasets for training, and model-driven methods that have only been validated in trivial cases. Hence we propose a model-driven method that learns a mapping from a measured RF field, its spatial gradient and Laplacian to EPs using fully connected networks (FCNNs). The spatial gradient of EPs can be computed through the automatic differentiation of FCNNs and the chain rule. FCNNs are optimized using the residual of the central physical equation of convection-reaction MREPT as the loss function $\mathcal{L}$. To alleviate the ill-conditioning of the problem, we add multiconstraints, including a similarity constraint between permittivity and conductivity and the $\ell_1$ norm of the spatial gradients of permittivity and conductivity, to $\mathcal{L}$. We demonstrate the proposed method with a three-dimensional realistic head model, a digital phantom simulation, and a practical phantom experiment on a 9.4T animal MRI system.
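
A hedged PyTorch sketch of such a multiconstraint loss is given below; the convection-reaction residual is left as a user-supplied callable, the gradient-based form of the permittivity/conductivity similarity term is only one plausible reading, and all names and weights are assumptions.

```python
import torch

def mrept_loss(model, pde_residual, coords: torch.Tensor,
               lam_sim: float = 0.1, lam_l1: float = 1e-4) -> torch.Tensor:
    """model: FCNN mapping (x, y, z) -> (permittivity, conductivity).
    pde_residual: callable returning the convection-reaction MREPT residual
    from the EP maps and their autograd spatial gradients (not reproduced here)."""
    coords = coords.clone().requires_grad_(True)
    eps, sigma = model(coords).unbind(-1)
    g_eps, g_sigma = (torch.autograd.grad(f.sum(), coords, create_graph=True)[0]
                      for f in (eps, sigma))              # chain-rule spatial gradients
    res = pde_residual(eps, sigma, g_eps, g_sigma, coords)
    sim = ((g_eps - g_sigma) ** 2).mean()                 # similarity multiconstraint
    sparsity = g_eps.abs().mean() + g_sigma.abs().mean()  # l1 norm of spatial gradients
    return (res ** 2).mean() + lam_sim * sim + lam_l1 * sparsity
```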

AU Xu, Jiaxing Bian, Qingtian Li, Xinhang Zhang, Aihu Ke, Yiping Qiao, Miao Zhang, Wei Sim, Wei Khang Jeremy Gulyas, Balazs CA Alzheimers Dis Neuroimaging Initiative

Contrastive Graph Pooling for Explainable Classification of Brain Networks

Functional magnetic resonance imaging (fMRI) is a commonly used technique to measure neural activation. Its application has been particularly important in identifying underlying neurodegenerative conditions such as Parkinson's, Alzheimer's, and Autism. Recent analysis of fMRI data models the brain as a graph and extracts features by graph neural networks (GNNs). However, the unique characteristics of fMRI data require a special design of GNN. Tailoring GNN to generate effective and domain-explainable features remains challenging. In this paper, we propose a contrastive dual-attention block and a differentiable graph pooling method called ContrastPool to better utilize GNN for brain networks, meeting fMRI-specific requirements. We apply our method to 5 resting-state fMRI brain network datasets of 3 diseases and demonstrate its superiority over state-of-the-art baselines. Our case study confirms that the patterns extracted by our method match the domain knowledge in neuroscience literature, and disclose direct and interesting insights. Our contributions underscore the potential of ContrastPool for advancing the understanding of brain networks and neurodegenerative conditions. The source code is available at https://github.com/AngusMonroe/ContrastPool.

AU Kim, Boah Zhuang, Yan Mathai, Tejas Sudharshan Summers, Ronald M

OTMorph: Unsupervised Multi-domain Abdominal Medical Image Registration Using Neural Optimal Transport.

Deformable image registration is one of the essential processes in analyzing medical images. In particular, when diagnosing abdominal diseases such as hepatic cancer and lymphoma, multi-domain images scanned from different modalities or different imaging protocols are often used. However, they are not aligned due to scanning times, patient breathing, movement, etc. Although recent learning-based approaches can provide deformations in real-time with high performance, multi-domain abdominal image registration using deep learning is still challenging since the images in different domains have different characteristics such as image contrast and intensity ranges. To address this, this paper proposes a novel unsupervised multi-domain image registration framework using neural optimal transport, dubbed OTMorph. When moving and fixed volumes are given as input, a transport module of our proposed model learns the optimal transport plan to map data distributions from the moving to the fixed volumes and estimates a domain-transported volume. Subsequently, a registration module taking the transported volume can effectively estimate the deformation field, leading to deformation performance improvement. Experimental results on multi-domain image registration using multi-modality and multi-parametric abdominal medical images demonstrate that the proposed method provides superior deformable registration via the domain-transported image that alleviates the domain gap between the input images. Also, we attain the improvement even on out-of-distribution data, which indicates the superior generalizability of our model for the registration of various medical images. Our source code is available at https://github.com/boahK/OTMorph.

AU Ali, Rehman Mitcham, Trevor M. Brevett, Thurston Agudo, Oscar Calderon Martinez, Cristina Duran Li, Cuiping Doyley, Marvin M. Duric, Nebojsa

2-D Slicewise Waveform Inversion of Sound Speed and Acoustic Attenuation for Ring Array Ultrasound Tomography Based on a Block LU Solver

Ultrasound tomography is an emerging imaging modality that uses the transmission of ultrasound through tissue to reconstruct images of its mechanical properties. Initially, ray-based methods were used to reconstruct these images, but their inability to account for diffraction often resulted in poor resolution. Waveform inversion overcame this limitation, providing high-resolution images of the tissue. Most clinical implementations, often directed at breast cancer imaging, currently rely on a frequency-domain waveform inversion to reduce computation time. For ring arrays, ray tomography was long considered a necessary step prior to waveform inversion in order to avoid cycle skipping. However, in this paper, we demonstrate that frequency-domain waveform inversion can reliably reconstruct high-resolution images of sound speed and attenuation without relying on ray tomography to provide an initial model. We provide a detailed description of our frequency-domain waveform inversion algorithm with open-source code and data that we make publicly available.

AU Lin, Wenjun Hu, Yan Fu, Huazhu Yang, Mingming Chng, Chin-Boon Kawasaki, Ryo Chui, Cheekong Liu, Jiang

Instrument-Tissue Interaction Detection Framework for Surgical Video Understanding

The instrument-tissue interaction detection task, which helps understand surgical activities, is vital for constructing computer-assisted surgery systems but presents many challenges. Firstly, most models represent instrument-tissue interaction in a coarse-grained way that only focuses on classification and lacks the ability to automatically detect instruments and tissues. Secondly, existing works do not fully consider the intra- and inter-frame relations of instruments and tissues. In the paper, we propose to represent instrument-tissue interaction as an (instrument class, instrument bounding box, tissue class, tissue bounding box, action class) quintuple and present an Instrument-Tissue Interaction Detection Network (ITIDNet) to detect the quintuple for surgical video understanding. Specifically, we propose a Snippet Consecutive Feature (SCF) Layer to enhance features by modeling relationships among proposals in the current frame using global context information in the video snippet. We also propose a Spatial Corresponding Attention (SCA) Layer to incorporate features of proposals between adjacent frames through spatial encoding. To reason about relationships between instruments and tissues, a Temporal Graph (TG) Layer is proposed, with intra-frame connections to exploit relationships between instruments and tissues in the same frame and inter-frame connections to model the temporal information of the same instance. For evaluation, we build a cataract surgery video (PhacoQ) dataset and a cholecystectomy surgery video (CholecQ) dataset. Experimental results demonstrate the promising performance of our model, which outperforms other state-of-the-art models on both datasets.
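
A simple container for the detected quintuple might look as follows; the field names and box convention are our assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels

@dataclass
class InteractionQuintuple:
    """One instrument-tissue interaction detected in a video frame."""
    instrument_class: str
    instrument_box: Box
    tissue_class: str
    tissue_box: Box
    action_class: str
```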

AU Wang, Kang Zheng, Feiyang Cheng, Lan Dai, Hong-Ning Dou, Qi Qin, Jing

Breast Cancer Classification From Digital Pathology Images via Connectivity-Aware Graph Transformer

Automated classification of breast cancer subtypes from digital pathology images has been an extremely challenging task due to the complicated spatial patterns of cells in the tissue micro-environment. While newly proposed graph transformers are able to capture more long-range dependencies to enhance accuracy, they largely ignore the topological connectivity between graph nodes, which is nevertheless critical to extract more representative features to address this difficult task. In this paper, we propose a novel connectivity-aware graph transformer (CGT) for phenotyping the topology connectivity of the tissue graph constructed from digital pathology images for breast cancer classification. Our CGT seamlessly integrates connectivity embedding to node feature at every graph transformer layer by using local connectivity aggregation, in order to yield more comprehensive graph representations to distinguish different breast cancer subtypes. In light of the realistic intercellular communication mode, we then encode the spatial distance between two arbitrary nodes as connectivity bias in self-attention calculation, thereby allowing the CGT to distinctively harness the connectivity embedding based on the distance of two nodes. We extensively evaluate the proposed CGT on a large cohort of breast carcinoma digital pathology images stained by Haematoxylin & Eosin. Experimental results demonstrate the effectiveness of our CGT, which outperforms state-of-the-art methods by a large margin. Codes are released on https://github.com/wang-kang-6/CGT.
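
The general mechanism this description suggests, self-attention with an additive connectivity bias, can be sketched as follows; the exact bias form (here a scaled pairwise distance subtracted from the attention logits) and the single-head layout are assumptions.

```python
import torch
import torch.nn.functional as F

def connectivity_biased_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor,
                                  node_dist: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """q, k, v: (N, D) node features of the tissue graph; node_dist: (N, N)
    spatial distances between nodes. Closer node pairs receive a larger
    (less penalised) attention logit before the softmax."""
    d = q.shape[-1]
    scores = q @ k.t() / d ** 0.5 - alpha * node_dist  # distance as additive bias
    return F.softmax(scores, dim=-1) @ v
```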

AU Zhou, Lei Zhang, Yuzhong Zhang, Jiadong Qian, Xuejun Gong, Chen Sun, Kun Ding, Zhongxiang Wang, Xing Li, Zhenhui Liu, Zaiyi Shen, Dinggang

Prototype Learning Guided Hybrid Network for Breast Tumor Segmentation in DCE-MRI.

Automated breast tumor segmentation on the basis of dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) has shown great promise in clinical practice, particularly for identifying the presence of breast disease. However, accurate segmentation of breast tumors is a challenging task, often necessitating the development of complex networks. To strike an optimal tradeoff between computational costs and segmentation performance, we propose a hybrid network combining convolutional neural network (CNN) and transformer layers. Specifically, the hybrid network consists of an encoder-decoder architecture built by stacking convolution and deconvolution layers. Effective 3D transformer layers are then implemented after the encoder subnetworks to capture global dependencies between the bottleneck features. To improve the efficiency of the hybrid network, two parallel encoder subnetworks are designed for the decoder and the transformer layers, respectively. To further enhance the discriminative capability of the hybrid network, a prototype learning guided prediction module is proposed, where category-specific prototypical features are calculated through online clustering. All learned prototypical features are finally combined with the features from the decoder for tumor mask prediction. Experimental results on private and public DCE-MRI datasets demonstrate that the proposed hybrid network achieves superior performance compared with state-of-the-art (SOTA) methods, while maintaining a balance between segmentation accuracy and computational cost. Moreover, we demonstrate that the automatically generated tumor masks can be effectively applied to distinguish the HER2-positive subtype from the HER2-negative subtype, with accuracy similar to analysis based on manual tumor segmentation. The source code is available at https://github.com/ZhouL-lab/PLHN.

AU Cai, Zhiyuan Lin, Li He, Huaqing Cheng, Pujin Tang, Xiaoying

Uni4Eye++: A General Masked Image Modeling Multi-modal Pre-training Framework for Ophthalmic Image Classification and Segmentation.

A large-scale labeled dataset is a key factor for the success of supervised deep learning in most ophthalmic image analysis scenarios. However, limited annotated data is very common in ophthalmic image analysis, since manual annotation is time-consuming and labor-intensive. Self-supervised learning (SSL) methods bring huge opportunities for better utilizing unlabeled data, as they do not require massive annotations. To utilize as many unlabeled ophthalmic images as possible, it is necessary to break the dimension barrier, simultaneously making use of both 2D and 3D images while alleviating the issue of catastrophic forgetting. In this paper, we propose a universal self-supervised Transformer framework, named Uni4Eye++, to discover intrinsic image characteristics and capture domain-specific feature embeddings in ophthalmic images. Uni4Eye++ can serve as a global feature extractor, which builds its basis on a Masked Image Modeling task with a Vision Transformer architecture. On the basis of our previous work Uni4Eye, we further employ an image-entropy-guided masking strategy to reconstruct more-informative patches and a dynamic head generator module to alleviate modality confusion. We evaluate the performance of our pre-trained Uni4Eye++ encoder by fine-tuning it on multiple downstream ophthalmic image classification and segmentation tasks. The superiority of Uni4Eye++ is successfully established through comparisons to other state-of-the-art SSL pre-training methods. Our code is available on GitHub.
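
One plausible reading of an image-entropy-guided masking policy is sketched below in NumPy; the patch size, bin count, masking ratio, and the decision to mask the highest-entropy patches are illustrative assumptions.

```python
import numpy as np

def entropy_guided_mask(img: np.ndarray, p: int = 16, ratio: float = 0.5,
                        bins: int = 32) -> np.ndarray:
    """Rank non-overlapping p x p patches by Shannon entropy and return the flat
    indices of the most informative patches for the MIM task to reconstruct."""
    H, W = img.shape
    ents = []
    for i in range(H // p):
        for j in range(W // p):
            patch = img[i * p:(i + 1) * p, j * p:(j + 1) * p]
            counts, _ = np.histogram(patch, bins=bins)
            prob = counts[counts > 0] / patch.size
            ents.append(-np.sum(prob * np.log(prob)))
    ents = np.asarray(ents)
    n_mask = int(ratio * ents.size)
    return np.argsort(ents)[-n_mask:]  # patches to mask
```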

AU De Marco, Fabio Andrejewski, Jana Urban, Theresa Willer, Konstantin Gromann, Lukas Koehler, Thomas Maack, Hanns-Ingo Herzen, Julia Pfeiffer, Franz

X-Ray Dark-Field Signal Reduction Due to Hardening of the Visibility Spectrum

X-ray dark-field imaging enables a spatially-resolved visualization of ultra-small-angle X-ray scattering. Using phantom measurements, we demonstrate that a material's effective dark-field signal may be reduced by modification of the visibility spectrum by other dark-field-active objects in the beam. This is the dark-field equivalent of conventional beam-hardening, and is distinct from related, known effects, where the dark-field signal is modified by attenuation or phase shifts. We present a theoretical model for this group of effects and verify it by comparison to the measurements. These findings have significant implications for the interpretation of dark-field signal strength in polychromatic measurements.

AU Rong, Dingyi Zhao, Zhongyin Wu, Yue Ke, Bilian Ni, Binging

Prediction of Myopia Eye Axial Elongation With Orthokeratology Treatment via Dense I2I Based Corneal Topography Change Analysis

While orthokeratology (OK) has shown effectiveness in slowing the progression of myopia, it remains unknown how spatially distributed structural stress/tension applied to different regions affects the change of corneal geometry, and consequently the outcome of myopia control, at fine-grained detail. Acknowledging that the underlying working mechanism of the OK lens is essentially mechanics-induced refractive parameter reshaping, in this study we develop a novel mechanics-rule-guided deep image-to-image learning framework, which densely predicts the patient's corneal topography change according to treatment parameters (lens geometry, wearing time, physiological parameters, etc.), and subsequently predicts the influence on eye axial length change after OK treatment. Encapsulated in a U-shaped multi-resolution map-to-map architecture, the proposed model features two major components. First, geometric and wearing parameters of the OK lens are spatially encoded with convolutions to form a multi-channel input volume/tensor for latent encodings of the external stress/tension applied to different regions of the cornea. Second, these external latent force maps are progressively down-sampled and injected into this multi-scale architecture for predicting the change of the corneal topography map. At each feature learning layer, we formally derive a mathematical framework that simulates the physical process of corneal deformation induced by lens-to-cornea interaction and corneal internal tension, which is reformulated into parameter-learnable cross-attention/self-attention modules in the context of the transformer architecture. A total of 1854 eyes of myopia patients are included in the study, and the results show that the proposed model precisely predicts corneal topography change with a high PSNR of 28.45 dB, as well as a significant accuracy gain for axial elongation prediction (i.e., 0.0276 in MSE). It is also demonstrated that our method provides interpretable associations between various OK treatment parameters and the final control effect.

AU Cheung, Chim-Lee Wu, Mengjie Fang, Ge Ho, Justin D. L. Liang, Liyuan Tan, Kel Vin Lin, Fa-Hsuan Chang, Hing-Chiu Kwok, Ka-Wai

Omnidirectional Monolithic Marker for Intra-Operative MR-Based Positional Sensing in Closed MRI

We present a design of an inductively coupled radio frequency (ICRF) marker for magnetic resonance (MR)-based positional tracking, enabling a robust increase of the tracking signal at all scanning orientations in quadrature-excited closed MR imaging (MRI). The marker employs three curved resonant circuits fully covering a cylindrical surface that encloses the signal source. Each resonant circuit is a planar spiral inductor with parallel-plate capacitors fabricated monolithically on flexible printed circuit board (FPC) and bent to achieve the curved structure. The constructed marker measures 3 mm in diameter by 5 mm in length with a quality factor > 22, and its tracking performance was validated with a 1.5 T MRI scanner. As a result, the marker remains a high positive-contrast spot under 360° rotations about all 3 axes. The marker can be accurately localized with a maximum error of 0.56 mm under a displacement of 56 mm from the isocenter, along with an inherent standard deviation of 0.1 mm. Owing to the high image contrast, the presented marker enables automatic and real-time tracking in 3D without dependency on its orientation with respect to the MRI scanner receive coil. In combination with its small form factor, the presented marker would facilitate robust and wireless MR-based tracking for intervention and clinical diagnosis. This method targets applications that can involve rotational changes in all axes (X-Y-Z).

AU Yue, Guanghui Zhang, Lixin Du, Jingfeng Zhou, Tianwei Zhou, Wei Lin, Weisi

Subjective and Objective Quality Assessment of Colonoscopy Videos.

Captured colonoscopy videos usually suffer from multiple real-world distortions, such as motion blur, low brightness, abnormal exposure, and object occlusion, which impede visual interpretation. However, existing works mainly investigate the impacts of synthesized distortions, which differ greatly from real-world distortions. This research aims to carry out an in-depth study of colonoscopy Video Quality Assessment (VQA). In this study, we advance this topic by establishing both subjective and objective solutions. Firstly, we collect 1,000 colonoscopy videos with typical visual quality degradation conditions in practice and construct a multi-attribute VQA database. The quality of each video is annotated by subjective experiments from five distortion attributes (i.e., temporal-spatial visibility, brightness, specular reflection, stability, and utility), as well as an overall perspective. Secondly, we propose a Distortion Attribute Reasoning Network (DARNet) for automatic VQA. DARNet includes two streams to extract features related to spatial and temporal distortions, respectively. It adaptively aggregates the attribute-related features through a multi-attribute association module to predict the quality score of each distortion attribute. Motivated by the observation that the rating behaviors for all attributes are different, a behavior-guided reasoning module is further used to fuse the attribute-aware features, resulting in the overall quality. Experimental results on the constructed database show that our DARNet correlates well with subjective ratings and is superior to nine state-of-the-art methods.

AU Mineo, Raffaele Salanitri, F. Proietto Bellitto, G. Kavasidis, I. De Filippo, O. Millesimo, M. De Ferrari, G. M. Aldinucci, M. Giordano, D. Palazzo, S. D'Ascenzo, F. Spampinato, C.

A Convolutional-Transformer Model for FFR and iFR Assessment From Coronary Angiography

The quantification of stenosis severity from X-ray catheter angiography is a challenging task. Indeed, this requires to fully understand the lesion's geometry by analyzing dynamics of the contrast material, only relying on visual observation by clinicians. To support decision making for cardiac intervention, we propose a hybrid CNN-Transformer model for the assessment of angiography-based non-invasive fractional flow-reserve (FFR) and instantaneous wave-free ratio (iFR) of intermediate coronary stenosis. Our approach predicts whether a coronary artery stenosis is hemodynamically significant and provides direct FFR and iFR estimates. This is achieved through a combination of regression and classification branches that forces the model to focus on the cut-off region of FFR (around 0.8 FFR value), which is highly critical for decision-making. We also propose a spatio-temporal factorization mechanisms that redesigns the transformer's self-attention mechanism to capture both local spatial and temporal interactions between vessel geometry, blood flow dynamics, and lesion morphology. The proposed method achieves state-of-the-art performance on a dataset of 778 exams from 389 patients. Unlike existing methods, our approach employs a single angiography view and does not require knowledge of the key frame; supervision at training time is provided by a classification loss (based on a threshold of the FFR/iFR values) and a regression loss for direct estimation. Finally, the analysis of model interpretability and calibration shows that, in spite of the complexity of angiographic imaging data, our method can robustly identify the location of the stenosis and correlate prediction uncertainty to the provided output scores.
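
A hedged sketch of a joint regression-classification objective concentrated around the clinical cut-off is shown below; the Gaussian up-weighting near FFR = 0.8 and all names are our assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def ffr_joint_loss(pred_ffr: torch.Tensor, logit_sig: torch.Tensor,
                   true_ffr: torch.Tensor, cutoff: float = 0.8,
                   w: float = 2.0) -> torch.Tensor:
    """pred_ffr: regressed FFR values in [0, 1]; logit_sig: logits for the
    'hemodynamically significant' class (FFR <= cutoff)."""
    label = (true_ffr <= cutoff).float()
    cls = F.binary_cross_entropy_with_logits(logit_sig, label)
    weight = 1.0 + w * torch.exp(-((true_ffr - cutoff) / 0.05) ** 2)  # focus near cut-off
    reg = (weight * (pred_ffr - true_ffr) ** 2).mean()
    return reg + cls
```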

AU Li, Xibao Ouyang, Xi Zhang, Jiadong Ding, Zhongxiang Zhang, Yuyao Xue, Zhong Shi, Feng Shen, Dinggang

Carotid Vessel Wall Segmentation Through Domain Aligner, Topological Learning, and Segment Anything Model for Sparse Annotation in MR Images.

Medical image analysis poses significant challenges due to limited availability of clinical data, which is crucial for training accurate models. This limitation is further compounded by the specialized and labor-intensive nature of the data annotation process. For example, despite the popularity of computed tomography angiography (CTA) in diagnosing atherosclerosis with an abundance of annotated datasets, magnetic resonance (MR) images stand out with better visualization for soft plaque and vessel wall characterization. However, the higher cost and limited accessibility of MR, as well as time-consuming nature of manual labeling, contribute to fewer annotated datasets. To address these issues, we formulate a multi-modal transfer learning network, named MT-Net, designed to learn from unpaired CTA and sparsely-annotated MR data. Additionally, we harness the Segment Anything Model (SAM) to synthesize additional MR annotations, enriching the training process. Specifically, our method first segments vessel lumen regions followed by precise characterization of carotid artery vessel walls, thereby ensuring both segmentation accuracy and clinical relevance. Validation of our method involved rigorous experimentation on publicly available datasets from COSMOS and CARE-II challenge, demonstrating its superior performance compared to existing state-of-the-art techniques.

AU Wang, Jian Qiao, Liang Zhou, Shichong Zhou, Jin Wang, Jun Li, Juncheng Ying, Shihui Chang, Cai Shi, Jun

Weakly Supervised Lesion Detection and Diagnosis for Breast Cancers With Partially Annotated Ultrasound Images

Deep learning (DL) has proven highly effective for ultrasound-based computer-aided diagnosis (CAD) of breast cancers. In an automatic CAD system, lesion detection is critical for the subsequent diagnosis. However, existing DL-based methods generally require voluminous manually-annotated region of interest (ROI) labels and class labels to train both the lesion detection and diagnosis models. In clinical practice, the ROI labels, i.e., ground truths, may not always be optimal for the classification task due to the individual experience of sonologists, resulting in the issue of coarse annotation that limits the diagnostic performance of a CAD model. To address this issue, a novel Two-Stage Detection and Diagnosis Network (TSDDNet) is proposed based on weakly supervised learning to improve the diagnostic accuracy of ultrasound-based CAD for breast cancers. In particular, all the initial ROI-level labels are considered as coarse annotations before model training. In the first training stage, a candidate selection mechanism is designed to refine manual ROIs in the fully annotated images and generate accurate pseudo-ROIs for the partially annotated images under the guidance of class labels. The training set is then updated with more accurate ROI labels for the second training stage. In the second stage, a fusion network is developed to integrate the detection network and the classification network into a unified end-to-end framework as the final CAD model. A self-distillation strategy is designed on this model for joint optimization to further improve its diagnostic performance. The proposed TSDDNet is evaluated on three B-mode ultrasound datasets, and the experimental results indicate that it achieves the best performance on both lesion detection and diagnosis tasks, suggesting promising application potential.

AU Liu, Yuedong Zhou, Xuan Wei, Cunfeng Xu, Qiong

Sparse-view Spectral CT Reconstruction and Material Decomposition based on Multi-channel SGM.

In medical applications, the diffusion of contrast agents in tissue can reflect the physiological function of organisms, so it is valuable to quantify the distribution and content of contrast agents in the body over a period. Spectral CT has the advantages of multi-energy projection acquisition and material decomposition, which can quantify K-edge contrast agents. However, multiple repetitive spectral CT scans can cause excessive radiation doses. Sparse-view scanning is commonly used to reduce dose and scan time, but its reconstructed images are usually accompanied by streaking artifacts, which leads to inaccurate quantification of the contrast agents. To solve this problem, an unsupervised sparse-view spectral CT reconstruction and material decomposition algorithm based on the multi-channel score-based generative model (SGM) is proposed in this paper. First, multi-energy images and tissue images are used as multi-channel input data for SGM training. Secondly, the organism is multiply scanned in sparse views, and the trained SGM is utilized to generate multi-energy images and tissue images driven by sparse-view projections. After that, a material decomposition algorithm using tissue images generated by SGM as prior images for solving contrast agent images is established. Finally, the distribution and content of the contrast agents are obtained. The comparison and evaluation of this method are given in this paper, and a series of mouse scanning experiments are carried out to verify the effectiveness of the method.

EI 1558-254X DA 2024-06-18 UT MEDLINE:38865221 PM 38865221 ER

AU Billot, Benjamin Dey, Neel Moyer, Daniel Hoffmann, Malte Turk, Esra Abaci Gagoski, Borjan Ellen Grant, P Golland, Polina

SE(3)-Equivariant and Noise-Invariant 3D Rigid Motion Tracking in Brain MRI.

Rigid motion tracking is paramount in many medical imaging applications where movements need to be detected, corrected, or accounted for. Modern strategies rely on convolutional neural networks (CNN) and pose this problem as rigid registration. Yet, CNNs do not exploit natural symmetries in this task, as they are equivariant to translations (their outputs shift with their inputs) but not to rotations. Here we propose EquiTrack, the first method that uses recent steerable SE(3)-equivariant CNNs (E-CNN) for motion tracking. While steerable E-CNNs can extract corresponding features across different poses, testing them on noisy medical images reveals that they do not have enough learning capacity to learn noise invariance. Thus, we introduce a hybrid architecture that pairs a denoiser with an E-CNN to decouple the processing of anatomically irrelevant intensity features from the extraction of equivariant spatial features. Rigid transforms are then estimated in closed-form. EquiTrack outperforms state-of-the-art learning and optimisation methods for motion tracking in adult brain MRI and fetal MRI time series. Our code is available at https://github.com/BBillot/EquiTrack.
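The closed-form rigid estimation mentioned above is typically an orthogonal-Procrustes (Kabsch) fit between corresponding 3D points, solvable with a single SVD. A minimal NumPy sketch, assuming src and dst are matched point sets (e.g., centroids of corresponding equivariant features):

import numpy as np

def kabsch_rigid(src, dst):
    # src, dst: (N, 3) corresponding 3D points; returns R, t with dst ~ R @ src + t.
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))          # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t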

AU Chen, Fang Han, Haojie Wan, Peng Chen, Lingyu Kong, Wentao Liao, Hongen Wen, Baojie Liu, Chunrui Zhang, Daoqiang

Do as Sonographers Think: Contrast-enhanced Ultrasound for Thyroid Nodules Diagnosis via Microvascular Infiltrative Awareness.

Dynamic contrast-enhanced ultrasound (CEUS) imaging can reflect the microvascular distribution and blood flow perfusion, thereby holding clinical significance in distinguishing between malignant and benign thyroid nodules. Notably, CEUS offers a meticulous visualization of the microvascular distribution surrounding the nodule, leading to an apparent increase in tumor size compared to gray-scale ultrasound (US). In the dual images obtained, the lesion size enlarges from gray-scale US to CEUS, as the microvasculature appears to continuously infiltrate the surrounding tissue. Although the infiltrative dilatation of the microvasculature remains ambiguous, sonographers believe it may aid the diagnosis of thyroid nodules. We propose a deep learning model designed to emulate the diagnostic reasoning process employed by sonographers. This model integrates the observation of microvascular infiltration on dynamic CEUS, leveraging the additional insights provided by gray-scale US for enhanced diagnostic support. Specifically, temporal projection attention is implemented on the time dimension of dynamic CEUS to represent the microvascular perfusion. Additionally, we employ a group of confidence maps with flexible Sigmoid Alpha Functions to perceive and describe the infiltrative dilatation process. Moreover, a self-adaptive integration mechanism is introduced to dynamically integrate the assisted gray-scale US and the confidence maps of CEUS for individual patients, ensuring a trustworthy diagnosis of thyroid nodules. In this retrospective study, we collected a thyroid nodule dataset of 282 CEUS videos. The method achieves a superior diagnostic accuracy and sensitivity of 89.52% and 93.75%, respectively. These results suggest that imitating the diagnostic thinking of sonographers, encompassing dynamic microvascular perfusion and infiltrative expansion, proves beneficial for CEUS-based thyroid nodule diagnosis.

EI 1558-254X DA 2024-05-31 UT MEDLINE:38801692 PM 38801692 ER

AU Khan, M Owais Seresti, Anahita A Menon, Karthik Marsden, Alison L Nieman, Koen

Quantification and Visualization of CT Myocardial Perfusion Imaging to Detect Ischemia-Causing Coronary Arteries.

Coronary computed tomography angiography (cCTA) has poor specificity for identifying coronary stenoses that limit blood flow to the myocardial tissue. Integration of dynamic CT myocardial perfusion imaging (CT-MPI) can potentially improve the diagnostic accuracy. We propose a method that integrates cCTA and CT-MPI to identify culprit coronary lesions that limit blood flow to the myocardium. Coronary arteries and left ventricle surfaces were segmented from cCTA and registered to CT-MPI. Myocardial blood flow (MBF) was derived from CT-MPI. A ray-casting approach was developed to project volumetric MBF onto the left ventricle surface. The MBF volume was divided into coronary-specific territories based on proximity to the nearest coronary artery. MBF and normalized MBF were computed for the myocardium and for each coronary territory. Projection of MBF onto cCTA allowed for direct visualization of perfusion defects. Normalized MBF had a higher correlation with ischemic myocardial territory than MBF (MBF: R2 = 0.81 and index MBF: R2 = 0.90). There were 18 vessels that showed angiographic disease (stenosis > 50%); however, normalized MBF demonstrated only 5 coronary territories to be ischemic. These findings demonstrate that cCTA and CT-MPI can be integrated to visualize myocardial defects and detect the culprit coronary arteries responsible for perfusion defects. These methods can allow for non-invasive detection of ischemia-causing coronary lesions and ultimately help guide clinicians to deliver more targeted coronary interventions.
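The territory assignment described above is essentially a nearest-centerline lookup followed by per-territory averaging. A small NumPy/SciPy sketch; the inputs (voxel coordinates, centerline points, per-point vessel labels) and the normalization by the whole-myocardium mean are one plausible reading of "normalized MBF", not the authors' exact definition.

import numpy as np
from scipy.spatial import cKDTree

def territory_mbf(mbf_voxels, voxel_xyz, centerline_xyz, centerline_label):
    # Assign each myocardial voxel to its nearest coronary centerline point,
    # then average MBF per territory and normalize by the global myocardial mean.
    _, nearest = cKDTree(centerline_xyz).query(voxel_xyz)
    territory = centerline_label[nearest]           # e.g., 0 = LAD, 1 = LCx, 2 = RCA
    global_mean = mbf_voxels.mean()
    result = {}
    for lab in np.unique(territory):
        mean_mbf = mbf_voxels[territory == lab].mean()
        result[int(lab)] = (mean_mbf, mean_mbf / global_mean)
    return result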

AU Jin, Yifei Meng, Ling-Jian

Exploration of Coincidence Detection of Cascade Photons to Enhance Preclinical Multi-Radionuclide SPECT Imaging

We propose a technique of coincidence detection of cascade photons (CDCP) to enhance preclinical SPECT imaging of therapeutic radionuclides emitting cascade photons, such as Lu-177, Ac-225, Ra-223, and In-111. We have carried out experimental studies to evaluate the proposed CDCP-SPECT imaging of low-activity radionuclides using a prototype coincidence detection system constructed with large-volume cadmium zinc telluride (CZT) imaging spectrometers and a pinhole collimator. With In-111 in experimental studies, the CDCP technique allows us to improve the signal-to-contamination ratio in the projection (Projection-SCR) by approximately 53 times and to remove approximately 98% of the normalized contamination. Compared to traditional scatter correction, which achieves a Projection-SCR of 1.00, our CDCP method boosts it to 15.91, showing enhanced efficacy in reducing down-scattered contamination, especially at lower activities. The reconstructed images of a line source demonstrated the dramatic enhancement of image quality with CDCP-SPECT compared to conventional and triple-energy-window-corrected SPECT data acquisition. We also introduced artificial energy blurring and Monte Carlo simulation to quantify the impact of detector performance, especially its energy resolution and timing resolution, on the enhancement achieved by the CDCP technique. We have further demonstrated the benefits of the CDCP technique with simulation studies, which show the potential of improving the signal-to-contamination ratio by 300 times with Ac-225, which emits cascade photons with a decay constant of approximately 0.1 ns. These results demonstrate the potential of CDCP-enhanced SPECT for imaging super-low levels of therapeutic radionuclides in small animals.
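At its core, coincidence detection of cascade photons is an event-list filter: an event is kept only when a partner arrives within a short timing window carrying the complementary cascade energy. A NumPy sketch under simple assumptions (time-sorted pairing, fixed energy windows; for In-111 the two cascade photopeaks are at 171 and 245 keV):

import numpy as np

def cascade_coincidences(t_ns, e_kev, win_a, win_b, t_window_ns=50.0):
    # Return index pairs (i, j) of events forming a cascade coincidence:
    # one event in energy window A, one in window B, within t_window_ns of each other.
    order = np.argsort(t_ns)
    t, e = t_ns[order], e_kev[order]
    in_a = (e >= win_a[0]) & (e <= win_a[1])
    in_b = (e >= win_b[0]) & (e <= win_b[1])
    pairs, j = [], 0
    for i in range(t.size):
        while t[i] - t[j] > t_window_ns:            # slide the window start
            j += 1
        for k in range(j, i):
            if (in_a[k] and in_b[i]) or (in_b[k] and in_a[i]):
                pairs.append((order[k], order[i]))
    return pairs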

AU Jung, Wonsik Jeon, Eunjin Kang, Eunsong Suk, Heung-Il

EAG-RS: A Novel Explainability-Guided ROI-Selection Framework for ASD Diagnosis via Inter-Regional Relation Learning

Deep learning models based on resting-state functional magnetic resonance imaging (rs-fMRI) have been widely used to diagnose brain diseases, particularly autism spectrum disorder (ASD). Existing studies have leveraged the functional connectivity (FC) of rs-fMRI, achieving notable classification performance. However, they have significant limitations, including the lack of adequate information when using linear low-order FC as input to the model, the failure to consider individual characteristics (i.e., different symptoms or varying stages of severity) among patients with ASD, and the non-explainability of the decision process. To overcome these limitations, we propose a novel explainability-guided region of interest (ROI) selection (EAG-RS) framework that identifies non-linear high-order functional associations among brain regions by leveraging an explainable artificial intelligence technique and selects class-discriminative regions for brain disease identification. The proposed framework includes three steps: (i) inter-regional relation learning to estimate non-linear relations through random seed-based network masking, (ii) explainable connection-wise relevance score estimation to explore high-order relations between functional connections, and (iii) non-linear high-order FC-based diagnosis-informative ROI selection and classifier learning to identify ASD. We validated the effectiveness of the proposed method by conducting experiments using the Autism Brain Imaging Data Exchange (ABIDE) dataset, demonstrating that it outperforms other comparative methods in terms of various evaluation metrics. Furthermore, we qualitatively analyzed the selected ROIs and identified ASD subtypes linked to previous neuroscientific studies.

AU Du, Lei Zhao, Ying Zhang, Jianting Shang, Muheng Zhang, Jin Han, Junwei CA Alzheimers Dis Neuroimaging

Identification of Genetic Risk Factors Based on Disease Progression Derived From Longitudinal Brain Imaging Phenotypes

Neurodegenerative disorders usually develop stage by stage rather than overnight. Thus, cross-sectional brain imaging genetic methods could be insufficient to identify genetic risk factors. Repeatedly collecting imaging data over time appears to solve the problem. However, most existing imaging genetic methods only use longitudinal imaging phenotypes straightforwardly, ignoring the disease progression trajectory, which might be a more stable disease signature. In this paper, we propose a novel sparse multi-task mixed-effects longitudinal imaging genetic method (SMMLING). In our model, disease progression fitting and genetic risk factor identification are conducted jointly. Specifically, SMMLING models the disease progression using longitudinal imaging phenotypes and then associates the fitted disease progression with genetic variations. The baseline status and changing rate, i.e., the intercept and slope of the progression trajectory, are thus used to discover loci of interest, yielding superior and stable performance. To facilitate interpretation and stability, we employ the $\ell_{2,1}$-norm and the fused group lasso (FGL) penalty to identify loci at both the individual level and the group level. SMMLING can be solved by an efficient optimization algorithm that is guaranteed to converge to the global optimum. We evaluate SMMLING on synthetic data and real longitudinal neuroimaging genetic data. Both results show that, compared to existing longitudinal methods, SMMLING can not only decrease the modeling error but also identify more accurate and relevant genetic factors. Most risk loci reported by SMMLING are missed by the comparison methods, indicating its superiority in genetic risk factor identification. Consequently, SMMLING could be a promising computational method for longitudinal imaging genetics.
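To make the regularization structure concrete, one plausible form of the SMMLING objective consistent with the abstract (the paper's exact notation may differ) is

\min_{W}\; \sum_{i=1}^{n} \left\| Y_i - X_i W \right\|_F^2 \;+\; \lambda_1 \lVert W \rVert_{2,1} \;+\; \lambda_2 \sum_{j=2}^{p} \left\lVert w_j - w_{j-1} \right\rVert_2, \qquad \lVert W \rVert_{2,1} = \sum_{j=1}^{p} \lVert w_j \rVert_2,

where row $w_j$ of $W$ collects the effects of the $j$-th genetic variant on the fitted intercepts and slopes, the $\ell_{2,1}$-norm selects loci at the individual level, and the fused term (written here over adjacent variants) imposes the group-level smoothness of the fused group lasso.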

AU Tan, Zhiwei Shi, Fei Zhou, Yi Wang, Jingcheng Wang, Meng Peng, Yuanyuan Xu, Kai Liu, Ming Chen, Xinjian

A Multi-Scale Fusion and Transformer Based Registration Guided Speckle Noise Reduction for OCT Images

Optical coherence tomography (OCT) images are inevitably affected by speckle noise because OCT is based on low-coherence interference. Multi-frame averaging is one of the effective methods to reduce speckle noise. Before averaging, the misalignment between images must be corrected. In this paper, in order to reduce the misalignment between images introduced during acquisition, a novel multi-scale fusion and Transformer based method (MsFTMorph) is proposed for deformable retinal OCT image registration. The proposed method captures global connectivity and locality with a convolutional vision transformer and also incorporates a multi-resolution fusion strategy for learning the global affine transformation. Comparative experiments with other state-of-the-art registration methods demonstrate that the proposed method achieves higher registration accuracy. Guided by the registration, subsequent multi-frame averaging shows better results in speckle noise reduction: the noise is suppressed while the edges are preserved. In addition, our proposed method has strong cross-domain generalization and can be directly applied to images acquired by different scanners with different scanning modes.

AU Hooshangnejad, Hamed China, Debarghya Huang, Yixuan Zbijewski, Wojciech Uneri, Ali McNutt, Todd Lee, Junghoon Ding, Kai

XIOSIS: An X-Ray-Based Intra-Operative Image-Guided Platform for Oncology Smart Material Delivery

Image-guided interventional oncology procedures can greatly enhance the outcome of cancer treatment. As an enhancing procedure, oncology smart material delivery can increase the quality, effectiveness, and safety of cancer therapy. However, the effectiveness of enhancing procedures highly depends on the accuracy of smart material placement; inaccurate placement of smart materials can lead to adverse side effects and health hazards. Image guidance can considerably improve the safety and robustness of smart material delivery. In this study, we developed a novel generative deep-learning platform that highly prioritizes clinical practicality and provides the most informative intra-operative feedback for image-guided smart material delivery. XIOSIS generates a patient-specific 3D volumetric computed tomography (CT) image from three intraoperative radiographs (X-ray images) acquired by a mobile C-arm during the operation. As the first of its kind, XIOSIS (i) synthesizes the CT from small field-of-view radiographs; (ii) reconstructs the intra-operative spacer distribution; (iii) is robust; and (iv) is equipped with a novel soft-contrast cost function. To demonstrate the effectiveness of XIOSIS in providing intra-operative image guidance, we applied XIOSIS to the duodenal hydrogel spacer placement procedure. We evaluated XIOSIS performance in an image-guided virtual spacer placement and in actual spacer placement in two cadaver specimens. XIOSIS showed a clinically acceptable performance, reconstructing the 3D intra-operative hydrogel spacer distribution with an average structural similarity of 0.88 and a Dice coefficient of 0.63, and with less than 1 cm difference in spacer location relative to the spinal cord.

AU Wei, Xingyue Ge, Lin Huang, Lijie Luo, Jianwen Xu, Yan

Unsupervised Non-rigid Histological Image Registration Guided by Keypoint Correspondences Based on Learnable Deep Features with Iterative Training.

Histological image registration is a fundamental task in histological image analysis. It is challenging because of the substantial appearance differences caused by multiple staining. Keypoint correspondences, i.e., matched keypoint pairs, have been introduced to guide unsupervised deep learning (DL) based registration methods to handle such a registration task. This paper proposes an iterative keypoint correspondence-guided (IKCG) unsupervised network for non-rigid histological image registration. Fixed deep features and learnable deep features are introduced as keypoint descriptors to automatically establish keypoint correspondences, the distance between which is used as a loss function to train the registration network. Fixed deep features extracted from DL networks pre-trained on natural image datasets are more discriminative than handcrafted ones, benefiting from the deep and hierarchical nature of DL networks. The intermediate layer outputs of the registration networks trained on histological image datasets are extracted as learnable deep features, which reveal information unique to histological images. An iterative training strategy is adopted to train the registration network and optimize the learnable deep features jointly. Benefiting from the excellent matching ability of the learnable deep features optimized with the iterative training strategy, the proposed method can solve the local non-rigid large-displacement problem, an inevitable problem usually caused by mishandling, such as tears introduced when producing tissue slices. The proposed method was evaluated on the Automatic Non-rigid Histology Image Registration (ANHIR) website and the AutomatiC Registration Of Breast cAncer Tissue (ACROBAT) website, and ranked 1st on both as of August 6th, 2024.
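Establishing correspondences from descriptors, as described above, commonly reduces to a mutual-nearest-neighbour test on feature similarity. A compact NumPy sketch of that generic rule (not necessarily the paper's exact matching criterion), assuming desc_a and desc_b are (N, D) arrays of keypoint descriptors:

import numpy as np

def mutual_nn_matches(desc_a, desc_b):
    # Cosine-similarity matching; keep a pair only if each is the other's nearest neighbour.
    a = desc_a / np.linalg.norm(desc_a, axis=1, keepdims=True)
    b = desc_b / np.linalg.norm(desc_b, axis=1, keepdims=True)
    sim = a @ b.T                        # (Na, Nb) similarity matrix
    nn_ab = sim.argmax(axis=1)           # best match in b for each a
    nn_ba = sim.argmax(axis=0)           # best match in a for each b
    return [(i, j) for i, j in enumerate(nn_ab) if nn_ba[j] == i]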

AU Zhang, Yue Peng, Chengtao Wang, Qiuli Song, Dan Li, Kaiyan Kevin Zhou, S

Unified Multi-Modal Image Synthesis for Missing Modality Imputation.

Multi-modal medical images provide complementary soft-tissue characteristics that aid in the screening and diagnosis of diseases. However, limited scanning time, image corruption and various imaging protocols often result in incomplete multi-modal images, thus limiting the usage of multi-modal data for clinical purposes. To address this issue, in this paper, we propose a novel unified multi-modal image synthesis method for missing modality imputation. Our method adopts a generative adversarial architecture, which aims to synthesize missing modalities from any combination of available ones with a single model. To this end, we specifically design a Commonality- and Discrepancy-Sensitive Encoder for the generator to exploit both the modality-invariant and the modality-specific information contained in the input modalities. The incorporation of both types of information facilitates the generation of images with consistent anatomy and realistic details of the desired distribution. Besides, we propose a Dynamic Feature Unification Module to integrate information from a varying number of available modalities, which makes the network robust to randomly missing modalities. The module performs both hard integration and soft integration, ensuring the effectiveness of feature combination while avoiding information loss. Verified on two public multi-modal magnetic resonance datasets, the proposed method is effective in handling various synthesis tasks and shows superior performance compared to previous methods.

AU Xiao, Jiayin Li, Si Lin, Tongxu Zhu, Jian Yuan, Xiaochen Feng, David Dagan Sheng, Bin

Multi-Label Chest X-Ray Image Classification with Single Positive Labels.

Deep learning approaches for multi-label chest X-ray (CXR) image classification usually require large-scale datasets. However, acquiring such datasets with full annotations is costly, time-consuming, and prone to noisy labels. Therefore, we introduce a weakly supervised learning problem called Single Positive Multi-label Learning (SPML) into CXR image classification (abbreviated as SPML-CXR), in which only one positive label is annotated per image. A simple solution to the SPML-CXR problem is to assume that all unannotated pathological labels are negative; however, this might introduce false negative labels and decrease the model performance. To this end, we present a Multi-level Pseudo-label Consistency (MPC) framework for SPML-CXR. First, inspired by the pseudo-labeling and consistency regularization in semi-supervised learning, we construct a weak-to-strong consistency framework, where the model prediction on a weakly-augmented image is treated as the pseudo label for supervising the model prediction on a strongly-augmented version of the same image, and define an Image-level Perturbation-based Consistency (IPC) regularization to recover potentially mislabeled positive labels. Besides, we incorporate Random Elastic Deformation (RED) as an additional strong augmentation to enhance the perturbation. Second, aiming to expand the perturbation space, we design a perturbation stream for the consistency framework at the feature level and introduce a Feature-level Perturbation-based Consistency (FPC) regularization as a supplement. Third, we design a Transformer-based encoder module to explore the sample relationship within each mini-batch via a Batch-level Transformer-based Correlation (BTC) regularization. Extensive experiments on the CheXpert and MIMIC-CXR datasets have shown the effectiveness of our MPC framework in solving the SPML-CXR problem.
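The weak-to-strong consistency at the heart of the IPC regularization can be sketched in a few lines of PyTorch: confident sigmoid outputs on the weak view become multi-label pseudo labels that supervise the strong view. The confidence threshold tau and the masking rule are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def ipc_consistency_loss(model, x_weak, x_strong, tau=0.7):
    with torch.no_grad():
        p_weak = torch.sigmoid(model(x_weak))               # (N, n_labels)
    confident = ((p_weak > tau) | (p_weak < 1 - tau)).float()  # keep confident labels
    pseudo = (p_weak > 0.5).float()
    loss = F.binary_cross_entropy_with_logits(
        model(x_strong), pseudo, reduction="none")
    return (loss * confident).sum() / confident.sum().clamp(min=1.0)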

AU Yang, Yuming Duan, Huilong Zheng, Yinfei

Improved Transcranial Plane-Wave Imaging With Learned Speed-of-Sound Maps

Although transcranial ultrasound plane-wave imaging (PWI) has promising clinical application prospects, studies have shown that a variable speed of sound (SoS) can seriously degrade the quality of ultrasound images. The mismatch between the conventional constant-velocity assumption and the actual SoS distribution leads to general blurring of ultrasound images. The optimization scheme for reconstructing transcranial ultrasound images is often solved using iterative methods like full-waveform inversion, which are computationally expensive and rely on prior magnetic resonance imaging (MRI) or computed tomography (CT) information. In contrast, the multi-stencils fast marching (MSFM) method can produce accurate travel-time maps for a skull with heterogeneous acoustic speed. In this study, we first propose a convolutional neural network (CNN) to predict SoS maps of the skull from PWI channel data, and then use these maps to correct the travel times to reduce transcranial aberration. To validate the performance of the proposed method, numerical, phantom and intact human skull studies were conducted using a linear array transducer (L11-5v, 128 elements, pitch = 0.3 mm). Numerical simulations demonstrate that for point targets, the lateral resolution of MSFM-restored images increased by 65%, and the center position shift decreased by 89%. For cyst targets, the eccentricity of the fitting ellipse decreased by 75%, and the center position shift decreased by 58%. In the phantom study, the lateral resolution of MSFM-restored images increased by 49%, and the position shift was reduced by 1.72 mm. This pipeline, termed AutoSoS, thus shows the potential to correct distortions in real-time transcranial ultrasound imaging, as demonstrated by experiments on the intact human skull.

AU Zhang, Binyu Meng, Zhu Li, Hongyuan Zhao, Zhicheng Su, Fei

MTCSNet: One-stage learning and two-point labeling are sufficient for cell segmentation.

Deep convolutional neural networks have been widely used in medical image analysis, for tasks such as lesion identification in whole-slide images, cancer detection, and cell segmentation. However, researchers often have no choice but to painstakingly refine annotations to enhance model performance, especially for the cell segmentation task. Weakly supervised learning can greatly reduce the annotation workload, but a large performance gap remains between weakly and fully supervised learning approaches. In this work, we propose a weakly-supervised cell segmentation method, namely the Multi-Task Cell Segmentation Network (MTCSNet), for multi-modal medical images, including pathological, brightfield, fluorescent, phase-contrast and differential interference contrast images. MTCSNet is learned in a single-stage training manner, where only two annotated points for each cell provide supervision information: the first is the centroid, and the second is a point on its boundary. Additionally, five auxiliary tasks are elaborately designed to train the network, including two pixel-level classification tasks, a pixel-level regression task, local temperature scaling, and an instance-level distance regression task that regresses the distances between the cell centroid and its boundaries in eight orientations. The experimental results indicate that our method outperforms all state-of-the-art weakly-supervised cell segmentation approaches on public multi-modal medical image datasets. The promising performance also shows that single-stage learning with a two-point labeling approach is sufficient for cell segmentation, instead of fine contour delineation. The codes are available at: https://github.com/binging512/MTCSNet.

AU Zhu, Jianjun Wang, Cheng Zhang, Yi Zhan, Meixiao Zhao, Wei Teng, Sitong Lu, Ligong Teng, Gao-Jun

3D/2D Vessel Registration Based on Monte Carlo Tree Search and Manifold Regularization

The augmented intra-operative real-time imaging in vascular interventional surgery, which is generally performed by projecting preoperative computed tomography angiography images onto intraoperative digital subtraction angiography (DSA) images, can compensate for the deficiencies of DSA-based navigation, such as lack of depth information and excessive use of toxic contrast agents. 3D/2D vessel registration is the critical step in image augmentation. A 3D/2D registration method based on vessel graph matching is proposed in this study. For rigid registration, the matching of vessel graphs can be decomposed into continuous states, thus 3D/2D vascular registration is formulated as a search tree problem. The Monte Carlo tree search method is applied to find the optimal vessel matching associated with the highest rigid registration score. For nonrigid registration, we propose a novel vessel deformation model based on manifold regularization. This model incorporates the smoothness constraint of vessel topology into the objective function. Furthermore, we derive simplified gradient formulas that enable fast registration. The proposed technique undergoes evaluation against seven rigid and three nonrigid methods using a variety of data - simulated, algorithmically generated, and manually annotated - across three vascular anatomies: the hepatic artery, coronary artery, and aorta. Our findings show the proposed method's resistance to pose variations, noise, and deformations, outperforming existing methods in terms of registration accuracy and computational efficiency. The proposed method demonstrates average registration errors of 2.14 mm and 0.34 mm for rigid and nonrigid registration, and an average computation time of 0.51 s.
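Since rigid registration is formulated above as a search over vessel-matching states, the backbone is a standard Monte Carlo tree search loop: select by an upper-confidence rule, expand, score the resulting match, and backpropagate. A generic Python sketch of one iteration with UCB1 selection; the node dictionary layout and the expand/simulate callables (the latter returning a rigid registration score) are illustrative assumptions, not the paper's interface.

import math
import random

def ucb_select(node, c=1.4):
    # UCB1: trade off a child's mean reward against how rarely it has been visited.
    return max(node["children"],
               key=lambda ch: ch["value"] / max(ch["visits"], 1)
               + c * math.sqrt(math.log(node["visits"] + 1) / max(ch["visits"], 1)))

def mcts_step(root, expand, simulate):
    # Nodes are dicts: {"children": [], "visits": 0, "value": 0.0, ...state...}.
    path, node = [root], root
    while node["children"]:                 # selection
        node = ucb_select(node)
        path.append(node)
    children = expand(node)                 # expansion: candidate next vessel matches
    if children:
        node["children"] = children
        node = random.choice(children)
        path.append(node)
    reward = simulate(node)                 # evaluation: rigid registration score
    for n in path:                          # backpropagation
        n["visits"] += 1
        n["value"] += reward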

AU Guan, Yu Yu, Chuanming Cui, Zhuoxu Zhou, Huilin Liu, Qiegen

Correlated and Multi-frequency Diffusion Modeling for Highly Under-sampled MRI Reconstruction.

Given the difficulty of enhancing the reconstruction accuracy for diagnostically significant tissues, most existing MRI reconstruction methods perform reconstruction of the entire MR image without considering fine details, especially when dealing with highly under-sampled images. Therefore, considerable effort has been directed towards surmounting this challenge, as evidenced by the emergence of numerous methods dedicated to preserving high-frequency content as well as fine textural details in the reconstructed image. In this case, exploring the merits of each method of mining high-frequency information and formulating a reasonable principle to maximize their joint utilization will be a more effective way to achieve accurate reconstruction. Specifically, this work constructs an innovative principle named the Correlated and Multi-frequency Diffusion Model (CM-DM) for highly under-sampled MRI reconstruction. In essence, the rationale underlying this principle lies not in assembling arbitrary models, but in pursuing effective combinations and replacement of components. It also means that the novel principle focuses on forming a correlated and multi-frequency prior through different high-frequency operators in the diffusion process. Moreover, the multi-frequency prior further constrains the noise term to be closer to the target distribution in the frequency domain, thereby making the diffusion process converge faster. Experimental results verify that the proposed method achieves superior reconstruction accuracy, with a notable enhancement of approximately 2 dB in PSNR compared to state-of-the-art methods.

AU Huang, Peizhou Zhang, Chaoyi Zhang, Xiaoliang Li, Xiaojuan Dong, Liang Ying, Leslie

Self-Supervised Deep Unrolled Reconstruction Using Regularization by Denoising

Deep learning methods have been successfully used in various computer vision tasks. Inspired by that success, deep learning has been explored in magnetic resonance imaging (MRI) reconstruction. In particular, integrating deep learning and model-based optimization methods has shown considerable advantages. However, a large amount of labeled training data is typically needed for high reconstruction quality, which is challenging for some MRI applications. In this paper, we propose a novel reconstruction method, named DURED-Net, that enables interpretable self-supervised learning for MR image reconstruction by combining a self-supervised denoising network and a plug-and-play method. We aim to boost the reconstruction performance of Noise2Noise in MR reconstruction by adding an explicit prior that utilizes imaging physics. Specifically, the denoising network is leveraged for MRI reconstruction using Regularization by Denoising (RED). Experimental results demonstrate that the proposed method requires less training data to achieve high reconstruction quality than state-of-the-art approaches utilizing Noise2Noise.
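Regularization by Denoising plugs a denoiser D into an explicit prior whose gradient, under RED's conditions, is lam * (x - D(x)); one update therefore adds this term to the data-fit gradient. The NumPy sketch below uses a single-coil Cartesian MRI forward model as an assumed example, not DURED-Net's exact pipeline.

import numpy as np

def grad_datafit(x, y, mask):
    # Gradient of 0.5 * || M F x - y ||^2: x is a complex image, y the acquired
    # k-space (zero outside the mask), mask the Cartesian sampling pattern.
    k = np.fft.fft2(x, norm="ortho")
    return np.fft.ifft2(mask * (mask * k - y), norm="ortho")

def red_step(x, y, mask, denoiser, mu=0.5, lam=0.1):
    # One RED gradient step: x <- x - mu * (A^H (A x - y) + lam * (x - D(x))).
    return x - mu * (grad_datafit(x, y, mask) + lam * (x - denoiser(x)))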

AU Pei, Yuchen Zhao, Fenqiang Zhong, Tao Ma, Laifa Liao, Lufan Wu, Zhengwang Wang, Li Zhang, He Wang, Lisheng Li, Gang

PETS-Nets: Joint Pose Estimation and Tissue Segmentation of Fetal Brains Using Anatomy-Guided Networks

Fetal Magnetic Resonance Imaging (MRI) is challenged by fetal movements and maternal breathing. Although fast MRI sequences allow artifact free acquisition of individual 2D slices, motion frequently occurs in the acquisition of spatially adjacent slices. Motion correction for each slice is thus critical for the reconstruction of 3D fetal brain MRI. In this paper, we propose a novel multi-task learning framework that adopts a coarse-to-fine strategy to jointly learn the pose estimation parameters for motion correction and tissue segmentation map of each slice in fetal MRI. Particularly, we design a regression-based segmentation loss as a deep supervision to learn anatomically more meaningful features for pose estimation and segmentation. In the coarse stage, a U-Net-like network learns the features shared for both tasks. In the refinement stage, to fully utilize the anatomical information, signed distance maps constructed from the coarse segmentation are introduced to guide the feature learning for both tasks. Finally, iterative incorporation of the signed distance maps further improves the performance of both regression and segmentation progressively. Experimental results of cross-validation across two different fetal datasets acquired with different scanners and imaging protocols demonstrate the effectiveness of the proposed method in reducing the pose estimation error and obtaining superior tissue segmentation results simultaneously, compared with state-of-the-art methods.

AU van Gogh, Stefano Mukherjee, Subhadip Rawlik, Michal Pereira, Alexandre Spindler, Simon Zdora, Marie-Christine Stauber, Martin Varga, Zsuzsanna Stampanoni, Marco

Data-Driven Gradient Regularization for Quasi-Newton Optimization in Iterative Grating Interferometry CT Reconstruction

Grating interferometry CT (GI-CT) is a promising technology that could play an important role in future breast cancer imaging. Thanks to its sensitivity to refraction and small-angle scattering, GI-CT could augment the diagnostic content of conventional absorption-based CT. However, reconstructing GI-CT tomographies is a complex task because of the ill-conditioning of the problem and the high noise amplitudes. It has previously been shown that combining data-driven regularization with iterative reconstruction is promising for tackling challenging inverse problems in medical imaging. In this work, we present an algorithm that allows seamless combination of data-driven regularization with quasi-Newton solvers, which can better deal with ill-conditioned problems than gradient descent-based optimization algorithms. Contrary to most available algorithms, our method applies regularization in the gradient domain rather than in the image domain. This comes with a crucial advantage when applied in conjunction with quasi-Newton solvers: the Hessian is approximated solely based on denoised data. We apply the proposed method, which we call GradReg, to both conventional breast CT and GI-CT and show that both significantly benefit from our approach in terms of dose efficiency. Moreover, our results suggest that thanks to its sharper gradients, which carry more high spatial-frequency content, GI-CT can benefit more from GradReg than conventional breast CT. Crucially, GradReg can be applied to any image reconstruction task that relies on gradient-based updates.

AU Guo, Pengfei Mei, Yiqun Zhou, Jinyuan Jiang, Shanshan Patel, Vishal M.

ReconFormer: Accelerated MRI Reconstruction Using Recurrent Transformer

Accelerated magnetic resonance imaging (MRI) reconstruction is a challenging ill-posed inverse problem due to the aggressive under-sampling of k-space. In this paper, we propose a recurrent Transformer model, namely ReconFormer, for MRI reconstruction, which can iteratively reconstruct high-fidelity magnetic resonance images from highly under-sampled k-space data (e.g., up to 8x acceleration). In particular, the proposed architecture is built upon Recurrent Pyramid Transformer Layers (RPTLs). The core design of the proposed method is Recurrent Scale-wise Attention (RSA), which jointly exploits intrinsic multi-scale information at every architecture unit as well as the dependencies of the deep feature correlation through recurrent states. Moreover, benefiting from its recurrent nature, ReconFormer is lightweight compared to other baselines and contains only 1.1 M trainable parameters. We validate the effectiveness of ReconFormer on multiple datasets with different magnetic resonance sequences and show that it achieves significant improvements over the state-of-the-art methods with better parameter efficiency. The implementation code and pre-trained weights are available at https://github.com/guopengf/ReconFormer.
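Iterative accelerated-MRI networks of this kind are usually interleaved with a data-consistency step that re-inserts the acquired k-space samples into the current estimate. The NumPy sketch below shows that standard step for single-coil Cartesian sampling; it illustrates the general recipe rather than ReconFormer's specific layers.

import numpy as np

def data_consistency(x_rec, y_sampled, mask):
    # Replace the reconstruction's k-space values with the measured ones
    # wherever samples were actually acquired.
    k = np.fft.fft2(x_rec, norm="ortho")
    k = np.where(mask.astype(bool), y_sampled, k)
    return np.fft.ifft2(k, norm="ortho")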

AU Song, Zhiyun Du, Penghui Yan, Junpeng Li, Kailu Shou, Jianzhong Lai, Maode Fan, Yubo Xu, Yan

Nucleus-Aware Self-Supervised Pretraining Using Unpaired Image-to-Image Translation for Histopathology Images

Self-supervised pretraining attempts to enhance model performance by obtaining effective features from unlabeled data, and has demonstrated its effectiveness in the field of histopathology images. Despite its success, few works concentrate on the extraction of nucleus-level information, which is essential for pathologic analysis. In this work, we propose a novel nucleus-aware self-supervised pretraining framework for histopathology images. The framework aims to capture the nuclear morphology and distribution information through unpaired image-to-image translation between histopathology images and pseudo mask images. The generation process is modulated by both conditional and stochastic style representations, ensuring the reality and diversity of the generated histopathology images for pretraining. Further, an instance segmentation guided strategy is employed to capture instance-level information. The experiments on 7 datasets show that the proposed pretraining method outperforms supervised ones on Kather classification, multiple instance learning, and 5 dense-prediction tasks with the transfer learning protocol, and yields superior results than other self-supervised approaches on 8 semi-supervised tasks. Our project is publicly available at https://github.com/zhiyuns/UNITPathSSL.

AU Fontanella, Alessandro Mair, Grant Wardlaw, Joanna Trucco, Emanuele Storkey, Amos

Diffusion Models for Counterfactual Generation and Anomaly Detection in Brain Images.

Segmentation masks of pathological areas are useful in many medical applications, such as brain tumour and stroke management. Moreover, healthy counterfactuals of diseased images can be used to enhance radiologists' training files and to improve the interpretability of segmentation models. In this work, we present a weakly supervised method to generate a healthy version of a diseased image and then use it to obtain a pixel-wise anomaly map. To do so, we start by considering a saliency map that approximately covers the pathological areas, obtained with ACAT. Then, we propose a technique that allows to perform targeted modifications to these regions, while preserving the rest of the image. In particular, we employ a diffusion model trained on healthy samples and combine Denoising Diffusion Probabilistic Model (DDPM) and Denoising Diffusion Implicit Model (DDIM) at each step of the sampling process. DDPM is used to modify the areas affected by a lesion within the saliency map, while DDIM guarantees reconstruction of the normal anatomy outside of it. The two parts are also fused at each timestep, to guarantee the generation of a sample with a coherent appearance and a seamless transition between edited and unedited parts. We verify that when our method is applied to healthy samples, the input images are reconstructed without significant modifications. We compare our approach with alternative weakly supervised methods on the task of brain lesion segmentation, achieving the highest mean Dice and IoU scores among the models considered.
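The per-step fusion described above can be written compactly: both updates share one noise prediction, the DDPM branch adds stochasticity inside the saliency mask, and the DDIM branch (eta = 0) deterministically reconstructs outside it. A PyTorch sketch under the standard DDPM/DDIM parameterizations; the exact blending schedule in the paper may differ.

import torch

def fused_reverse_step(x_t, t, eps_model, mask, alphas_cumprod):
    # x_t: current sample; mask: 1 inside the saliency (edit) region, 0 outside.
    a_t = alphas_cumprod[t]
    a_prev = alphas_cumprod[t - 1] if t > 0 else torch.ones_like(a_t)
    eps = eps_model(x_t, t)
    x0 = (x_t - (1 - a_t).sqrt() * eps) / a_t.sqrt()     # predicted clean image
    # DDIM (eta = 0): deterministic update, preserves anatomy outside the mask.
    x_ddim = a_prev.sqrt() * x0 + (1 - a_prev).sqrt() * eps
    # DDPM: posterior mean plus noise, re-samples the masked (lesion) region.
    beta_t = 1 - a_t / a_prev
    mean = (x_t - beta_t / (1 - a_t).sqrt() * eps) / (1 - beta_t).sqrt()
    noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
    return mask * (mean + beta_t.sqrt() * noise) + (1 - mask) * x_ddim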

AU Wu, Jianghao Guo, Dong Wang, Guotai Yue, Qiang Yu, Huijun Li, Kang Zhang, Shaoting

FPL+: Filtered Pseudo Label-Based Unsupervised Cross-Modality Adaptation for 3D Medical Image Segmentation

Adapting a medical image segmentation model to a new domain is important for improving its cross-domain transferability, and due to the expensive annotation process, Unsupervised Domain Adaptation (UDA) is appealing where only unlabeled images are needed for the adaptation. Existing UDA methods are mainly based on image or feature alignment with adversarial training for regularization, and they are limited by insufficient supervision in the target domain. In this paper, we propose an enhanced Filtered Pseudo Label (FPL+)-based UDA method for 3D medical image segmentation. It first uses cross-domain data augmentation to translate labeled images in the source domain to a dual-domain training set consisting of a pseudo source-domain set and a pseudo target-domain set. To leverage the dual-domain augmented images to train a pseudo label generator, domain-specific batch normalization layers are used to deal with the domain shift while learning the domain-invariant structure features, generating high-quality pseudo labels for target-domain images. We then combine labeled source-domain images and target-domain images with pseudo labels to train a final segmentor, where image-level weighting based on uncertainty estimation and pixel-level weighting based on dual-domain consensus are proposed to mitigate the adverse effect of noisy pseudo labels. Experiments on three public multi-modal datasets for Vestibular Schwannoma, brain tumor and whole heart segmentation show that our method surpassed ten state-of-the-art UDA methods, and it even achieved better results than fully supervised learning in the target domain in some cases.
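Training the final segmentor above hinges on down-weighting unreliable pseudo labels. The PyTorch sketch below shows one plausible weighting scheme, exponentially discounting voxel-level and image-level uncertainty; the paper derives its weights from uncertainty estimation and dual-domain consensus, so the exact formulas may differ.

import torch
import torch.nn.functional as F

def weighted_pseudo_label_loss(logits, pseudo_label, voxel_unc, image_unc):
    # logits: (N, C, D, H, W); pseudo_label: (N, D, H, W) integer labels;
    # voxel_unc: (N, D, H, W) voxel-wise uncertainty; image_unc: (N,) per-volume.
    w_voxel = torch.exp(-voxel_unc)                         # pixel-level weighting
    w_image = torch.exp(-image_unc).view(-1, 1, 1, 1)       # image-level weighting
    ce = F.cross_entropy(logits, pseudo_label, reduction="none")
    w = w_image * w_voxel
    return (w * ce).sum() / w.sum().clamp(min=1e-6)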

AU Yang, Chen Wang, Kailing Wang, Yuehao Dou, Qi Yang, Xiaokang Shen, Wei

Efficient Deformable Tissue Reconstruction via Orthogonal Neural Plane

Intraoperative imaging techniques for reconstructing deformable tissues in vivo are pivotal for advanced surgical systems. Existing methods either compromise on rendering quality or are excessively computationally intensive, often demanding dozens of hours to perform, which significantly hinders their practical application. In this paper, we introduce Fast Orthogonal Plane (Forplane), a novel, efficient framework based on neural radiance fields (NeRF) for the reconstruction of deformable tissues. We conceptualize surgical procedures as 4D volumes, and break them down into static and dynamic fields comprised of orthogonal neural planes. This factorization discretizes the four-dimensional space, leading to a decreased memory usage and faster optimization. A spatiotemporal importance sampling scheme is introduced to improve performance in regions with tool occlusion as well as large motions and accelerate training. An efficient ray marching method is applied to skip sampling among empty regions, significantly improving inference speed. Forplane accommodates both binocular and monocular endoscopy videos, demonstrating its extensive applicability and flexibility. Our experiments, carried out on two in vivo datasets, the EndoNeRF and Hamlyn datasets, demonstrate the effectiveness of our framework. In all cases, Forplane substantially accelerates both the optimization process (by over 100 times) and the inference process (by over 15 times) while maintaining or even improving the quality across a variety of non-rigid deformations. This significant performance improvement promises to be a valuable asset for future intraoperative surgical applications. The code of our project is now available at https://github.com/Loping151/ForPlane.

AU Xu, Zhenghua Liu, Yunxin Xu, Gang Lukasiewicz, Thomas

Self-Supervised Medical Image Segmentation Using Deep Reinforced Adaptive Masking.

Self-supervised learning aims to learn transferable representations from unlabeled data for downstream tasks. Inspired by masked language modeling in natural language processing, masked image modeling (MIM) has achieved certain success in the field of computer vision, but its effectiveness in medical images remains unsatisfactory. This is mainly due to the high redundancy and small discriminative regions in medical images compared to natural images. Therefore, this paper proposes an adaptive hard masking (AHM) approach based on deep reinforcement learning to expand the application of MIM in medical images. Unlike predefined random masks, AHM uses an asynchronous advantage actor-critic (A3C) model to predict reconstruction loss for each patch, enabling the model to learn where masking is valuable. By optimizing the non-differentiable sampling process using reinforcement learning, AHM enhances the understanding of key regions, thereby improving downstream task performance. Experimental results on two medical image datasets demonstrate that AHM outperforms state-of-the-art methods. Additional experiments under various settings validate the effectiveness of AHM in constructing masked images.

EI 1558-254X DA 2024-08-03 UT MEDLINE:39088493 PM 39088493 ER

AU Cai, Linqin Fang, Haodu Xu, Nuoying Ren, Bo

Counterfactual Causal-Effect Intervention for Interpretable Medical Visual Question Answering.

Medical Visual Question Answering (VQA-Med) is a challenging task that involves answering clinical questions related to medical images. However, most current VQA-Med methods ignore the causal correlation between specific lesion or abnormality features and answers, and also fail to provide accurate explanations for their decisions. To explore the interpretability of VQA-Med, this paper proposes a novel CCIS-MVQA model for VQA-Med based on a counterfactual causal-effect intervention strategy. This model consists of a modified ResNet for image feature extraction, a GloVe decoder for question feature extraction, a bilinear attention network for vision and language feature fusion, and an interpretability generator for producing the interpretability and prediction results. The proposed CCIS-MVQA introduces a layer-wise relevance propagation method to automatically generate counterfactual samples. Additionally, CCIS-MVQA applies counterfactual causal reasoning throughout the training phase to enhance interpretability and generalization. Extensive experiments on three benchmark datasets show that the proposed CCIS-MVQA model outperforms the state-of-the-art methods. Abundant visualization results are produced to analyze the interpretability and performance of CCIS-MVQA.

AU Zhu, Enjun Feng, Haiyu Chen, Long Lai, Yongqiang Chai, Senchun

MP-Net: A Multi-Center Privacy-Preserving Network for Medical Image Segmentation

In this paper, we present the Multi-Center Privacy-Preserving Network (MP-Net), a novel framework designed for secure medical image segmentation in multi-center collaborations. Our methodology offers a new approach to multi-center collaborative learning, capable of reducing the volume of data transmission and enhancing data privacy protection. Unlike federated learning, which requires the transmission of model data between the central server and local servers in each round, our method only necessitates a single transfer of encrypted data. The proposed MP-Net comprises a three-layer model, consisting of encryption, segmentation, and decryption networks. We encrypt the image data into ciphertext using an encryption network and introduce an improved U-Net for image ciphertext segmentation. Finally, the segmentation mask is obtained through a decryption network. This architecture enables ciphertext-based image segmentation through computable image encryption. We evaluate the effectiveness of our approach on three datasets, including two cardiac MRI datasets and a CTPA dataset. Our results demonstrate that the MP-Net can securely utilize data from multiple centers to establish a more robust and information-rich segmentation model.
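
The encrypt-segment-decrypt pipeline described above can be pictured as three chained modules. The following is a conceptual sketch with placeholder networks; the paper's computable image encryption and improved U-Net are not reproduced, and all names are illustrative.

```python
# Conceptual sketch of an encrypt -> segment-on-ciphertext -> decrypt pipeline;
# enc, seg, and dec are placeholders for the encryption network, the improved
# U-Net, and the decryption network, which are not reproduced here.
import torch
import torch.nn as nn

class MPNetPipeline(nn.Module):
    def __init__(self, enc: nn.Module, seg: nn.Module, dec: nn.Module):
        super().__init__()
        self.enc, self.seg, self.dec = enc, seg, dec

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        cipher = self.enc(image)        # local center: image -> ciphertext
        cipher_mask = self.seg(cipher)  # shared server: segment the ciphertext
        return self.dec(cipher_mask)    # local center: ciphertext mask -> mask

# with identity placeholders, the pipeline is runnable end to end:
pipe = MPNetPipeline(nn.Identity(), nn.Identity(), nn.Identity())
mask = pipe(torch.rand(1, 1, 128, 128))
```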

AU Le, Tuan-Anh Bui, Minh Phu Hadadian, Yaser Gadelmowla, Khaled Mohamed Oh, Seungjun Im, Chaemin Hahn, Seungyong Yoon, Jungwon

Towards human-scale magnetic particle imaging: development of the first system with superconductor-based selection coils.

Magnetic Particle Imaging (MPI) is an emerging tomographic modality that allows for precise three-dimensional (3D) mapping of magnetic nanoparticle (MNP) concentration and distribution. Although significant progress has been made towards improving MPI since its introduction, scaling it up for human applications has proven challenging. High-quality images have been obtained in animal-scale MPI scanners with gradients up to 7 T/m/mu0; however, for MPI systems with bore diameters around 200 mm, the gradients generated by electromagnets drop significantly, to below 0.5 T/m/mu0. Given the current technological limitations in image reconstruction and the properties of available MNPs, these low gradients inherently impose limitations on improving MPI resolution for higher-precision medical imaging. Utilizing superconductors stands out as a promising approach for developing a human-scale MPI system. In this study, we introduce, for the first time, a human-scale amplitude-modulated (AM) MPI system with superconductor-based selection coils. The system achieves an unprecedented magnetic field gradient of up to 2.5 T/m/mu0 within a 200 mm bore diameter, enabling a large field of view of 100 * 130 * 98 mm3 at 2.5 T/m/mu0 for 3D imaging. While the obtained spatial resolution is on the order of previous animal-scale AM MPI systems, incorporating superconductors to achieve such high gradients in a 200 mm bore diameter marks a major step toward clinical MPI.

AU Wang, Yanyang Li, Zirong Wu, Weiwen

Time-reversion Fast-sampling Score-based Model for Limited-angle CT Reconstruction.

The score-based generative model (SGM) has received significant attention in the field of medical imaging, particularly in the context of limited-angle computed tomography (LACT). Traditional SGM approaches achieved robust reconstruction performance by incorporating a substantial number of sampling steps during the inference phase. However, these established SGM-based methods incur a large computational cost to reconstruct a single case. The main challenge lies in achieving high-quality images with rapid sampling while preserving sharp edges and small features. In this study, we propose an innovative rapid-sampling strategy for SGM, which we have aptly named the time-reversion fast-sampling (TIFA) score-based model for LACT reconstruction. The entire sampling procedure adheres steadfastly to the principles of robust optimization theory and is firmly grounded in a comprehensive mathematical model. TIFA's rapid-sampling mechanism comprises several essential components, including jump sampling, time-reversion with re-sampling, and compressed sampling. In the initial jump sampling stage, multiple sampling steps are bypassed to expedite the attainment of preliminary results. Subsequently, during the time-reversion process, the initial results undergo controlled corruption by introducing small-scale noise. The re-sampling process then diligently refines the initially corrupted results. Finally, compressed sampling fine-tunes the refinement outcomes by imposing a regularization term. Quantitative and qualitative assessments conducted on numerical simulations, a real physical phantom, and clinical cardiac datasets unequivocally demonstrate that the TIFA method (using 200 steps) outperforms other state-of-the-art methods (using 2000 steps) for scanning ranges of [0°, 90°] and [0°, 60°]. Furthermore, experimental results underscore that our TIFA method continues to reconstruct high-quality images even with as few as 10 steps. Our code is available at https://github.com/tianzhijiaoziA/TIFADiffusion.
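
The four-stage sampling procedure can be summarized schematically as below. This is one interpretation under stated assumptions, not the authors' implementation: denoise_step stands for a generic reverse-diffusion update, and the compressed-sampling regularization stage is omitted.

```python
# Schematic of the fast-sampling stages (an interpretation, not the authors'
# implementation); denoise_step is a generic reverse-diffusion update and the
# compressed-sampling regularization stage is omitted.
import torch

def tifa_style_sample(x, denoise_step, jump_ts, revert_sigma, refine_ts):
    for t in jump_ts:                            # jump sampling: sparse schedule
        x = denoise_step(x, t)
    x = x + revert_sigma * torch.randn_like(x)   # time-reversion: controlled corruption
    for t in refine_ts:                          # short re-sampling refinement
        x = denoise_step(x, t)
    return x
```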

AU Guo, Zhanqiang Tan, Zimeng Feng, Jianjiang Zhou, Jie

3D Vascular Segmentation Supervised by 2D Annotation of Maximum Intensity Projection

Vascular structure segmentation plays a crucial role in medical analysis and clinical applications. The practical adoption of fully supervised segmentation models is impeded by the intricacy and time-consuming nature of annotating vessels in the 3D space. This has spurred the exploration of weakly-supervised approaches that reduce reliance on expensive segmentation annotations. Despite this, existing weakly supervised methods employed in organ segmentation, which encompass points, bounding boxes, or graffiti, have exhibited suboptimal performance when handling sparse vascular structure. To alleviate this issue, we employ maximum intensity projection (MIP) to decrease the dimensionality of 3D volume to 2D image for efficient annotation, and the 2D labels are utilized to provide guidance and oversight for training 3D vessel segmentation model. Initially, we generate pseudo-labels for 3D blood vessels using the annotations of 2D projections. Subsequently, taking into account the acquisition method of the 2D labels, we introduce a weakly-supervised network that fuses 2D-3D deep features via MIP to further improve segmentation performance. Furthermore, we integrate confidence learning and uncertainty estimation to refine the generated pseudo-labels, followed by fine-tuning the segmentation network. Our method is validated on five datasets (including cerebral vessel, aorta and coronary artery), demonstrating highly competitive performance in segmenting vessels and the potential to significantly reduce the time and effort required for vessel annotation. Our code is available at: https://github.com/gzq17/Weakly-Supervised-by-MIP.
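
The core dimensionality-reduction step is easy to make concrete: a MIP collapses the volume along one axis, and the voxel that produced each projected pixel can inherit the 2D label. A minimal numpy sketch, assuming a single projection axis and omitting the paper's confidence learning and refinement stages:

```python
# Minimal numpy sketch of MIP-based weak supervision, assuming one projection
# axis; the confidence learning and refinement stages are omitted.
import numpy as np

def mip_and_pseudo_labels(volume: np.ndarray, mip_label: np.ndarray):
    """volume: (D, H, W) intensities; mip_label: (H, W) 2D vessel annotation."""
    mip = volume.max(axis=0)              # 2D maximum intensity projection
    argmax_depth = volume.argmax(axis=0)  # voxel that produced each MIP pixel
    pseudo = np.zeros(volume.shape, dtype=bool)
    h, w = np.nonzero(mip_label)          # annotated vessel pixels
    pseudo[argmax_depth[h, w], h, w] = True  # backproject the 2D label to 3D
    return mip, pseudo
```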

AU Shi, Yongyi Gao, Yongfeng Xu, Qiong Li, Yang Zhang, Chaoyang Mou, Xuanqin Liang, Zhengrong

Learned Tensor Neural Network Texture Prior for Photon-Counting CT Reconstruction.

Photon-counting computed tomography (PCCT) reconstructs multiple energy-channel images to describe the same object, and there exists a strong correlation among the different channel images. In addition, the reconstruction of each channel image suffers from photon starvation. To make full use of the correlation among channel images to suppress data noise and enhance texture details when reconstructing each channel image, this paper proposes a tensor neural network (TNN) architecture to learn a multi-channel texture prior for PCCT reconstruction. Specifically, we first learn a spatial texture prior in each individual channel image by modeling the relationship between the center pixels and their corresponding neighbor pixels using a neural network. Then, we merge the single-channel spatial texture prior into a multi-channel neural network to learn the spectral local correlation information among different channel images. Since the proposed TNN is trained on a series of unpaired small spatial-spectral cubes extracted from a single reference multi-channel image, the local correlation in the spatial-spectral cubes is considered by the TNN. To boost the TNN performance, a low-rank representation is also employed to consider the global correlation among different channel images. Finally, we integrate the learned TNN and the low-rank representation as priors into a Bayesian reconstruction framework. To evaluate the performance of the proposed method, four references are considered: simulated images from ultra-high-resolution CT, spectral images from dual-energy CT, and animal tissue and preclinical mouse images from a custom-made PCCT system. Our TNN-prior Bayesian reconstruction demonstrated better performance than other state-of-the-art competing algorithms, in terms of both preserving texture features and suppressing image noise in each channel image.
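
The per-channel texture prior amounts to regressing each center pixel from its neighbors. A minimal sketch under assumed layer sizes follows; the multi-channel merging, low-rank term, and Bayesian integration are omitted.

```python
# Minimal sketch of the single-channel texture prior: a small network predicts
# each center pixel from its neighbors, and the prediction error can act as a
# texture penalty in reconstruction. Layer sizes are assumptions; the
# multi-channel merging and low-rank term are omitted.
import torch
import torch.nn as nn

class TexturePrior(nn.Module):
    def __init__(self, n_neighbors: int = 8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_neighbors, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, neighbors: torch.Tensor) -> torch.Tensor:
        # neighbors: (B, 8) ring around each pixel -> predicted center (B, 1)
        return self.net(neighbors)

prior = TexturePrior()
penalty = (prior(torch.rand(1024, 8)) - torch.rand(1024, 1)).pow(2).mean()
```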

AU Tenditnaya, Anna Gabriels, Ruben Y Hooghiemstra, Wouter T R Klemm, Uwe Nagengast, Wouter B Ntziachristos, Vasilis Gorpas, Dimitris

Performance Assessment and Quality Control of Fluorescence Molecular Endoscopy with a Multi-Parametric Rigid Standard.

Fluorescence molecular endoscopy (FME) is emerging as a "red-flag" technique with potential to deliver earlier, faster, and more personalized detection of disease in the gastrointestinal tract, including cancer, and to gain insights into novel drug distribution, dose finding, and response prediction. However, to date, the performance of FME systems is assessed mainly by endoscopists during a procedure, leading to arbitrary, potentially biased, and heavily subjective assessment. This approach significantly affects the repeatability of the procedures and the interpretation or comparison of the acquired data, representing a major bottleneck towards the clinical translation of the technology. Herein, we propose a robust methodology for FME performance assessment and quality control that is based on a novel multi-parametric rigid standard. This standard enables the characterization of an FME system's sensitivity through a single acquisition, performance comparison of multiple systems, and, for the first time, quality control of a system as a function of time and number of usages. We show the photostability of the standard experimentally and demonstrate how it can be used to characterize the performance of an FME system. Moreover, we showcase how the standard can be employed for quality control of a system. In this study, we find that the use of composite fluorescence standards before endoscopic procedures can ensure that an FME system meets the performance criteria and that components prone to performance degradation are replaced in time, avoiding disruption of clinical endoscopy logistics. This will help overcome a major barrier for the translation of FME into the clinics.

AU Huang, Xiaofei Gong, Hongfang

A Dual-Attention Learning Network With Word and Sentence Embedding for Medical Visual Question Answering

Research in medical visual question answering (MVQA) can contribute to the development of computer-aided diagnosis. MVQA is a task that aims to predict accurate and convincing answers based on given medical images and associated natural language questions. This task requires extracting medical knowledge-rich feature content and developing a fine-grained understanding of it. Therefore, constructing an effective feature extraction and understanding scheme is key to modeling. Existing MVQA question extraction schemes mainly focus on word information, ignoring medical information in the text, such as medical concepts and domain-specific terms. Meanwhile, some visual and textual feature understanding schemes cannot effectively capture the correlation between regions and keywords for reasonable visual reasoning. In this study, a dual-attention learning network with word and sentence embedding (DALNet-WSE) is proposed. We design a module, transformer with sentence embedding (TSE), to extract a double embedding representation of questions containing keywords and medical information. A dual-attention learning (DAL) module consisting of self-attention and guided attention is proposed to model intensive intramodal and intermodal interactions. With multiple DAL modules (DALs), learning visual and textual co-attention can increase the granularity of understanding and improve visual reasoning. Experimental results on the ImageCLEF 2019 VQA-MED (VQA-MED 2019) and VQA-RAD datasets demonstrate that our proposed method outperforms previous state-of-the-art methods. According to the ablation studies and Grad-CAM maps, DALNet-WSE can extract rich textual information and has strong visual reasoning ability.

C1 Changsha Univ Sci & Technol, Sch Math & Stat, Changsha 410114, Peoples R China SN 0278-0062 EI 1558-254X DA 2024-05-25 UT WOS:001203303400010 PM 37812550 ER

AU Li, Wen An, Nan Cao, Fuzhi Wang, Wenli Wang, Chunhui Xu, Weinan Gao, Yang Ning, Xiaolin

Source Imaging Method based on Spatial Smoothing and Edge Sparsity (SISSES) and Its Application to OPM-MEG.

Source estimation in magnetoencephalography (MEG) involves solving a highly ill-posed problem without a unique solution. Accurate estimation of the time course and spatial extent of the source is important for studying the mechanisms of brain activity and preoperative functional localization. Traditional methods tend to yield small-amplitude diffuse or large-amplitude focused source estimates. Recently, the structured sparsity-based source imaging algorithm has emerged as one of the most promising algorithms for improving source extent estimation. However, it suffers from a notable amplitude bias. To improve the spatiotemporal resolution of reconstructed sources, we propose a novel method called the source imaging method based on spatial smoothing and edge sparsity (SISSES). In this method, the temporal dynamics of sources are modeled using a set of temporal basis functions, and the spatial characteristics of the source are represented by a first-order Markov random field (MRF) model. In particular, sparse constraints are imposed on the MRF model residuals in the original and variation domains. Numerical simulations were conducted to validate the SISSES. The results demonstrate that SISSES outperforms benchmark methods for estimating the time course, location, and extent of patch sources. Additionally, auditory and median nerve stimulation experiments were performed using a 31-channel optically pumped magnetometer MEG system, and the SISSES was applied to the source imaging of these data. The results demonstrate that SISSES correctly identified the source regions in which brain responses occurred at different times, demonstrating its feasibility for various practical applications.

AU Su, Jianpo Wang, Bo Fan, Zhipeng Zhang, Yifan Zeng, Ling-Li Shen, Hui Hu, Dewen

M2DC: A Meta-Learning Framework for Generalizable Diagnostic Classification of Major Depressive Disorder.

Psychiatric diseases impose heavy burdens on both individual health and social stability. Accurate and timely diagnosis of these diseases is essential for effective treatment and intervention. Thanks to the rapid development of brain imaging technology and machine learning algorithms, diagnostic classification of psychiatric diseases can be achieved based on brain images. However, due to divergences in scanning machines or parameters, the generalization capability of diagnostic classification models has always been an issue. We propose Meta-learning with Meta batch normalization and Distance Constraint (M2DC) for training diagnostic classification models. The framework can simulate the train-test domain-shift situation and promote intra-class cohesion as well as inter-class separation, which leads to clearer classification margins and more generalizable models. To better encode dynamic brain graphs, we propose a concatenated spatiotemporal attention graph isomorphism network (CSTAGIN) as the backbone. The network is trained for the diagnostic classification of major depressive disorder (MDD) based on multi-site brain graphs. Extensive experiments on brain images from over 3261 subjects show that models trained by M2DC achieve the best performance on cross-site diagnostic classification tasks compared to various contemporary domain generalization methods and SOTA studies. The proposed M2DC is, to date, the first framework for multi-source closed-set domain-generalizable training of diagnostic classification models for MDD, and the trained models can be applied to reliable auxiliary diagnosis on novel data.

AU Feng, Rui Yang, Jingwen Huang, Hao Chen, Zelin Feng, Ruiyan Farrukh Hameed, N U Zhang, Xudong Hu, Jie Chen, Liang Lu, Shuo

Spatiotemporal Microstate Dynamics of Spike-free Scalp EEG Offer a Potential Biomarker for Refractory Temporal Lobe Epilepsy.

Refractory temporal lobe epilepsy (TLE) is one of the most frequently observed subtypes of epilepsy and endangers more than 50 million people worldwide. Although electroencephalography (EEG) has long been recognized as a classic tool to screen and diagnose epilepsy, for many years it relied heavily on identifying epileptic discharges and localizing the epileptogenic zone, which, however, limits the understanding of refractory epilepsy given the network nature of this disease. This work hypothesizes that microstate dynamics based on resting-state scalp EEG can offer an additional network depiction of the disease and provide a potential complementary evaluation tool for TLE, even in the absence of detectable epileptic discharges on EEG. We propose a novel framework for EEG microstate spatial-temporal dynamics (EEG-MiSTD) analysis based on machine learning to comprehensively model millisecond-changing whole-brain network dynamics. With only 100 seconds of resting-state EEG, even without epileptic discharges, this approach successfully distinguishes TLE patients from healthy controls and is related to the lateralization of the epileptic focus. Besides, microstate temporal and spatial features are found to be widely related to clinical parameters, which further demonstrates that TLE is a network disease. A preliminary exploration suggests that the spatial topography is sensitive to subsequent surgical outcomes. From this new perspective, our results suggest that spatiotemporal microstate dynamics is potentially a biomarker of the disease. The developed EEG-MiSTD framework may thus serve as a general, user-friendly tool for examining dynamic brain network disruption in other types of epilepsy.

AU Cui, Yue Li, Chengyi Lu, Yuheng Ma, Liang Cheng, Luqi Cao, Long Yu, Shan Jiang, Tianzi

Multimodal Connectivity-Based Individual Parcellation and Analysis for Humans and Rhesus Monkeys

Individual brains vary greatly in morphology, connectivity and organization. Individualized brain parcellation is capable of precisely localizing subject-specific functional regions. However, most individualization approaches have examined single modalities of data and have not generalized to nonhuman primates. The present study proposed a novel multimodal connectivity-based individual parcellation (MCIP) method, which optimizes within-region homogeneity, spatial continuity and similarity to a reference atlas with the fusion of personal functional and anatomical connectivity. Comprehensive evaluation demonstrated that MCIP outperformed state-of-the-art multimodal individualization methods in terms of functional and anatomical homogeneity, predictability of cognitive measures, heritability, reproducibility and generalizability across species. Comparative investigation showed a higher topographic variability in humans than that in macaques. Therefore, MCIP provides improved accurate and reliable mapping of brain functional regions over existing methods at an individual level across species, and could facilitate comparative and translational neuroscience research.

AU Chikontwe, Philip Kim, Meejeong Jeong, Jaehoon Sung, Hyun Jung Go, Heounjeong Nam, Soo Jeong Park, Sang Hyun

FR-MIL: Distribution Re-calibration based Multiple Instance Learning with Transformer for Whole Slide Image Classification.

In digital pathology, whole slide images (WSI) are crucial for cancer prognostication and treatment planning. WSI classification is generally addressed using multiple instance learning (MIL), alleviating the challenge of processing billions of pixels and curating rich annotations. Though recent MIL approaches leverage variants of the attention mechanism to learn better representations, they scarcely study the properties of the data distribution itself, i.e., different staining and acquisition protocols resulting in intra-patch and inter-slide variations. In this work, we first introduce a distribution re-calibration strategy to shift the feature distribution of a WSI bag (instances) using the statistics of the max-instance (critical) feature. Second, we enforce class (bag) separation via a metric loss, assuming that positive bags exhibit larger magnitudes than negatives. We also introduce a generative process leveraging Vector Quantization (VQ) for improved instance discrimination, i.e., VQ helps model bag latent factors for improved classification. To model spatial and context information, a position encoding module (PEM) is employed with transformer-based pooling by multi-head self-attention (PMSA). Evaluation on popular WSI benchmark datasets reveals that our approach improves over state-of-the-art MIL methods. Further, we validate the general applicability of our method on classic MIL benchmark tasks and for point cloud classification with limited points. Code: https://github.com/PhilipChicco/FRMIL.
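
A hedged sketch of the re-calibration step follows: instance features of a bag are shifted using the max-instance (critical) feature. The subtractive shift and the scale alpha are assumptions, and the metric loss, vector quantization, and transformer pooling are not shown.

```python
# Hedged sketch of distribution re-calibration: bag instance features are
# shifted using the max-instance (critical) feature. The subtractive form and
# the scale alpha are assumptions; the metric loss, vector quantization, and
# transformer pooling are not shown.
import torch

def recalibrate_bag(feats: torch.Tensor, scores: torch.Tensor, alpha: float = 0.5):
    """feats: (N, C) instance features; scores: (N,) instance scores."""
    critical = feats[scores.argmax()]   # max-instance (critical) feature
    return feats - alpha * critical     # shift the bag's feature distribution

shifted = recalibrate_bag(torch.rand(500, 512), torch.rand(500))
```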

AU Alkan, Cagan Mardani, Morteza Liao, Congyu Li, Zhitao Vasanawala, Shreyas S Pauly, John M

AutoSamp: Autoencoding k-space Sampling via Variational Information Maximization for 3D MRI.

Accelerated MRI protocols routinely involve a predefined sampling pattern that undersamples the k-space. Finding an optimal pattern can enhance the reconstruction quality, however this optimization is a challenging task. To address this challenge, we introduce a novel deep learning framework, AutoSamp, based on variational information maximization that enables joint optimization of sampling pattern and reconstruction of MRI scans. We represent the encoder as a non-uniform Fast Fourier Transform that allows continuous optimization of k-space sample locations on a non-Cartesian plane, and the decoder as a deep reconstruction network. Experiments on public 3D acquired MRI datasets show improved reconstruction quality of the proposed AutoSamp method over the prevailing variable density and variable density Poisson disc sampling for both compressed sensing and deep learning reconstructions. We demonstrate that our data-driven sampling optimization method achieves 4.4dB, 2.0dB, 0.75dB, 0.7dB PSNR improvements over reconstruction with Poisson Disc masks for acceleration factors of R = 5, 10, 15, 25, respectively. Prospectively accelerated acquisitions with 3D FSE sequences using our optimized sampling patterns exhibit improved image quality and sharpness. Furthermore, we analyze the characteristics of the learned sampling patterns with respect to changes in acceleration factor, measurement noise, underlying anatomy, and coil sensitivities. We show that all these factors contribute to the optimization result by affecting the sampling density, k-space coverage and point spread functions of the learned sampling patterns.

AU Pei, Jialun Guo, Diandian Zhang, Jingyang Lin, Manxi Jin, Yueming Heng, Pheng-Ann

S2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR.

Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistically cognitive intelligence in the operating room (OR). However, previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection. This pipeline may potentially compromise the flexibility of learning multimodal representations, consequently constraining the overall effectiveness. In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR, aimed at complementarily leveraging multi-view 2D scenes and 3D point clouds for SGG in an end-to-end manner. Concretely, our model embraces a View-Sync Transfusion scheme to encourage multi-view visual information interaction. Concurrently, a Geometry-Visual Cohesion operation is designed to integrate the synergic 2D semantic features into 3D point cloud features. Moreover, based on the augmented feature, we propose a novel relation-sensitive transformer decoder that embeds dynamic entity-pair queries and relational trait priors, which enables the direct prediction of entity-pair relations for graph generation without intermediate steps. Extensive experiments have validated the superior SGG performance and lower computational cost of S2Former-OR on the 4D-OR benchmark compared with current OR-SGG methods, e.g., a 3-percentage-point increase in precision and a 24.2M reduction in model parameters. We further compared our method with generic single-stage SGG methods using broader metrics for a comprehensive evaluation, with consistently better performance achieved. Our source code is available at: https://github.com/PJLallen/S2Former-OR.

AU Zhang, Jingke Huang, Chengwu Lok, U-Wai Dong, Zhijie Liu, Hui Gong, Ping Song, Pengfei Chen, Shigao

Enhancing Row-column array (RCA)-based 3D ultrasound vascular imaging with spatial-temporal similarity weighting.

Ultrasound vascular imaging (UVI) is a valuable tool for monitoring physiological states and evaluating pathological conditions. Advancing from conventional two-dimensional (2D) to three-dimensional (3D) UVI would enhance vasculature visualization, thereby improving its reliability. The row-column array (RCA) has emerged as a promising approach for cost-effective ultrafast 3D imaging with a low channel count. However, ultrafast RCA imaging is often hampered by high-level sidelobe artifacts and low signal-to-noise ratio (SNR), which makes RCA-based UVI challenging. In this study, we propose a spatial-temporal similarity weighting (St-SW) method to overcome these challenges by exploiting the incoherence of sidelobe artifacts and noise between datasets acquired using orthogonal transmissions. Simulation, in vitro blood flow phantom, and in vivo experiments were conducted to compare the proposed method with existing orthogonal plane wave imaging (OPW), row-column-specific frame-multiply-and-sum beamforming (RC-FMAS), and XDoppler techniques. Qualitative and quantitative results demonstrate the superior performance of the proposed method. In simulations, the proposed method reduced the sidelobe level by 31.3 dB, 20.8 dB, and 14.0 dB, compared to OPW, XDoppler, and RC-FMAS, respectively. In the blood flow phantom experiment, the proposed method significantly improved the contrast-to-noise ratio (CNR) of the tube by 26.8 dB, 25.5 dB, and 19.7 dB over the OPW, XDoppler, and RC-FMAS methods, respectively. In the human submandibular gland experiment, it not only reconstructed a more complete vasculature but also improved the CNR by more than 15 dB over the OPW, XDoppler, and RC-FMAS methods. In summary, the proposed method effectively suppresses sidelobe artifacts and noise in images collected using an RCA under low-SNR conditions, leading to improved visualization of 3D vasculatures.
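
One plausible reading of similarity weighting is a voxelwise weight derived from the local correlation between the two orthogonal-transmission volumes, since true vessels are coherent across them while sidelobes and noise are not. The window size and weighting form below are assumptions, not the paper's exact scheme.

```python
# Illustrative weighting by local correlation between two volumes acquired
# with orthogonal transmissions: true vessels are coherent across them, while
# sidelobes and noise are not. Window size and weighting form are assumptions.
import numpy as np
from scipy.ndimage import uniform_filter

def similarity_weighted(v1: np.ndarray, v2: np.ndarray, size: int = 5) -> np.ndarray:
    m1, m2 = uniform_filter(v1, size), uniform_filter(v2, size)
    cov = uniform_filter(v1 * v2, size) - m1 * m2
    s1 = uniform_filter(v1 ** 2, size) - m1 ** 2
    s2 = uniform_filter(v2 ** 2, size) - m2 ** 2
    ncc = cov / np.sqrt(np.clip(s1 * s2, 1e-12, None))  # local correlation
    return 0.5 * (v1 + v2) * np.clip(ncc, 0.0, 1.0)     # keep coherent signal

fused = similarity_weighted(np.random.rand(32, 64, 64), np.random.rand(32, 64, 64))
```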

AU Khan, Md Hadiur Rahman Righetti, Raffaella

A Novel Poroelastography Method for High-quality Estimation of Lateral Strain, Solid Stress and Fluid Pressure In Vivo.

Assessment of the mechanical and transport properties of tissues using ultrasound elasticity imaging requires accurate estimation of the spatiotemporal distribution of volumetric strain. Due to physical constraints such as the pitch limitation and the lack of phase information in the lateral direction, the quality of lateral strain estimation is typically significantly lower than that of axial strain estimation. In this paper, a novel lateral strain estimation technique based on the physics of compressible porous media is developed, tested, and validated. This technique is referred to as Poroelastography-based Ultrasound Lateral Strain Estimation (PULSE). PULSE differs from previously proposed lateral strain estimators in that it uses the underlying physics of internal fluid flow within a local region of the tissue as its theoretical foundation. PULSE establishes a relation between spatiotemporal changes in the axial strains and corresponding spatiotemporal changes in the lateral strains, effectively allowing lateral strains to be assessed with quality comparable to that of axial strain estimators. We demonstrate that PULSE can also be used to accurately track compression-induced solid stresses and fluid pressure in cancers using ultrasound poroelastography (USPE). In this study, we report the theoretical formulation of PULSE and its validation using finite element (FE) and ultrasound simulations. PULSE-generated results exhibit less than 5% percentage relative error (PRE) and greater than 90% structural similarity index (SSIM) compared to ground-truth simulations. Experimental results are included to qualitatively assess the performance of PULSE in vivo. The proposed method can be used to overcome the inherent limitations of non-axial strain imaging and improve the clinical translatability of USPE.

AU Wen, Zhijie Wu, Haixia Ying, Shihui

Histopathology Image Classification With Noisy Labels via The Ranking Margins

Clinically, histopathology images offer a gold standard for disease diagnosis. With the development of artificial intelligence, digital histopathology significantly improves the efficiency of diagnosis. Nevertheless, noisy labels are inevitable in histopathology images, which degrades algorithm performance. Curriculum learning is one of the typical methods for solving such problems. However, existing curriculum learning methods either fail to measure the training priority between difficult samples and noisy ones or need an extra clean dataset to establish a valid curriculum scheme. Therefore, a new curriculum learning paradigm is designed based on a proposed ranking function, named The Ranking Margins (TRM). The ranking function measures the 'distances' between samples and decision boundaries, which helps distinguish difficult samples from noisy ones. The proposed method includes three stages: the warm-up stage, the main training stage, and the fine-tuning stage. In the warm-up stage, the margin of each sample is obtained through the ranking function. In the main training stage, samples are progressively fed into the networks for training, starting from those with larger margins and proceeding to those with smaller ones. Label correction is also performed in this stage. In the fine-tuning stage, the networks are retrained on the samples with corrected labels. In addition, we provide theoretical analysis to guarantee the feasibility of TRM. Experiments on two representative histopathology image datasets show that the proposed method achieves substantial improvements over the latest Label Noise Learning (LNL) methods.
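
The curriculum induced by the ranking function can be sketched in a few lines: once the warm-up stage has produced a margin per sample, training proceeds from large margins (easy, likely clean) toward small or negative ones (hard or noisy). The function below is a hypothetical illustration.

```python
# Hypothetical sketch of the margin-ranked curriculum: after the warm-up stage
# yields a margin per sample, training starts from large margins (easy, likely
# clean) and proceeds toward small or negative ones (hard or noisy).
import numpy as np

def curriculum_order(margins: np.ndarray) -> np.ndarray:
    """margins: (N,) ranking-function outputs; returns a training order."""
    return np.argsort(margins)[::-1]    # largest margin first

order = curriculum_order(np.array([0.9, -0.2, 0.4, 0.1]))
# order == [0, 2, 3, 1]: the negative-margin (likely noisy) sample enters last
```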

C1 Shanghai Univ, Coll Sci, Dept Math, Shanghai 200444, Peoples R China SN 0278-0062 EI 1558-254X DA 2024-08-18 UT WOS:001285367200006 PM 38526889 ER

AU Muller, Philip Meissen, Felix Kaissis, Georgios Rueckert, Daniel

Weakly Supervised Object Detection in Chest X-Rays with Differentiable ROI Proposal Networks and Soft ROI Pooling.

Weakly supervised object detection (WSup-OD) increases the usefulness and interpretability of image classification algorithms without requiring additional supervision. The successes of multiple instance learning in this task for natural images, however, do not translate well to medical images due to the very different characteristics of their objects (i.e. pathologies). In this work, we propose Weakly Supervised ROI Proposal Networks (WSRPN), a new method for generating bounding box proposals on the fly using a specialized region of interest-attention (ROI-attention) module. WSRPN integrates well with classic backbone-head classification algorithms and is end-to-end trainable with only image-label supervision. We experimentally demonstrate that our new method outperforms existing methods in the challenging task of disease localization in chest X-ray images. Code: https://anonymous.4open.science/r/WSRPN-DCA1.

AU He, Jie Zhang, Haoran Li, Yimeng Li, Guanghui Lei, Siao Qian, Zhumei Xiong, Fei Feng, Yuan Zhu, Tao An, Yu Tian, Jie

Sequential Scan-Based Single-Dimension Multi-Voxel System Matrix Calibration for Open-Sided Magnetic Particle Imaging.

Open-sided magnetic particle imaging (OS-MPI) has garnered significant interest due to its potential for interventional applications. However, system matrix calibration (SMC) in OS-MPI using sequential scans is a time-consuming task and is susceptible to the low signal-to-noise ratio (SNR) resulting from the small calibration sample size. These challenges have hindered the practical implementation of system matrix-based reconstruction for sequentially scanned OS-MPI. To address these issues, we propose a novel calibration method, named sequential scan-based single-dimension multi-voxel calibration (SS-SDMVC), to efficiently obtain a high-SNR system matrix. This method was implemented in a cylindrical field of view (FOV), where a bar calibration sample parallel to the field-free line (FFL) was shifted along a fixed radial direction. A standard image reconstruction process was also introduced to verify the feasibility of SS-SDMVC. Through simulations, we analyzed the effects of noise levels and scanner imperfections on the SS-SDMVC-based reconstruction and demonstrated its robustness. In experiments, we compared the imaging performance of SS-SDMVC with that of traditional sequential scan-based cubic-FOV SMC. The results showed that SS-SDMVC reduced the number of measurements by a factor of 210.94 and achieved higher reconstruction quality. Therefore, SS-SDMVC is expected to improve the reconstruction quality of human-scale or high-gradient FFL MPI scanners.

AU Luo, Xiang Li, Zhongyu Xu, Canhua Zhang, Bite Zhang, Liangliang Zhu, Jihua Huang, Peng Wang, Xin Yang, Meng Chang, Shi

Semi-Supervised Thyroid Nodule Detection in Ultrasound Videos

Deep learning techniques have been investigated for the computer-aided diagnosis of thyroid nodules in ultrasound images. However, most existing thyroid nodule detection methods are based solely on static ultrasound images and cannot fully exploit the spatial and temporal information that arises during the clinical examination process. In this paper, we propose a novel video-based semi-supervised framework for ultrasound thyroid nodule detection. In particular, considering that clinical examinations need to detect thyroid nodules at the ultrasonic probe positions, we first construct an adjacent-frame-guided detection backbone network using adjacent supporting reference frames. To further reduce the labour-intensive annotation of thyroid nodules in ultrasound videos, we extend the video-based detection in a semi-supervised manner by using both labeled and unlabeled videos. Based on the detection consistency across sequential neighbouring frames, a pseudo-label adaptation strategy is proposed for the refinement of unpredicted frames. The proposed framework is validated on 996 transverse-view and 1088 longitudinal-view ultrasound videos. Experimental results demonstrate the superior performance of our proposed method in the ultrasound video-based detection of thyroid nodules.

AU Pei, Chenhao Wu, Fuping Yang, Mingjing Pan, Lin Ding, Wangbin Dong, Jinwei Huang, Liqin Zhuang, Xiahai

Multi-Source Domain Adaptation for Medical Image Segmentation

Unsupervised domain adaptation (UDA) aims to mitigate the performance drop that models suffer when tested on a target domain, due to the domain shift between the source and target domains. Most UDA segmentation methods focus on the single-source scenario. However, in practical situations, gold-standard data may be available from multiple sources (domains), and such multi-source training data could provide more information for knowledge transfer. How to utilize them to achieve better domain adaptation remains to be explored. This work investigates multi-source UDA and proposes a new framework for medical image segmentation. Firstly, we employ a multi-level adversarial learning scheme to adapt features at different levels between each source domain and the target, to improve the segmentation performance. Then, we propose a multi-model consistency loss to transfer the learned multi-source knowledge to the target domain simultaneously. Finally, we validated the proposed framework on two applications, i.e., multi-modality cardiac segmentation and cross-modality liver segmentation. The results showed that our method delivered promising performance and compared favorably to state-of-the-art approaches.
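
The multi-model consistency loss can be sketched as an agreement penalty among the per-source models' predictions on unlabeled target images. Using the mean prediction as the agreement target is an assumption; the multi-level adversarial alignment is omitted.

```python
# Minimal sketch of a multi-model consistency loss: K source-specific models
# are pushed to agree on unlabeled target images. Using the mean prediction as
# the agreement target is an assumption; the multi-level adversarial feature
# alignment is omitted.
import torch
import torch.nn.functional as F

def multi_model_consistency(probs):
    """probs: list of (B, C, H, W) per-source softmax outputs on target images."""
    mean_p = torch.stack(probs).mean(dim=0)
    return sum(F.mse_loss(p, mean_p) for p in probs) / len(probs)

loss = multi_model_consistency(
    [torch.softmax(torch.rand(2, 4, 64, 64), dim=1) for _ in range(3)])
```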

AU Qu, Gang Orlichenko, Anton Wang, Junqi Zhang, Gemeng Xiao, Li Zhang, Kun Wilson, Tony W. Stephen, Julia M. Calhoun, Vince D. Wang, Yu-Ping

Interpretable Cognitive Ability Prediction: A Comprehensive Gated Graph Transformer Framework for Analyzing Functional Brain Networks

Graph convolutional deep learning has emerged as a promising method to explore the functional organization of the human brain in neuroscience research. This paper presents a novel framework that utilizes the gated graph transformer (GGT) model to predict individuals' cognitive ability based on functional connectivity (FC) derived from fMRI. Our framework incorporates prior spatial knowledge and uses a random-walk diffusion strategy that captures the intricate structural and functional relationships between different brain regions. Specifically, our approach employs learnable structural and positional encodings (LSPE) in conjunction with a gating mechanism to efficiently disentangle the learning of positional encoding (PE) and graph embeddings. Additionally, we utilize the attention mechanism to derive multi-view node feature embeddings and dynamically distribute propagation weights between each node and its neighbors, which facilitates the identification of significant biomarkers from functional brain networks and thus enhances the interpretability of the findings. To evaluate our proposed model in cognitive ability prediction, we conduct experiments on two large-scale brain imaging datasets: the Philadelphia Neurodevelopmental Cohort (PNC) and the Human Connectome Project (HCP). The results show that our approach not only outperforms existing methods in prediction accuracy but also provides superior explainability, which can be used to identify important FCs underlying cognitive behaviors.

AU Yang, Han Wang, Qiuli Zhang, Yue An, Zhulin Liu, Chen Zhang, Xiaohong Zhou, S. Kevin

Lung Nodule Segmentation and Uncertain Region Prediction With an Uncertainty-Aware Attention Mechanism

Radiologists possess diverse training and clinical experiences, leading to variations in the segmentation annotations of lung nodules and resulting in segmentation uncertainty. Conventional methods typically select a single annotation as the learning target or attempt to learn a latent space comprising multiple annotations. However, these approaches fail to leverage the valuable information inherent in the consensus and disagreements among the multiple annotations. In this paper, we propose an Uncertainty-Aware Attention Mechanism (UAAM) that utilizes consensus and disagreements among multiple annotations to facilitate better segmentation. To this end, we introduce the Multi-Confidence Mask (MCM), which combines a Low-Confidence (LC) Mask and a High-Confidence (HC) Mask. The LC mask indicates regions with low segmentation confidence, where radiologists may have different segmentation choices. Following UAAM, we further design an Uncertainty-Guide Multi-Confidence Segmentation Network (UGMCS-Net), which contains three modules: a Feature Extracting Module that captures a general feature of a lung nodule, an Uncertainty-Aware Module that produces three features for the annotations' union, intersection, and annotation set, and an Intersection-Union Constraining Module that uses distances between the three features to balance the predictions of final segmentation and MCM. To comprehensively demonstrate the performance of our method, we propose a Complex-Nodule Validation on LIDC-IDRI, which tests UGMCS-Net's segmentation performance on lung nodules that are difficult to segment using common methods. Experimental results demonstrate that our method can significantly improve the segmentation performance on nodules that are difficult to segment using conventional methods.
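
The consensus/disagreement decomposition behind the Multi-Confidence Mask can be made concrete. One reading is HC = voxels all annotators mark and LC = union minus intersection; the sketch below follows that assumption rather than the paper's exact definition.

```python
# Sketch of a Multi-Confidence Mask from several annotators, assuming
# HC = voxels all annotators mark (consensus) and LC = union minus
# intersection (disagreement); one reading of the construction, not its
# exact definition.
import numpy as np

def multi_confidence_mask(annos: np.ndarray):
    """annos: (R, H, W) binary masks from R radiologists."""
    hc = annos.all(axis=0)        # high-confidence: unanimous foreground
    lc = annos.any(axis=0) & ~hc  # low-confidence: disputed region
    return hc, lc

hc, lc = multi_confidence_mask(np.random.rand(4, 64, 64) > 0.5)
```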

AU Ling, Yating Wang, Yuling Dai, Wenli Yu, Jie Liang, Ping Kong, Dexing

MTANet: Multi-Task Attention Network for Automatic Medical Image Segmentation and Classification

Medical image segmentation and classification are two of the most important steps in computer-aided clinical diagnosis. Regions of interest are usually segmented in an appropriate manner to extract useful features for subsequent disease classification. However, these methods are computationally complex and time-consuming. In this paper, we propose a one-stage multi-task attention network (MTANet) that efficiently classifies objects in an image while generating a high-quality segmentation mask for each medical object. A reverse addition attention module was designed in the segmentation task to fuse areas in the global map with boundary cues in high-resolution features, and an attention bottleneck module was used in the classification task for image feature and clinical feature fusion. We evaluated the performance of MTANet against CNN-based and transformer-based architectures across three imaging modalities and tasks: the CVC-ClinicDB dataset for polyp segmentation, the ISIC-2018 dataset for skin lesion segmentation, and our private ultrasound dataset for liver tumor segmentation and classification. Our proposed model outperformed state-of-the-art models on all three datasets and was superior to all 25 radiologists for liver tumor diagnosis.

AU Chen, Zeyuan Zheng, Yuanjie Gee, James C.

TransMatch: A Transformer-Based Multilevel Dual-Stream Feature Matching Network for Unsupervised Deformable Image Registration

Feature matching, which refers to establishing the correspondence of regions between two images (usually voxel features), is a crucial prerequisite of feature-based registration. For deformable image registration tasks, traditional feature-based registration methods typically use an iterative matching strategy for interest-region matching, where feature selection and matching are explicit, but the specific feature selection schemes are often tailored to application-specific problems and require several minutes for each registration. In the past few years, the feasibility of learning-based methods, such as VoxelMorph and TransMorph, has been proven, and their performance has been shown to be competitive with traditional methods. However, these methods are usually single-stream, where the two images to be registered are concatenated into a 2-channel whole, and then the deformation field is output directly. The transformation of image features into inter-image matching relationships is implicit. In this paper, we propose a novel end-to-end dual-stream unsupervised framework, named TransMatch, where each image is fed into a separate stream branch, and each branch performs feature extraction independently. Then, we implement explicit multilevel feature matching between image pairs via the query-key matching idea of the self-attention mechanism in the Transformer model. Comprehensive experiments are conducted on three 3D brain MR datasets, LPBA40, IXI, and OASIS, and the results show that the proposed method achieves state-of-the-art performance on several evaluation metrics compared to commonly utilized registration methods, including SyN, NiftyReg, VoxelMorph, CycleMorph, ViT-V-Net, and TransMorph, demonstrating the effectiveness of our model in deformable medical image registration.
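The query-key matching idea generalizes readily: tokens from one image stream act as queries against keys and values from the other stream, so the attention weights behave like soft correspondences. A minimal PyTorch sketch under our own naming (not the authors' implementation):

import torch
import torch.nn as nn

class CrossStreamMatching(nn.Module):
    # Illustrative cross-attention matching between two feature streams.
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, moving_tokens: torch.Tensor, fixed_tokens: torch.Tensor) -> torch.Tensor:
        # Queries from the moving image attend to keys/values from the fixed image.
        matched, _ = self.attn(moving_tokens, fixed_tokens, fixed_tokens)
        return self.norm(moving_tokens + matched)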

AU Lin, Zefan Quan, Guotao Qu, Haixian Du, Yanfeng Zhao, Jun

LOQUAT: Low-Rank Quaternion Reconstruction for Photon-Counting CT.

Photon-counting computed tomography (PCCT) may dramatically benefit clinical practice due to its versatility such as dose reduction and material characterization. However, the limited number of photons detected in each individual energy bin can induce severe noise contamination in the reconstructed image. Fortunately, the notable low-rank prior inherent in the PCCT image can guide the reconstruction to a denoised outcome. To fully excavate and leverage the intrinsic low-rankness, we propose a novel reconstruction algorithm based on quaternion representation (QR), called low-rank quaternion reconstruction (LOQUAT). First, we organize a group of nonlocal similar patches into a quaternion matrix. Then, an adjusted weighted Schatten-p norm (AWSN) is introduced and imposed on the matrix to enforce its low-rank nature. Subsequently, we formulate an AWSN-regularized model and devise an alternating direction method of multipliers (ADMM) framework to solve it. Experiments on simulated and real-world data substantiate the superiority of the LOQUAT technique over several state-of-the-art competitors in terms of both visual inspection and quantitative metrics. Moreover, our QR-based method exhibits lower computational complexity than some popular tensor representation (TR) based counterparts. Besides, the global convergence of LOQUAT is theoretically established under a mild condition. These properties bolster the robustness and practicality of LOQUAT, facilitating its application in PCCT clinical scenarios. The source code will be available at https://github.com/linzf23/LOQUAT.
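For context, the weighted Schatten-p norm that the AWSN regularizer builds on has the standard form below, where sigma_i(X) are the singular values of the quaternion patch matrix and w_i are nonnegative weights; the paper's "adjusted" variant modifies this baseline, so the formula is background rather than the exact AWSN definition:

\|\mathbf{X}\|_{w,S_p}^{p} = \sum_{i} w_i\,\sigma_i(\mathbf{X})^{p}, \qquad 0 < p \le 1 .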

EI 1558-254X DA 2024-09-04 UT MEDLINE:39226197 PM 39226197 ER

AU Wang, Sen Yang, Yirong Stevens, Grant M Yin, Zhye Wang, Adam S

Emulating Low-Dose PCCT Image Pairs with Independent Noise for Self-Supervised Spectral Image Denoising.

Photon counting CT (PCCT) acquires spectral measurements and enables generation of material decomposition (MD) images that provide distinct advantages in various clinical situations. However, noise amplification is observed in MD images, and denoising is typically applied. Clean or high-quality references are rare in clinical scans, often making supervised learning (Noise2Clean) impractical. Noise2Noise is a self-supervised counterpart, using noisy images and corresponding noisy references with zero-mean, independent noise. PCCT counts transmitted photons separately, and raw measurements are assumed to follow a Poisson distribution in each energy bin, providing the possibility to create noise-independent pairs. The approach is to use binomial selection to split the counts into two low-dose scans with independent noise. Through noise propagation analysis, we prove that the reconstructed spectral images inherit the noise independence of the counts domain, and we validate this in numerical simulations and experimental phantom scans. The method offers the flexibility to split measurements into desired dose levels while ensuring the reconstructed images share identical underlying features, thereby strengthening the model's robustness to input dose levels and capability of preserving fine details. In both numerical simulation and experimental phantom scans, we demonstrated that Noise2Noise with binomial selection outperforms other common self-supervised learning methods based on different presumptive conditions.
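The binomial splitting step is easy to reproduce: if a detector bin records N ~ Poisson(lambda), then K ~ Binomial(N, p) and N - K are independent Poisson variables with means p*lambda and (1-p)*lambda. A minimal NumPy sketch (variable names and the synthetic sinogram are ours):

import numpy as np

rng = np.random.default_rng(0)

def binomial_split(counts, p=0.5):
    # Thinning a Poisson process: the two halves are independent Poisson
    # realizations with means p*lambda and (1-p)*lambda.
    half_a = rng.binomial(counts, p)
    half_b = counts - half_a
    return half_a, half_b

# Example: raw counts in one energy bin of a synthetic PCCT sinogram.
raw = rng.poisson(lam=200.0, size=(4, 8))
low_dose_a, low_dose_b = binomial_split(raw, p=0.5)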

AU Dabrowski, Oscar Falcone, Jean-Luc Klauser, Antoine Songeon, Julien Kocher, Michel Chopard, Bastien Lazeyras, Francois Courvoisier, Sebastien

SISMIK for brain MRI: Deep-learning-based motion estimation and model-based motion correction in k-space.

MRI, a widespread non-invasive medical imaging modality, is highly sensitive to patient motion. Despite many attempts over the years, motion correction remains a difficult problem and there is no general method applicable to all situations. We propose a retrospective method for motion estimation and correction to tackle the problem of in-plane rigid-body motion, apt for classical 2D Spin-Echo scans of the brain, which are regularly used in clinical practice. Due to the sequential acquisition of k-space, motion artifacts are well localized. The method leverages the power of deep neural networks to estimate motion parameters in k-space and uses a model-based approach to restore degraded images to avoid "hallucinations". A notable advantage is its ability to estimate motion occurring at high spatial frequencies without the need for a motion-free reference. The proposed method operates on the whole k-space dynamic range and is moderately affected by the lower SNR of higher harmonics. As a proof of concept, we provide models trained using supervised learning on 600k motion simulations based on motion-free scans of 43 different subjects. Generalization performance was tested with simulations as well as in-vivo. Qualitative and quantitative evaluations are presented for motion parameter estimations and image reconstruction. Experimental results show that our approach is able to obtain good generalization performance on simulated data and in-vivo acquisitions. We provide a Python implementation at https://gitlab.unige.ch/Oscar.Dabrowski/sismik_mri/.
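Because a 2D Spin-Echo scan fills k-space line by line, an in-plane translation during one readout multiplies only that line by a linear phase ramp (Fourier shift theorem), which is why the artifacts are well localized; rotations additionally rotate the sampled k-space coordinates and are omitted here. A sketch of the translation part only (our own illustration, not the SISMIK code):

import numpy as np

def translate_kspace_line(kline, kx, ky, dx, dy):
    # Fourier shift theorem: shifting the object by (dx, dy) multiplies its
    # spectrum by a linear phase; kx is the frequency axis along this readout
    # line and ky is the phase-encode frequency of the line.
    return kline * np.exp(-2j * np.pi * (kx * dx + ky * dy))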

AU Zhang, Mengliang Hu, Xinyue Gu, Lin Liu, Liangchen Kobayashi, Kazuma Harada, Tatsuya Yan, Yan Summers, Ronald M Zhu, Yingying

A New Benchmark: Clinical Uncertainty and Severity Aware Labeled Chest X-Ray Images with Multi-Relationship Graph Learning.

Chest radiography, commonly known as CXR, is frequently utilized in clinical settings to detect cardiopulmonary conditions. However, even seasoned radiologists might offer different evaluations regarding the seriousness and uncertainty associated with observed abnormalities. Previous research has attempted to utilize clinical notes to extract abnormal labels for training deep-learning models in CXR image diagnosis. However, these methods often neglected the varying degrees of severity and uncertainty linked to different labels. In our study, we initially assembled a comprehensive new dataset of CXR images based on clinical textual data, which incorporated radiologists' assessments of uncertainty and severity. Using this dataset, we introduced a multi-relationship graph learning framework that leverages spatial and semantic relationships while addressing expert uncertainty through a dedicated loss function. Our research showcases a notable enhancement in CXR image diagnosis and the interpretability of the diagnostic model, surpassing existing state-of-the-art methodologies. The dataset address of disease severity and uncertainty we extracted is: https://physionet.org/content/cad-chest/1.0/.

AU Liu, Jinduo Han, Lu Ji, Junzhong

MCAN: Multimodal Causal Adversarial Networks for Dynamic Effective Connectivity Learning From fMRI and EEG Data

Dynamic effective connectivity (DEC) is the accumulation of effective connectivity in the time dimension, which can describe the continuous neural activities in the brain. Recently, learning DEC from functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) data has attracted the attention of neuroinformatics researchers. However, the current methods fail to consider the gap between the fMRI and EEG modality, which can not precisely learn the DEC network from multimodal data. In this paper, we propose a multimodal causal adversarial network for DEC learning, named MCAN. The MCAN contains two modules: multimodal causal generator and multimodal causal discriminator. First, MCAN employs a multimodal causal generator with an attention-guided layer to produce a posterior signal and output a set of DEC networks. Then, the proposed method uses a multimodal causal discriminator to unsupervised calculate the joint gradient, which directs the update of the whole network. The experimental results on simulated data sets show that MCAN is superior to other state-of-the-art methods in learning the network structure of DEC and can effectively estimate the brain states. The experimental results on real data sets show that MCAN can better reveal abnormal patterns of brain activity and has good application potential in brain network analysis.

AU Zhu, Cheng Tan, Ying Yang, Shuqi Miao, Jiaqing Zhu, Jiayi Huang, Huan Yao, Dezhong Luo, Cheng

Temporal Dynamic Synchronous Functional Brain Network for Schizophrenia Classification and Lateralization Analysis.

Available evidence suggests that dynamic functional connectivity can capture time-varying abnormalities in brain activity in resting-state cerebral functional magnetic resonance imaging (rs-fMRI) data and has a natural advantage in uncovering mechanisms of abnormal brain activity in schizophrenia (SZ) patients. Hence, an advanced dynamic brain network analysis model called the temporal brain category graph convolutional network (Temporal-BCGCN) was employed. Firstly, a unique dynamic brain network analysis module, DSF-BrainNet, was designed to construct dynamic synchronization features. Subsequently, a revolutionary graph convolution method, TemporalConv, was proposed based on the synchronous temporal properties of features. Finally, the first modular test tool for abnormal hemispherical lateralization in deep learning based on rs-fMRI data, named CategoryPool, was proposed. This study was validated on COBRE and UCLA datasets and achieved 83.62% and 89.71% average accuracies, respectively, outperforming the baseline model and other state-of-the-art methods. The ablation results also demonstrate the advantages of TemporalConv over the traditional edge feature graph convolution approach and the improvement of CategoryPool over the classical graph pooling approach. Interestingly, this study showed that the lower-order perceptual system and higher-order network regions in the left hemisphere are more severely dysfunctional than in the right hemisphere in SZ, reaffirming the importance of the left medial superior frontal gyrus in SZ. Our code is available at: https://github.com/swfen/Temporal-BCGCN.

AU Gros, Romane Rodriguez-Nunez, Omar Felger, Leonard Moriconi, Stefano McKinley, Richard Pierangelo, Angelo Novikova, Tatiana Vassella, Erik Schucht, Philippe Hewer, Ekkehard Maragkou, Theoni

Characterization of Polarimetric Properties in Various Brain Tumor Types Using Wide-Field Imaging Mueller Polarimetry.

Neuro-oncological surgery is the primary brain cancer treatment, yet it faces challenges with gliomas due to their invasiveness and the need to preserve neurological function. Hence, radical resection is often unfeasible, highlighting the importance of precise tumor margin delineation to prevent neurological deficits and improve prognosis. Imaging Mueller polarimetry, an effective modality in various organ tissues, seems a promising approach for tumor delineation in neurosurgery. To further assess its use, we characterized the polarimetric properties by analysing 45 polarimetric measurements of 27 fresh brain tumor samples, including different tumor types with a strong focus on gliomas. Our study integrates a wide-field imaging Mueller polarimetric system and a novel neuropathology protocol, correlating polarimetric and histological data for accurate tissue identification. An image processing pipeline facilitated the alignment and overlay of polarimetric images and histological masks. Variations in depolarization values were observed for grey and white matter of brain tumor tissue, while differences in linear retardance were seen only within white matter of brain tumor tissue. Notably, we identified pronounced optical axis azimuth randomization within tumor regions. This study lays the foundation for machine learning-based brain tumor segmentation algorithms using polarimetric data, facilitating intraoperative diagnosis and decision making.

AU Kuang, Hulin Wang, Yahui Liu, Jin Wang, Jie Cao, Quanliang Hu, Bo Qiu, Wu Wang, Jianxin

Hybrid CNN-Transformer Network With Circular Feature Interaction for Acute Ischemic Stroke Lesion Segmentation on Non-Contrast CT Scans

Lesion segmentation is a fundamental step for the diagnosis of acute ischemic stroke (AIS). Non-contrast CT (NCCT) is still a mainstream imaging modality for AIS lesion measurement. However, AIS lesion segmentation on NCCT is challenging due to low contrast, noise and artifacts. To achieve accurate AIS lesion segmentation on NCCT, this study proposes a hybrid convolutional neural network (CNN) and Transformer network with circular feature interaction and bilateral difference learning. It consists of parallel CNN and Transformer encoders, a circular feature interaction module, and a shared CNN decoder with a bilateral difference learning module. A new Transformer block is particularly designed to solve the weak inductive bias problem of the traditional Transformer. To effectively combine features from CNN and Transformer encoders, we first design a multi-level feature aggregation module to combine multi-scale features in each encoder and then propose a novel feature interaction module containing circular CNN-to-Transformer and Transformer-to-CNN interaction blocks. Besides, a bilateral difference learning module is proposed at the bottom level of the decoder to learn the different information between the ischemic and contralateral sides of the brain. The proposed method is evaluated on three AIS datasets: the public AISD, a private dataset and an external dataset. Experimental results show that the proposed method achieves Dices of 61.39% and 46.74% on the AISD and the private dataset, respectively, outperforming 17 state-of-the-art segmentation methods. Besides, volumetric analysis on segmented lesions and external validation results imply that the proposed method is potential to provide support information for AIS diagnosis.

AU Lin, Yi Wang, Zeyu Zhang, Dong Cheng, Kwang-Ting Chen, Hao

BoNuS: Boundary Mining for Nuclei Segmentation With Partial Point Labels

Nuclei segmentation is a fundamental prerequisite in the digital pathology workflow. The development of automated methods for nuclei segmentation enables quantitative analysis of the wide existence and large variances in nuclei morphometry in histopathology images. However, manual annotation of tens of thousands of nuclei is tedious and time-consuming, which requires a significant amount of human effort and domain-specific expertise. To alleviate this problem, in this paper, we propose a weakly-supervised nuclei segmentation method that only requires partial point labels of nuclei. Specifically, we propose a novel boundary mining framework for nuclei segmentation, named BoNuS, which simultaneously learns nuclei interior and boundary information from the point labels. To achieve this goal, we propose a novel boundary mining loss, which guides the model to learn the boundary information by exploring the pairwise pixel affinity in a multiple-instance learning manner. Then, we consider a more challenging problem, i.e., partial point label, where we propose a nuclei detection module with curriculum learning to detect the missing nuclei with prior morphological knowledge. The proposed method is validated on three public datasets, MoNuSeg, CPM, and CoNIC datasets. Experimental results demonstrate the superior performance of our method to the state-of-the-art weakly-supervised nuclei segmentation methods. Code: https://github.com/hust-linyi/bonus.

AU Liu, Xiao Sanchez, Pedro Thermos, Spyridon O'Neil, Alison Q. Tsaftaris, Sotirios A.

Compositionally Equivariant Representation Learning

Deep learning models often need sufficient supervision (i.e., labelled data) in order to be trained effectively. By contrast, humans can swiftly learn to identify important anatomy in medical images like MRI and CT scans, with minimal guidance. This recognition capability easily generalises to new images from different medical facilities and to new tasks in different settings. This rapid and generalisable learning ability is largely due to the compositional structure of image patterns in the human brain, which are not well represented in current medical models. In this paper, we study the utilisation of compositionality in learning more interpretable and generalisable representations for medical image segmentation. Overall, we propose that the underlying generative factors that are used to generate the medical images satisfy the compositional equivariance property, where each factor is compositional (e.g., corresponds to human anatomy) and also equivariant to the task. Hence, a good representation that approximates the ground truth factor well has to be compositionally equivariant. By modelling the compositional representations with learnable von-Mises-Fisher (vMF) kernels, we explore how different design and learning biases can be used to enforce the representations to be more compositionally equivariant under un-, weakly-, and semi-supervised settings. Extensive results show that our methods achieve the best performance over several strong baselines on the task of semi-supervised domain-generalised medical image segmentation. Code will be made publicly available upon acceptance at https://github.com/vios-s.

AU Wegierak, Dana Cooley, Michaela B. Perera, Reshani Wulftange, William J. Gurkan, Umut A. Kolios, Michael C. Exner, Agata A.

Decorrelation Time Mapping as an Analysis Tool for Nanobubble-Based Contrast Enhanced Ultrasound Imaging

Nanobubbles (NBs; ~100-500 nm diameter) are preclinical ultrasound (US) contrast agents that expand applications of contrast enhanced US (CEUS). Due to their sub-micron size, high particle density, and deformable shell, NBs in pathological states of heightened vascular permeability (e.g. in tumors) extravasate, enabling applications not possible with microbubbles (~1000-10,000 nm diameter). A method that can separate intravascular versus extravascular NB signal is needed as an imaging biomarker for improved tumor detection. We present a demonstration of decorrelation time (DT) mapping for enhanced tumor NB-CEUS imaging. In vitro models validated the sensitivity of DT to agent motion. Prostate cancer mouse models validated in vivo imaging potential and sensitivity to cancerous tissue. Our findings show that DT is inversely related to NB motion, offering enhanced detail of NB dynamics in tumors, and highlighting the heterogeneity of the tumor environment. Average DT was high in tumor regions (~9 s) compared to surrounding normal tissue (~1 s) with higher sensitivity to tumor tissue compared to other mapping techniques. Molecular NB targeting to tumors further extended DT (11 s) over non-targeted NBs (6 s), demonstrating sensitivity to NB adherence. From DT mapping of in vivo NB dynamics we demonstrate the heterogeneity of tumor tissue while quantifying extravascular NB kinetics and delineating intra-tumoral vasculature. This new NB-CEUS-based biomarker can be powerful in molecular US imaging, with improved sensitivity and specificity to diseased tissue and potential for use as an estimator of vascular permeability and the enhanced permeability and retention (EPR) effect in tumors.
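One simple way to approximate a decorrelation-time map from a CEUS frame stack is to record, per pixel, the first temporal lag at which the normalized autocorrelation falls below 1/e. This NumPy sketch captures the general idea only; the published pipeline may define and estimate DT differently:

import numpy as np

def decorrelation_time_map(stack, frame_dt, threshold=1.0 / np.e):
    # stack: (T, H, W) envelope-detected frames sampled every frame_dt seconds.
    t, h, w = stack.shape
    x = stack - stack.mean(axis=0)
    denom = (x * x).sum(axis=0) + 1e-12
    dt_map = np.full((h, w), (t - 1) * frame_dt)   # pixels that never decorrelate
    unset = np.ones((h, w), dtype=bool)
    for lag in range(1, t):
        corr = (x[:-lag] * x[lag:]).sum(axis=0) / denom
        hit = unset & (corr < threshold)
        dt_map[hit] = lag * frame_dt
        unset &= ~hit
    return dt_map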

AU Kijanka, Piotr Urban, Matthew W.

Ultrasound Shear Elastography With Expanded Bandwidth (USEWEB): A Novel Method for 2D Shear Phase Velocity Imaging of Soft Tissues

Ultrasound shear wave elastography (SWE) is a noninvasive approach for evaluating mechanical properties of soft tissues. In SWE either group velocity measured in the time-domain or phase velocity measured in the frequency-domain can be reported. Frequency-domain methods have the advantage over time-domain methods in providing a response for a specific frequency, while time-domain methods average the wave velocity over the entire frequency band. Current frequency-domain approaches struggle to reconstruct SWE images over full frequency bandwidth. This is especially important in the case of viscoelastic tissues, where tissue viscoelasticity is often studied by analyzing the shear wave phase velocity dispersion. For characterizing cancerous lesions, it has been shown that considerable biases can occur with group velocity-based measurements. However, using phase velocities at higher frequencies can provide more accurate evaluations. In this paper, we propose a new method called Ultrasound Shear Elastography with Expanded Bandwidth (USEWEB) used for two-dimensional (2D) shear wave phase velocity imaging. We tested the USEWEB method on data from homogeneous tissue-mimicking liver fibrosis phantoms, custom-made viscoelastic phantom measurements, phantoms with cylindrical inclusions experiments, and in vivo renal transplants scanned with a clinical scanner. We compared results from the USEWEB method with a Local Phase Velocity Imaging (LPVI) approach over a wide frequency range, i.e., up to 200-2000 Hz. Tests carried out revealed that the USEWEB approach provides 2D phase velocity images with a coefficient of variation below 5% over a wider frequency band for smaller processing window size in comparison to LPVI, especially in viscoelastic materials. In addition, USEWEB can produce correct phase velocity images for much higher frequencies, up to 1800 Hz, compared to LPVI, which can be used to characterize viscoelastic materials and elastic inclusions.
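Frequency-domain SWE methods of this kind build on the classical 2D Fourier dispersion estimate: transform the space-time shear wave signal to the f-k plane and read the phase velocity as v(f) = f / k_peak(f). The sketch below shows that conventional baseline, not USEWEB itself:

import numpy as np

def phase_velocity_dispersion(u_xt, dx, dt, freqs):
    # u_xt: (Nt, Nx) particle motion along one lateral line.
    spec = np.abs(np.fft.fftshift(np.fft.fft2(u_xt)))
    f_axis = np.fft.fftshift(np.fft.fftfreq(u_xt.shape[0], d=dt))  # Hz
    k_axis = np.fft.fftshift(np.fft.fftfreq(u_xt.shape[1], d=dx))  # 1/m
    velocities = []
    for f in freqs:
        row = spec[np.argmin(np.abs(f_axis - f))].copy()
        row[np.abs(k_axis) < 1e-9] = 0.0   # suppress the DC column
        k_peak = abs(k_axis[np.argmax(row)])
        velocities.append(f / k_peak if k_peak > 0 else np.nan)
    return np.array(velocities)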

AU Liu, Han Xu, Zhoubing Gao, Riqiang Li, Hao Wang, Jianing Chabin, Guillaume Oguz, Ipek Grbic, Sasa

COSST: Multi-Organ Segmentation With Partially Labeled Datasets Using Comprehensive Supervisions and Self-Training

Deep learning models have demonstrated remarkable success in multi-organ segmentation but typically require large-scale datasets with all organs of interest annotated. However, medical image datasets are often low in sample size and only partially labeled, i.e., only a subset of organs are annotated. Therefore, it is crucial to investigate how to learn a unified model on the available partially labeled datasets to leverage their synergistic potential. In this paper, we systematically investigate the partial-label segmentation problem with theoretical and empirical analyses on the prior techniques. We revisit the problem from a perspective of partial label supervision signals and identify two signals derived from ground truth and one from pseudo labels. We propose a novel two-stage framework termed COSST, which effectively and efficiently integrates comprehensive supervision signals with self-training. Concretely, we first train an initial unified model using two ground truth-based signals and then iteratively incorporate the pseudo label signal to the initial model using self-training. To mitigate performance degradation caused by unreliable pseudo labels, we assess the reliability of pseudo labels via outlier detection in latent space and exclude the most unreliable pseudo labels from each self-training iteration. Extensive experiments are conducted on one public and three private partial-label segmentation tasks over 12 CT datasets. Experimental results show that our proposed COSST achieves significant improvement over the baseline method, i.e., individual networks trained on each partially labeled dataset. Compared to the state-of-the-art partial-label segmentation methods, COSST demonstrates consistent superior performance on various segmentation tasks and with different training data sizes.

AU Yang, Zhuoyue Pan, Junjun Dai, Ju Sun, Zhen Xiao, Yi

Self-Supervised Lightweight Depth Estimation in Endoscopy Combining CNN and Transformer

In recent years, an increasing number of medical engineering tasks, such as surgical navigation, pre-operative registration, and surgical robotics, rely on 3D reconstruction techniques. Self-supervised depth estimation has attracted interest in endoscopic scenarios because it does not require ground truth. Most existing methods depend on expanding the number of parameters to improve their performance. Hence, designing a lightweight self-supervised model that can obtain competitive results is a hot topic. We propose a lightweight network with a tight coupling of convolutional neural network (CNN) and Transformer for depth estimation. Unlike other methods that use CNN and Transformer to extract features separately and then fuse them on the deepest layer, we utilize the modules of CNN and Transformer to extract features at different scales in the encoder. This hierarchical structure leverages the advantages of CNN in texture perception and Transformer in shape extraction. At the same feature-extraction scale, the CNN is used to acquire local features while the Transformer encodes global information. Finally, we add multi-head attention modules to the pose network to improve the accuracy of predicted poses. Experiments demonstrate that our approach obtains comparable results while effectively compressing the model parameters on two datasets.

AU Zhang, Yiwen Li, Chuanpu Zhong, Liming Chen, Zeli Yang, Wei Wang, Xuetao

DoseDiff: Distance-aware Diffusion Model for Dose Prediction in Radiotherapy.

Treatment planning, which is a critical component of the radiotherapy workflow, is typically carried out by a medical physicist in a time-consuming trial-and-error manner. Previous studies have proposed knowledge-based or deep-learning-based methods for predicting dose distribution maps to assist medical physicists in improving the efficiency of treatment planning. However, these dose prediction methods usually fail to effectively utilize distance information between surrounding tissues and targets or organs-at-risk (OARs). Moreover, they are poor at maintaining the distribution characteristics of ray paths in the predicted dose distribution maps, resulting in a loss of valuable information. In this paper, we propose a distance-aware diffusion model (DoseDiff) for precise prediction of dose distribution. We define dose prediction as a sequence of denoising steps, wherein the predicted dose distribution map is generated with the conditions of the computed tomography (CT) image and signed distance maps (SDMs). The SDMs are obtained by distance transformation from the masks of targets or OARs, which provide the distance from each pixel in the image to the outline of the targets or OARs. We further propose a multi-encoder and multi-scale fusion network (MMFNet) that incorporates multi-scale and transformer-based fusion modules to enhance information fusion between the CT image and SDMs at the feature level. We evaluate our model on two in-house datasets and a public dataset, respectively. The results demonstrate that our DoseDiff method outperforms state-of-the-art dose prediction methods in terms of both quantitative performance and visual quality.
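The signed distance maps used as conditions can be computed directly from binary target/OAR masks with a Euclidean distance transform; a minimal SciPy sketch (the sign convention and function name are ours):

import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_distance_map(mask, spacing=None):
    # Signed Euclidean distance to the structure outline:
    # positive outside the mask, negative inside.
    mask = mask.astype(bool)
    outside = distance_transform_edt(~mask, sampling=spacing)
    inside = distance_transform_edt(mask, sampling=spacing)
    return outside - inside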

AU Lachinov, Dmitrii Chakravarty, Arunava Grechenig, Christoph Schmidt-Erfurth, Ursula Bogunovic, Hrvoje CA ADNI

Learning Spatio-Temporal Model of Disease Progression With NeuralODEs From Longitudinal Volumetric Data

Robust forecasting of the future anatomical changes inflicted by an ongoing disease is an extremely challenging task that is beyond the grasp of even experienced healthcare professionals. Such a capability, however, is of great importance since it can improve patient management by providing information on the speed of disease progression already at the admission stage, or it can enrich clinical trials with fast progressors and avoid the need for control arms by means of digital twins. In this work, we develop a deep learning method that models the evolution of age-related disease by processing a single medical scan and providing a segmentation of the target anatomy at a requested future point in time. Our method represents a time-invariant physical process and solves a large-scale problem of modeling temporal pixel-level changes utilizing NeuralODEs. In addition, we demonstrate the approaches to incorporate the prior domain-specific constraints into our method and define temporal Dice loss for learning temporal objectives. To evaluate the applicability of our approach across different age-related diseases and imaging modalities, we developed and tested the proposed method on the datasets with 967 retinal OCT volumes of 100 patients with Geographic Atrophy and 2823 brain MRI volumes of 633 patients with Alzheimer's Disease. For Geographic Atrophy, the proposed method outperformed the related baseline models in the atrophy growth prediction. For Alzheimer's Disease, the proposed method demonstrated remarkable performance in predicting the brain ventricle changes induced by the disease, achieving the state-of-the-art result on TADPOLE cross-sectional prediction challenge dataset.
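A temporal Dice loss of the kind mentioned can plausibly be written as an ordinary Dice term averaged over the requested time points; this is our hedged reading, not necessarily the authors' exact definition:

import torch

def temporal_dice_loss(pred, target, eps=1e-6):
    # pred, target: (T, B, C, H, W) predicted probabilities and reference
    # masks for the T requested future time points.
    dims = (-1, -2)  # spatial axes
    intersection = (pred * target).sum(dim=dims)
    denom = pred.sum(dim=dims) + target.sum(dim=dims)
    dice = (2.0 * intersection + eps) / (denom + eps)
    return 1.0 - dice.mean()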

AU Nie, Xinyu Ruan, Jialiang Otaduy, Maria Concepcion Garcia Grinberg, Lea Tenenholz Ringman, John Shi, Yonggang

Surface-Based Probabilistic Fiber Tracking in Superficial White Matter

The short association fibers or U-fibers travel in the superficial white matter (SWM) beneath the cortical layer. While the U-fibers play a crucial role in various brain disorders, there is a lack of effective tools to reconstruct their highly curved trajectory from diffusion MRI (dMRI). In this work, we propose a novel surface-based framework for the probabilistic tracking of fibers on the triangular mesh representation of the SWM. By deriving a closed-form solution to transform the spherical harmonics (SPHARM) coefficients of 3D fiber orientation distributions (FODs) to local coordinate systems on each triangle, we develop a novel approach to project the FODs onto the tangent space of the SWM. After that, we utilize parallel transport to realize the intrinsic propagation of streamlines on SWM following probabilistically sampled fiber directions. Our intrinsic and surface-based method eliminates the need to perform the necessary but challenging sharp turns in 3D compared with conventional volume-based tractography methods. Using data from the Human Connectome Project (HCP), we performed quantitative comparisons to demonstrate the proposed algorithm can more effectively reconstruct the U-fibers connecting the precentral and postcentral gyrus than previous methods. Quantitative validations were then performed on post-mortem MRIs to show the reconstructed U-fibers from our method more faithfully follow the SWM than volume-based tractography. Finally, we applied our algorithm to study the parietal U-fiber connectivity changes in autosomal dominant Alzheimer's disease (ADAD) patients and successfully detected significant associations between U-fiber connectivity and disease severity.

AU Su, Ting Zhu, Jiongtao Zhang, Xin Tan, Yuhang Cui, Han Zeng, Dong Guo, Jinchuan Zheng, Hairong Ma, Jianhua Liang, Dong Ge, Yongshuai

Super Resolution Dual-Energy Cone-Beam CT Imaging With Dual-Layer Flat-Panel Detector

In flat-panel detector (FPD) based cone-beam computed tomography (CBCT) imaging, the native receptor array is usually binned into a smaller matrix size. By doing so, the signal readout speed could be increased by 4-9 times at the expense of a spatial resolution loss of 50%-67%. Clearly, such manipulation poses a key bottleneck in generating high spatial and high temporal resolution CBCT images at the same time. In addition, the conventional FPD is also difficult in generating dual-energy CBCT images. In this paper, we propose an innovative super resolution dual-energy CBCT imaging method, named as suRi, based on dual-layer FPD (DL-FPD) to overcome these aforementioned difficulties at once. With suRi, specifically, a 1D or 2D sub-pixel (half pixel in this study) shifted binning is applied instead of the conventionally aligned binning to double the spatial sampling rate during the dual-energy data acquisition. As a result, the suRi approach provides a new strategy to enable high spatial resolution CBCT imaging while at high readout speed. Moreover, a penalized likelihood material decomposition algorithm is developed to directly reconstruct the high resolution bases from these dual-energy CBCT projections containing sub-pixel shifts. Numerical and physical experiments are performed to validate this newly developed suRi method with phantoms and biological specimen. Results demonstrate that suRi can significantly improve the spatial resolution of the CBCT image. We believe this developed suRi method would greatly enhance the imaging performance of the DL-FPD based dual-energy CBCT systems in future.
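The shifted-binning idea can be illustrated in NumPy: conventional 2x2 binning sums aligned blocks of native pixels, whereas the shifted readout sums blocks offset by one native pixel (half a binned pixel), so interleaving the two samplings doubles the effective sampling rate along each axis (our own illustration, not the suRi implementation):

import numpy as np

def bin_2x2(frame, offset=(0, 0)):
    # Sum non-overlapping 2x2 blocks of native pixels, starting at `offset`.
    oy, ox = offset
    f = frame[oy:, ox:]
    h, w = (f.shape[0] // 2) * 2, (f.shape[1] // 2) * 2
    f = f[:h, :w].astype(float)
    return f.reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3))

frame = np.random.default_rng(0).poisson(100.0, size=(64, 64))
aligned = bin_2x2(frame)            # conventional aligned binning
shifted = bin_2x2(frame, (1, 1))    # half-(binned-)pixel shifted binning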

AU Zhong, Liming Chen, Zeli Shu, Hai Zheng, Kaiyi Li, Yin Chen, Weicui Wu, Yuankui Ma, Jianhua Feng, Qianjin Yang, Wei

Multi-Scale Tokens-Aware Transformer Network for Multi-Region and Multi-Sequence MR-to-CT Synthesis in a Single Model

The superiority of magnetic resonance (MR)-only radiotherapy treatment planning (RTP) has been well demonstrated, benefiting from the synthesis of computed tomography (CT) images which supplements electron density and eliminates the errors of multi-modal images registration. An increasing number of methods has been proposed for MR-to-CT synthesis. However, synthesizing CT images of different anatomical regions from MR images with different sequences using a single model is challenging due to the large differences between these regions and the limitations of convolutional neural networks in capturing global context information. In this paper, we propose a multi-scale tokens-aware Transformer network (MTT-Net) for multi-region and multi-sequence MR-to-CT synthesis in a single model. Specifically, we develop a multi-scale image tokens Transformer to capture multi-scale global spatial information between different anatomical structures in different regions. Besides, to address the limited attention areas of tokens in Transformer, we introduce a multi-shape window self-attention into Transformer to enlarge the receptive fields for learning the multi-directional spatial representations. Moreover, we adopt a domain classifier in generator to introduce the domain knowledge for distinguishing the MR images of different regions and sequences. The proposed MTT-Net is evaluated on a multi-center dataset and an unseen region, and remarkable performance was achieved with MAE of 69.33 +/- 10.39 HU, SSIM of 0.778 +/- 0.028, and PSNR of 29.04 +/- 1.32 dB in head & neck region, and MAE of 62.80 +/- 7.65 HU, SSIM of 0.617 +/- 0.058 and PSNR of 25.94 +/- 1.02 dB in abdomen region. The proposed MTT-Net outperforms state-of-the-art methods in both accuracy and visual quality.

AU Zhang, Jingyang Pei, Jialun Xu, Dunyuan Jin, Yueming Heng, Pheng-Ann

DC2T: Disentanglement-Guided Consolidation and Consistency Training for Semi-Supervised Cross-Site Continual Segmentation.

Continual Learning (CL) is recognized to be a storage-efficient and privacy-protecting approach for learning from sequentially-arriving medical sites. However, most existing CL methods assume that each site is fully labeled, which is impractical due to budget and expertise constraint. This paper studies the Semi-Supervised Continual Learning (SSCL) that adopts partially-labeled sites arriving over time, with each site delivering only limited labeled data while the majority remains unlabeled. In this regard, it is challenging to effectively utilize unlabeled data under dynamic cross-site domain gaps, leading to intractable model forgetting on such unlabeled data. To address this problem, we introduce a novel Disentanglement-guided Consolidation and Consistency Training (DC2T) framework, which roots in an Online Semi-Supervised representation Disentanglement (OSSD) perspective to excavate content representations of partially labeled data from sites arriving over time. Moreover, these content representations are required to be consolidated for site-invariance and calibrated for style-robustness, in order to alleviate forgetting even in the absence of ground truth. Specifically, for the invariance on previous sites, we retain historical content representations when learning on a new site, via a Content-inspired Parameter Consolidation (CPC) method that prevents altering the model parameters crucial for content preservation. For the robustness against style variation, we develop a Style-induced Consistency Training (SCT) scheme that enforces segmentation consistency over style-related perturbations to recalibrate content encoding. We extensively evaluate our method on fundus and cardiac image segmentation, indicating the advantage over existing SSCL methods for alleviating forgetting on unlabeled data.

AU Han, Bowen Sun, Luhao Li, Chao Yu, Zhiyong Jiang, Wenzong Liu, Weifeng Tao, Dapeng Liu, Baodi

Deep Location Soft-Embedding-Based Network With Regional Scoring for Mammogram Classification

Early detection and treatment of breast cancer can significantly reduce patient mortality, and mammogram is an effective method for early screening. Computer-aided diagnosis (CAD) of mammography based on deep learning can assist radiologists in making more objective and accurate judgments. However, existing methods often depend on datasets with manual segmentation annotations. In addition, due to the large image sizes and small lesion proportions, many methods that do not use region of interest (ROI) mostly rely on multi-scale and multi-feature fusion models. These shortcomings increase the labor, money, and computational overhead of applying the model. Therefore, a deep location soft-embedding-based network with regional scoring (DLSEN-RS) is proposed. DLSEN-RS is an end-to-end mammography image classification method containing only one feature extractor and relies on positional embedding (PE) and aggregation pooling (AP) modules to locate lesion areas without bounding boxes, transfer learning, or multi-stage training. In particular, the introduced PE and AP modules exhibit versatility across various CNN models and improve the model's tumor localization and diagnostic accuracy for mammography images. Experiments are conducted on published INbreast and CBIS-DDSM datasets, and compared to previous state-of-the-art mammographic image classification methods, DLSEN-RS performed satisfactorily.

AU Bi, Ning Zakeri, Arezoo Xia, Yan Cheng, Nina Taylor, Zeike A Frangi, Alejandro F Gooya, Ali

SegMorph: Concurrent Motion Estimation and Segmentation for Cardiac MRI Sequences.

We propose a novel recurrent variational network, SegMorph, to perform concurrent segmentation and motion estimation on cardiac cine magnetic resonance image (CMR) sequences. Our model establishes a recurrent latent space that captures spatiotemporal features from cine-MRI sequences for multitask inference and synthesis. The proposed model follows a recurrent variational auto-encoder framework and adopts a learnt prior from the temporal inputs. We utilise a multi-branch decoder to handle bi-ventricular segmentation and motion estimation simultaneously. In addition to the spatiotemporal features from the latent space, motion estimation enriches the supervision of sequential segmentation tasks by providing pseudo-ground truth. On the other hand, the segmentation branch helps with motion estimation by predicting deformation vector fields (DVFs) based on anatomical information. Experimental results demonstrate that the proposed method performs better than state-of-the-art approaches qualitatively and quantitatively for both segmentation and motion estimation tasks. We achieved an 81% average Dice Similarity Coefficient (DSC) and a less than 3.5 mm average Hausdorff distance on segmentation. Meanwhile, we achieved a motion estimation Dice Similarity Coefficient of over 79%, with approximately 0.14% of pixels displaying a negative Jacobian determinant in the estimated DVFs.
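The reported fraction of folding pixels can be reproduced with a finite-difference Jacobian determinant of the displacement field; a 2D NumPy sketch (our own helper, not the paper's code):

import numpy as np

def negative_jacobian_fraction(dvf):
    # dvf: (H, W, 2) displacement field in pixels, channels ordered (u_x, u_y);
    # the deformation is phi(x) = x + u(x).
    du_dy, du_dx = np.gradient(dvf[..., 0])  # gradients along rows (y), then columns (x)
    dv_dy, dv_dx = np.gradient(dvf[..., 1])
    det = (1.0 + du_dx) * (1.0 + dv_dy) - du_dy * dv_dx
    return float((det <= 0).mean())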

AU Bui, Doanh C Song, Boram Kim, Kyungeun Kwak, Jin Tae

Spatially-constrained and -unconstrained bi-graph interaction network for multi-organ pathology image classification.

In computational pathology, graphs have been shown to be promising for pathology image analysis. There exist various graph structures that can discover differing features of pathology images. However, the combination and interaction between differing graph structures have not been fully studied and utilized for pathology image analysis. In this study, we propose a parallel, bi-graph neural network, designated as SCUBa-Net, equipped with both graph convolutional networks and Transformers, that processes a pathology image as two distinct graphs, including a spatially-constrained graph and a spatially-unconstrained graph. For efficient and effective graph learning, we introduce two inter-graph interaction blocks and an intra-graph interaction block. The inter-graph interaction blocks learn the node-to-node interactions within each graph. The intra-graph interaction block learns the graph-to-graph interactions at both global- and local-levels with the help of the virtual nodes that collect and summarize the information from the entire graphs. SCUBa-Net is systematically evaluated on four multi-organ datasets, including colorectal, prostate, gastric, and bladder cancers. The experimental results demonstrate the effectiveness of SCUBa-Net in comparison to the state-of-the-art convolutional neural networks, Transformer, and graph neural networks.

AU Deshpande, Rucha Ozbey, Muzaffer Li, Hua Anastasio, Mark A Brooks, Frank J

Assessing the capacity of a denoising diffusion probabilistic model to reproduce spatial context.

Diffusion models have emerged as a popular family of deep generative models (DGMs). In the literature, it has been claimed that one class of diffusion models, denoising diffusion probabilistic models (DDPMs), demonstrates superior image synthesis performance as compared to generative adversarial networks (GANs). To date, these claims have been evaluated using either ensemble-based methods designed for natural images, or conventional measures of image quality such as structural similarity. However, there remains an important need to understand the extent to which DDPMs can reliably learn medical imaging domain-relevant information, which is referred to as 'spatial context' in this work. To address this, a systematic assessment of the ability of DDPMs to learn spatial context relevant to medical imaging applications is reported for the first time. A key aspect of the studies is the use of stochastic context models (SCMs) to produce training data. In this way, the ability of the DDPMs to reliably reproduce spatial context can be quantitatively assessed by use of post-hoc image analyses. Error-rates in DDPM-generated ensembles are reported, and compared to those corresponding to other modern DGMs. The studies reveal new and important insights regarding the capacity of DDPMs to learn spatial context. Notably, the results demonstrate that DDPMs hold significant capacity for generating contextually correct images that are 'interpolated' between training samples, which may benefit data-augmentation tasks in ways that GANs cannot.

AU Ren, Jiahao Li, Jian Liu, Chang Chen, Shili Liang, Lin Liu, Yang

Deep Learning With Physics-Embedded Neural Network for Full Waveform Ultrasonic Brain Imaging

The convenience, safety, and affordability of ultrasound imaging make it a vital non-invasive diagnostic technique for examining soft tissues. However, significant differences in acoustic impedance between the skull and soft tissues hinder the successful application of traditional ultrasound for brain imaging. In this study, we propose a physics-embedded neural network with deep learning based full waveform inversion (PEN-FWI), which can achieve reliable quantitative imaging of brain tissues. The network consists of two fundamental components: a forward convolutional neural network (FCNN) and an inversion sub-neural network (ISNN). The FCNN explores the nonlinear mapping relationship between the brain model and the wavefield, replacing the tedious wavefield calculation process based on the finite difference method. The ISNN implements the mapping from the wavefield to the model. PEN-FWI includes three iterative steps, each embedding the FCNN into the ISNN, ultimately achieving tomography from wavefield to brain models. Simulation and laboratory tests indicate that PEN-FWI can produce high-quality imaging of the skull and soft tissues, even starting from a homogeneous water model. PEN-FWI can achieve excellent imaging of clot models with constant, uniformly distributed velocity, randomly Gaussian-distributed velocity, and irregularly shaped, randomly distributed velocity. Robust differentiation can also be achieved for brain slices of various tissues and skulls, resulting in high-quality imaging. The imaging time for a horizontal cross-sectional image of the brain is only 1.13 seconds. This algorithm can effectively promote ultrasound-based brain tomography and provide feasible solutions in other fields.
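
A minimal sketch of the three-step iterative structure, with stand-in callables in place of the trained FCNN and ISNN and a toy linear forward map:

```python
# Each iteration embeds the forward surrogate inside the inversion update,
# refining the model from a homogeneous start toward the observation.
import numpy as np

def forward_net(model):                 # stand-in for the trained FCNN
    return 0.9 * model + 0.1            # toy model-to-wavefield map

def inversion_net(wavefield, prev):     # stand-in for the ISNN update
    return prev + 0.5 * (wavefield - forward_net(prev))

observed = forward_net(np.full((8, 8), 1.5))   # synthetic observation
model = np.ones((8, 8))                        # homogeneous water start
for _ in range(3):                             # three embedded iterations
    model = inversion_net(observed, model)
print("max error after 3 iterations:", np.abs(model - 1.5).max())
```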

AU Geng, Haixiao Fan, Jingfan Yang, Shuo Chen, Sigeng Xiao, Deqiang Ai, Danni Fu, Tianyu Song, Hong Yuan, Kai Duan, Feng Wang, Yongtian Yang, Jian

DSC-Recon: Dual-Stage Complementary 4D Organ Reconstruction from X-ray Image Sequence for Intraoperative Fusion.

Accurately reconstructing 4D critical organs contributes to the visual guidance in X-ray image-guided interventional operation. Current methods estimate intraoperative dynamic meshes by refining a static initial organ mesh from the semantic information in the single-frame X-ray images. However, these methods fall short of reconstructing an accurate and smooth organ sequence due to the distinct respiratory patterns between the initial mesh and X-ray image. To overcome this limitation, we propose a novel dual-stage complementary 4D organ reconstruction (DSC-Recon) model for recovering dynamic organ meshes by utilizing the preoperative and intraoperative data with different respiratory patterns. DSC-Recon is structured as a dual-stage framework: 1) The first stage focuses on addressing a flexible interpolation network applicable to multiple respiratory patterns, which could generate dynamic shape sequences between any pair of preoperative 3D meshes segmented from CT scans. 2) In the second stage, we present a deformation network to take the generated dynamic shape sequence as the initial prior and explore the discriminate feature (i.e., target organ areas and meaningful motion information) in the intraoperative X-ray images, predicting the deformed mesh by introducing a designed feature mapping pipeline integrated into the initialized shape refinement process. Experiments on simulated and clinical datasets demonstrate the superiority of our method over state-of-the-art methods in both quantitative and qualitative aspects.
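
A naive stand-in for the stage-1 interpolation, assuming vertex correspondence is given and using linear blending in place of the learned network:

```python
# Generate a dynamic shape sequence between two preoperative phases by
# blending corresponding vertex positions; the motion here is a toy shift.
import numpy as np

verts_exhale = np.random.default_rng(0).normal(size=(200, 3))
verts_inhale = verts_exhale + np.array([0.0, 0.0, 1.0])   # toy motion
sequence = [(1 - t) * verts_exhale + t * verts_inhale
            for t in np.linspace(0.0, 1.0, 5)]            # 5 phases
print(len(sequence), sequence[0].shape)
```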

AU Lu, Yucheng Xu, Zhixin Choi, Moon Hyung Kim, Jimin Jung, Seung-Won

Cross-domain Denoising for Low-dose Multi-frame Spiral Computed Tomography.

Computed tomography (CT) has been used worldwide as a non-invasive test to assist in diagnosis. However, the ionizing nature of X-ray exposure raises concerns about potential health risks such as cancer. The desire for lower radiation doses has driven researchers to improve reconstruction quality. Although previous studies on low-dose computed tomography (LDCT) denoising have demonstrated the effectiveness of learning-based methods, most were developed on simulated data. However, the real-world scenario differs significantly from the simulation domain, especially when using the multi-slice spiral scanner geometry. This paper proposes a two-stage method for commercially available multi-slice spiral CT scanners that better exploits the complete reconstruction pipeline for LDCT denoising across different domains. Our approach makes good use of the high redundancy of multi-slice projections and the volumetric reconstructions while alleviating the over-smoothing issue in conventional cascaded frameworks caused by aggressive denoising. The dedicated design also provides a more explicit interpretation of the data flow. Extensive experiments on various datasets showed that the proposed method could remove up to 70% of noise without compromised spatial resolution, while subjective evaluations by two experienced radiologists further supported its superior performance against state-of-the-art methods in clinical practice. Code is available at https://github.com/YCL92/TMD-LDCT.

AU Li, Yunxiang Shao, Hua-Chieh Liang, Xiao Chen, Liyuan Li, Ruiqi Jiang, Steve Wang, Jing Zhang, You

Zero-Shot Medical Image Translation via Frequency-Guided Diffusion Models

Recently, the diffusion model has emerged as a superior generative model that can produce high quality and realistic images. However, for medical image translation, the existing diffusion models are deficient in accurately retaining structural information since the structure details of source domain images are lost during the forward diffusion process and cannot be fully recovered through learned reverse diffusion, while the integrity of anatomical structures is extremely important in medical images. For instance, errors in image translation may distort, shift, or even remove structures and tumors, leading to incorrect diagnosis and inadequate treatments. Training and conditioning diffusion models using paired source and target images with matching anatomy can help. However, such paired data are very difficult and costly to obtain, and may also reduce the robustness of the developed model to out-of-distribution testing data. We propose a frequency-guided diffusion model (FGDM) that employs frequency-domain filters to guide the diffusion model for structure-preserving image translation. Based on its design, FGDM allows zero-shot learning, as it can be trained solely on the data from the target domain, and used directly for source-to-target domain translation without any exposure to the source-domain data during training. We evaluated it on three cone-beam CT (CBCT)-to-CT translation tasks for different anatomical sites, and a cross-institutional MR imaging translation task. FGDM outperformed the state-of-the-art methods (GAN-based, VAE-based, and diffusion-based) in metrics of Frechet Inception Distance (FID), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity Index Measure (SSIM), showing its significant advantages in zero-shot medical image translation.
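
The frequency-guidance idea can be caricatured with a plain FFT low-pass filter: the low-frequency (structural) band of the source image is kept, and a generative step, here a stand-in noise term, supplies the rest; the cutoff is an assumed hyperparameter:

```python
# Keep the structural band of the source image fixed across domains and
# let a generative component fill in only the high-frequency content.
import numpy as np

def lowpass(img, cutoff=0.1):
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    mask = ((yy - h / 2) ** 2 + (xx - w / 2) ** 2) <= (cutoff * min(h, w)) ** 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))

source = np.random.default_rng(0).normal(size=(64, 64))
structure = lowpass(source)                       # preserved anatomy band
generated = structure + 0.1 * np.random.default_rng(1).normal(size=source.shape)
```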

AU Lee, Seungeun Lee, Seunghwan Willbrand, Ethan H Parker, Benjamin J Bunge, Silvia A Weiner, Kevin S Lyu, Ilwoo

Leveraging Input-Level Feature Deformation with Guided-Attention for Sulcal Labeling.

The identification of cortical sulci is key for understanding functional and structural development of the cortex. While large, consistent sulci (or primary/secondary sulci) receive significant attention in most studies, the exploration of smaller and more variable sulci (or putative tertiary sulci) remains relatively under-investigated. Despite its importance, automatic labeling of cortical sulci is challenging due to (1) the presence of substantial anatomical variability, (2) the relatively small size of the regions of interest (ROIs) compared to unlabeled regions, and (3) the scarcity of annotated labels. In this paper, we propose a novel end-to-end learning framework using a spherical convolutional neural network (CNN). Specifically, the proposed method learns to effectively warp geometric features in a direction that facilitates the labeling of sulci while mitigating the impact of anatomical variability. Moreover, we introduce a guided-attention mechanism that takes into account the extent of deformation induced by the learned warping. This extracts discriminative features that emphasize sulcal ROIs, while suppressing irrelevant information of unlabeled regions. In the experiments, we evaluate the proposed method on 8 sulci of the posterior medial cortex. Our method outperforms existing methods particularly in the putative tertiary sulci. The code is publicly available at https://github.com/Shape-Lab/DSPHARM-Net.

EI 1558-254X DA 2024-09-28 UT MEDLINE:39325613 PM 39325613 ER

AU Chen, Xiongchao Zhou, Bo Guo, Xueqi Xie, Huidong Liu, Qiong Duncan, James S. Sinusas, Albert J. Liu, Chi

DuDoCFNet: Dual-Domain Coarse-to-Fine Progressive Network for Simultaneous Denoising, Limited-View Reconstruction, and Attenuation Correction of Cardiac SPECT

Single-Photon Emission Computed Tomography (SPECT) is widely applied for the diagnosis of coronary artery diseases. Low-dose (LD) SPECT aims to minimize radiation exposure but leads to increased image noise. Limited-view (LV) SPECT, such as the latest GE MyoSPECT ES system, enables accelerated scanning and reduces hardware expenses but degrades reconstruction accuracy. Additionally, Computed Tomography (CT) is commonly used to derive attenuation maps ($\mu$-maps) for attenuation correction (AC) of cardiac SPECT, but it will introduce additional radiation exposure and SPECT-CT misalignments. Although various methods have been developed to solely focus on LD denoising, LV reconstruction, or CT-free AC in SPECT, the solution for simultaneously addressing these tasks remains challenging and under-explored. Furthermore, it is essential to explore the potential of fusing cross-domain and cross-modality information across these interrelated tasks to further enhance the accuracy of each task. Thus, we propose a Dual-Domain Coarse-to-Fine Progressive Network (DuDoCFNet), a multi-task learning method for simultaneous LD denoising, LV reconstruction, and CT-free $\mu$-map generation of cardiac SPECT. Paired dual-domain networks in DuDoCFNet are cascaded using a multi-layer fusion mechanism for cross-domain and cross-modality feature fusion. Two-stage progressive learning strategies are applied in both projection and image domains to achieve coarse-to-fine estimations of SPECT projections and CT-derived $\mu$-maps. Our experiments demonstrate DuDoCFNet's superior accuracy in estimating projections, generating $\mu$-maps, and AC reconstructions compared to existing single- or multi-task learning methods, under various iterations and LD levels. The source code of this work is available at https://github.com/XiongchaoChen/DuDoCFNet-MultiTask.

AU Ghoul, Aya Pan, Jiazhen Lingg, Andreas Kuebler, Jens Krumm, Patrick Hammernik, Kerstin Rueckert, Daniel Gatidis, Sergios Kuestner, Thomas

Attention-Aware Non-Rigid Image Registration for Accelerated MR Imaging

Accurate motion estimation at high acceleration factors enables rapid motion-compensated reconstruction in Magnetic Resonance Imaging (MRI) without compromising the diagnostic image quality. In this work, we introduce an attention-aware deep learning-based framework that can perform non-rigid pairwise registration for fully sampled and accelerated MRI. We extract local visual representations to build similarity maps between the registered image pairs at multiple resolution levels and additionally leverage long-range contextual information using a transformer-based module to alleviate ambiguities in the presence of artifacts caused by undersampling. We combine local and global dependencies to perform simultaneous coarse and fine motion estimation. The proposed method was evaluated on in-house acquired fully sampled and accelerated data of 101 patients and 62 healthy subjects undergoing cardiac and thoracic MRI. The impact of motion estimation accuracy on the downstream task of motion-compensated reconstruction was analyzed. We demonstrate that our model derives reliable and consistent motion fields across different sampling trajectories (Cartesian and radial) and acceleration factors of up to 16x for cardiac motion and 30x for respiratory motion and achieves superior image quality in motion-compensated reconstruction qualitatively and quantitatively compared to conventional and recent deep learning-based approaches.

AU Ma, Wenao Chen, Cheng Gong, Yuqi Chan, Nga Yan Jiang, Meirui Mak, Calvin Hoi-Kwan Abrigo, Jill M. Dou, Qi

Causal Effect Estimation on Imaging and Clinical Data for Treatment Decision Support of Aneurysmal Subarachnoid Hemorrhage

Aneurysmal subarachnoid hemorrhage is a medical emergency of brain that has high mortality and poor prognosis. Causal effect estimation of treatment strategies on patient outcomes is crucial for aneurysmal subarachnoid hemorrhage treatment decision-making. However, most existing studies on treatment decision-making support of this disease are unable to simultaneously compare the potential outcomes of different treatments for a patient. Furthermore, these studies fail to harmoniously integrate the imaging data with non-imaging clinical data, both of which are useful in clinical scenarios. In this paper, we estimate the causal effect of various treatments on patients with aneurysmal subarachnoid hemorrhage by integrating plain CT with non-imaging clinical data, which is represented using structured tabular data. Specifically, we first propose a novel scheme that uses multi-modality confounders distillation architecture to predict the treatment outcome and treatment assignment simultaneously. With these distilled confounder features, we design an imaging and non-imaging interaction representation learning strategy to use the complementary information extracted from different modalities to balance the feature distribution of different treatment groups. We have conducted extensive experiments using a clinical dataset of 656 subarachnoid hemorrhage cases, which was collected from the Hospital Authority Data Collaboration Laboratory in Hong Kong. Our method shows consistent improvements on the evaluation metrics of treatment effect estimation, achieving state-of-the-art results over strong competitors. Code is released at https://github.com/med-air/TOP-aSAH.
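
For intuition, a minimal potential-outcomes baseline (a generic T-learner on synthetic tabular data, not the multi-modality confounder-distillation model described above):

```python
# Fit separate outcome models per treatment arm, then compare the two
# predicted potential outcomes for every patient. Data are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))                   # confounders (tabular)
t = rng.integers(0, 2, 500)                     # treatment assignment
y = (X[:, 0] + 0.8 * t + rng.normal(0, 1, 500) > 0).astype(int)

m0 = LogisticRegression().fit(X[t == 0], y[t == 0])
m1 = LogisticRegression().fit(X[t == 1], y[t == 1])
cate = m1.predict_proba(X)[:, 1] - m0.predict_proba(X)[:, 1]
print("mean estimated treatment effect:", cate.mean())
```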

AU Liu, Huabing Huang, Jiawei Jia, Dengqiang Wang, Qian Xu, Jun Shen, Dinggang

Transferring Adult-like Phase Images for Robust Multi-view Isointense Infant Brain Segmentation.

Accurate tissue segmentation of infant brain in magnetic resonance (MR) images is crucial for charting early brain development and identifying biomarkers. Due to ongoing myelination and maturation, in the isointense phase (6-9 months of age), the gray and white matters of infant brain exhibit similar intensity levels in MR images, posing significant challenges for tissue segmentation. Meanwhile, in the adult-like phase around 12 months of age, the MR images show high tissue contrast and can be easily segmented. In this paper, we propose to effectively exploit adult-like phase images to achieve robust multi-view isointense infant brain segmentation. Specifically, on the one hand, we transfer adult-like phase images to the isointense view, which have similar tissue contrast as the isointense phase images, and use the transferred images to train an isointense-view segmentation network. On the other hand, we transfer isointense phase images to the adult-like view, which have enhanced tissue contrast, for training a segmentation network in the adult-like view. The segmentation networks of different views form a multi-path architecture that performs multi-view learning to further boost the segmentation performance. Since anatomy-preserving style transfer is key to the downstream segmentation task, we develop a Disentangled Cycle-consistent Adversarial Network (DCAN) with strong regularization terms to accurately transfer realistic tissue contrast between isointense and adult-like phase images while still maintaining their structural consistency. Experiments on both NDAR and iSeg-2019 datasets demonstrate a significant superior performance of our method over the state-of-the-art methods.

AU Chen, Tao Wang, Chenhui Chen, Zhihao Lei, Yiming Shan, Hongming

HiDiff: Hybrid Diffusion Framework for Medical Image Segmentation.

Medical image segmentation has been significantly advanced with the rapid development of deep learning (DL) techniques. Existing DL-based segmentation models are typically discriminative; i.e., they aim to learn a mapping from the input image to segmentation masks. However, these discriminative methods neglect the underlying data distribution and intrinsic class characteristics, suffering from unstable feature space. In this work, we propose to complement discriminative segmentation methods with the knowledge of underlying data distribution from generative models. To that end, we propose a novel hybrid diffusion framework for medical image segmentation, termed HiDiff, which can synergize the strengths of existing discriminative segmentation models and new generative diffusion models. HiDiff comprises two key components: discriminative segmentor and diffusion refiner. First, we utilize any conventional trained segmentation models as discriminative segmentor, which can provide a segmentation mask prior for diffusion refiner. Second, we propose a novel binary Bernoulli diffusion model (BBDM) as the diffusion refiner, which can effectively, efficiently, and interactively refine the segmentation mask by modeling the underlying data distribution. Third, we train the segmentor and BBDM in an alternate-collaborative manner to mutually boost each other. Extensive experimental results on abdomen organ, brain tumor, polyps, and retinal vessels segmentation datasets, covering four widely-used modalities, demonstrate the superior performance of HiDiff over existing medical segmentation algorithms, including the state-of-the-art transformer- and diffusion-based ones. In addition, HiDiff excels at segmenting small objects and generalizing to new datasets. Source codes are made available at https://github.com/takimailto/HiDiff.
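
A toy forward process for a binary (Bernoulli) diffusion on a segmentation mask, illustrating the kind of discrete noising such a refiner inverts; the flip-rate schedule is an assumption:

```python
# Each step flips every pixel independently with a small probability, so
# the binary mask drifts toward pure noise over the schedule.
import numpy as np

rng = np.random.default_rng(0)
mask = (rng.uniform(size=(32, 32)) < 0.2).astype(np.uint8)  # stand-in mask

def bernoulli_step(m, flip_p):
    flips = rng.uniform(size=m.shape) < flip_p
    return np.where(flips, 1 - m, m)

noisy = mask.copy()
for p in np.linspace(0.01, 0.15, 10):       # increasing flip rates
    noisy = bernoulli_step(noisy, p)
print("fraction of pixels flipped:", np.mean(noisy != mask))
```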

AU Li, Zekun Benabdallah, Nadia Laforest, Richard Wahl, Richard L Thorek, Daniel L J Jha, Abhinav K

Joint regional uptake quantification of thorium-227 and radium-223 using a multiple-energy-window projection-domain quantitative SPECT method.

Thorium-227 (227Th)-based alpha-particle radiopharmaceutical therapies (alpha-RPTs) are currently being investigated in several clinical and pre-clinical studies. After administration, 227Th decays to 223Ra, another alpha-particle-emitting isotope, which redistributes within the patient. Reliable dose quantification of both 227Th and 223Ra is clinically important, and SPECT may perform this quantification as these isotopes also emit X- and gamma-ray photons. However, reliable quantification is challenging for several reasons: the orders-of-magnitude lower activity compared to conventional SPECT, resulting in a very low number of detected counts, the presence of multiple photopeaks, substantial overlap in the emission spectra of these isotopes, and the image-degrading effects in SPECT. To address these issues, we propose a multiple-energy-window projection-domain quantification (MEW-PDQ) method that jointly estimates the regional activity uptake of both 227Th and 223Ra directly using the SPECT projection data from multiple energy windows. We evaluated the method with realistic simulation studies conducted with anthropomorphic digital phantoms, including a virtual imaging trial, in the context of imaging patients with bone metastases of prostate cancer who were treated with 227Th-based alpha-RPTs. The proposed method yielded reliable (accurate and precise) regional uptake estimates of both isotopes and outperformed state-of-the-art methods across different lesion sizes and contrasts, as well as in the virtual imaging trial. This reliable performance was also observed with moderate levels of intra-regional heterogeneous uptake as well as when there were moderate inaccuracies in the definitions of the support of various regions. Additionally, we demonstrated the effectiveness of using multiple energy windows and the variance of the estimated uptake using the proposed method approached the Cramer-Rao-lower-bound-defined theoretical limit. These results provide strong evidence in support of this method for reliable uptake quantification in 227Th-based alpha-RPTs.
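
A back-of-envelope version of joint multi-window quantification: counts in each energy window are modeled as Poisson with rates linear in the two regional activities, and a joint maximum-likelihood fit recovers both; the sensitivity matrix below is invented for illustration:

```python
# Joint MLE of two isotope activities from overlapping energy windows.
import numpy as np
from scipy.optimize import minimize

S = np.array([[0.9, 0.2],    # window 1 sensitivity to (227Th, 223Ra)
              [0.4, 0.7],    # window 2
              [0.1, 0.8]])   # window 3
true = np.array([50.0, 30.0])
counts = np.random.default_rng(0).poisson(S @ true)

def nll(a):                   # Poisson negative log-likelihood
    lam = S @ a
    return np.sum(lam - counts * np.log(lam))

est = minimize(nll, x0=[10.0, 10.0], bounds=[(1e-6, None)] * 2).x
print("estimated regional uptakes:", est)
```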

AU Li, Shiyu Qiao, Pengchong Wang, Lin Ning, Munan Yuan, Li Zheng, Yefeng Chen, Jie

An Organ-aware Diagnosis Framework for Radiology Report Generation.

Radiology report generation (RRG) is crucial to save the valuable time of radiologists in drafting the report, therefore increasing their work efficiency. Compared to typical methods that directly transfer image captioning technologies to RRG, our approach incorporates organ-wise priors into the report generation. Specifically, in this paper, we propose Organ-aware Diagnosis (OaD) to generate diagnostic reports containing descriptions of each physiological organ. During training, we first develop a task distillation (TD) module to extract organ-level descriptions from reports. We then introduce an organ-aware report generation module that, for one thing, provides a specific description for each organ, and for another, simulates clinical situations to provide short descriptions for normal cases. Furthermore, we design an auto-balance mask loss to ensure balanced training for normal/abnormal descriptions and various organs simultaneously. Being intuitively reasonable and practically simple, our OaD outperforms SOTA alternatives by large margins on commonly used IU-Xray and MIMIC-CXR datasets, as evidenced by a 3.4% BLEU-1 improvement on MIMIC-CXR and 2.0% BLEU-2 improvement on IU-Xray.

AU He, Hailong Paetzold, Johannes C. Boerner, Nils Riedel, Erik Gerl, Stefan Schneider, Simon Fisher, Chiara Ezhov, Ivan Shit, Suprosanna Li, Hongwei Ruckert, Daniel Aguirre, Juan Biedermann, Tilo Darsow, Ulf Menze, Bjoern Ntziachristos, Vasilis

Machine Learning Analysis of Human Skin by Optoacoustic Mesoscopy for Automated Extraction of Psoriasis and Aging Biomarkers

Ultra-wideband raster-scan optoacoustic mesoscopy (RSOM) is a novel modality that has demonstrated unprecedented ability to visualize epidermal and dermal structures in-vivo. However, an automatic and quantitative analysis of three-dimensional RSOM datasets remains unexplored. In this work we present our framework: Deep Learning RSOM Analysis Pipeline (DeepRAP), to analyze and quantify morphological skin features recorded by RSOM and extract imaging biomarkers for disease characterization. DeepRAP uses a multi-network segmentation strategy based on convolutional neural networks with transfer learning. This strategy enabled the automatic recognition of skin layers and subsequent segmentation of dermal microvasculature with an accuracy equivalent to human assessment. DeepRAP was validated against manual segmentation on 25 psoriasis patients under treatment and our biomarker extraction was shown to characterize disease severity and progression well with a strong correlation to physician evaluation and histology. In a unique validation experiment, we applied DeepRAP in a time series sequence of occlusion-induced hyperemia from 10 healthy volunteers. We observe how the biomarkers decrease and recover during the occlusion and release process, demonstrating accurate performance and reproducibility of DeepRAP. Furthermore, we analyzed a cohort of 75 volunteers and defined a relationship between aging and microvascular features in-vivo. More precisely, this study revealed that fine microvascular features in the dermal layer have the strongest correlation to age. The ability of our newly developed framework to enable the rapid study of human skin morphology and microvasculature in-vivo promises to replace biopsy studies, increasing the translational potential of RSOM.

AU Wang, Hongyi Luo, Luyang Wang, Fang Tong, Ruofeng Chen, Yen-Wei Hu, Hongjie Lin, Lanfen Chen, Hao

Rethinking Multiple Instance Learning for Whole Slide Image Classification: A Bag-Level Classifier is a Good Instance-Level Teacher.

Multiple Instance Learning (MIL) has demonstrated promise in Whole Slide Image (WSI) classification. However, a major challenge persists due to the high computational cost associated with processing these gigapixel images. Existing methods generally adopt a two-stage approach, comprising a non-learnable feature embedding stage and a classifier training stage. Though it can greatly reduce memory consumption by using a fixed feature embedder pre-trained on other domains, such a scheme also results in a disparity between the two stages, leading to suboptimal classification accuracy. To address this issue, we propose that a bag-level classifier can be a good instance-level teacher. Based on this idea, we design Iteratively Coupled Multiple Instance Learning (ICMIL) to couple the embedder and the bag classifier at a low cost. ICMIL initially fixes the patch embedder to train the bag classifier, followed by fixing the bag classifier to fine-tune the patch embedder. The refined embedder can then generate better representations in return, leading to a more accurate classifier for the next iteration. To realize more flexible and more effective embedder fine-tuning, we also introduce a teacher-student framework to efficiently distill the category knowledge in the bag classifier to help the instance-level embedder fine-tuning. Intensive experiments were conducted on four distinct datasets to validate the effectiveness of ICMIL. The experimental results consistently demonstrated that our method significantly improves the performance of existing MIL backbones, achieving state-of-the-art results. The code and the organized datasets can be accessed by: https://github.com/Dootmaan/ICMIL/tree/confidence-based.
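
A schematic of the alternating coupling with tiny stand-in modules; the architecture, pooling, dimensions, and data are placeholders rather than the published ICMIL networks:

```python
# Alternate: train the bag classifier with the embedder frozen, then
# fine-tune the embedder with the classifier frozen, and repeat.
import torch
import torch.nn as nn

embedder = nn.Linear(32, 16)                    # stand-in patch embedder
classifier = nn.Linear(16, 1)                   # stand-in bag-level head

bags = [torch.randn(50, 32) for _ in range(8)]  # 8 bags of 50 instances
labels = torch.randint(0, 2, (8,)).float()

def bag_logit(bag):
    inst = embedder(bag)                        # instance embeddings
    return classifier(inst.mean(dim=0))         # mean-pooled bag logit

loss_fn = nn.BCEWithLogitsLoss()
for train_mod, frozen_mod in [(classifier, embedder), (embedder, classifier)] * 2:
    for p in frozen_mod.parameters():
        p.requires_grad_(False)
    for p in train_mod.parameters():
        p.requires_grad_(True)
    opt = torch.optim.Adam(train_mod.parameters(), lr=1e-2)
    for bag, y in zip(bags, labels):
        opt.zero_grad()
        loss_fn(bag_logit(bag).squeeze(), y).backward()
        opt.step()
```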

AU Gras, V. Boulant, N. Luong, M. Morel, L. Le Touz, N. Adam, J. -P. Joly, J. -C.

A Mathematical Analysis of Clustering-Free Local SAR Compression Algorithms for MRI Safety in Parallel Transmission

Parallel transmission (pTX) is a versatile solution to enable UHF MRI of the human body, where radiofrequency (RF) field inhomogeneity appears very challenging. Today, state-of-the-art monitoring of the local SAR in pTX consists in evaluating the RF power deposition on specific SAR matrices called Virtual Observation Points (VOPs). It essentially relies on accurate electromagnetic simulations able to return the local SAR distribution inside the body in response to any applied pTX RF waveform. In order to reduce the number of SAR matrices to a value compatible with real-time SAR monitoring ($\ll 10^{3}$), a VOP set is obtained by partitioning the SAR model into clusters, and associating a so-called dominant SAR matrix to every cluster. More recently, a clustering-free compression method was proposed, allowing for a significant reduction in the number of SAR matrices. The concept and derivation however assumed static RF shims, and their extension to dynamic pTX is not straightforward, thereby casting doubt on the strict validity of the compression approach for these more complicated RF waveforms. In this work, we provide the mathematical framework to tackle this problem and find a rigorous justification of this criterion in the light of convex optimization theory. Our analysis led us to a variant of the clustering-free compression approach exploiting convex optimization. This new compression algorithm offers computational gains for large SAR models and for high-channel-count pTX RF coils.
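
For reference, the quantity being compressed is a set of quadratic forms: for a pTX shim vector x, the local SAR associated with a VOP matrix Q is $x^{H} Q x$, and the monitor takes the maximum over all retained matrices. The matrices below are random Hermitian positive semi-definite stand-ins:

```python
# Evaluate local SAR as a quadratic form over every retained VOP matrix
# and report the worst case, as a real-time monitor would.
import numpy as np

rng = np.random.default_rng(0)
n_ch, n_vop = 8, 100
A = rng.normal(size=(n_vop, n_ch, n_ch)) + 1j * rng.normal(size=(n_vop, n_ch, n_ch))
Q = A.conj().transpose(0, 2, 1) @ A            # Hermitian PSD SAR matrices

x = rng.normal(size=n_ch) + 1j * rng.normal(size=n_ch)   # pTX shim weights
sar = np.real(np.einsum('i,kij,j->k', x.conj(), Q, x))   # x^H Q_k x per VOP
print("worst-case local SAR:", sar.max())
```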

AU Ranjbaran, Seyed Mohsen Aghamiry, Hossein S. Gholami, Ali Operto, Stephane Avanaki, Kamran

Quantitative Photoacoustic Tomography Using Iteratively Refined Wavefield Reconstruction Inversion: A Simulation Study

The ultimate goal of photoacoustic tomography is to accurately map the absorption coefficient throughout the imaged tissue. Most studies either assume that acoustic properties of biological tissues such as speed of sound (SOS) and acoustic attenuation are homogeneous or fluence is uniform throughout the entire tissue. These assumptions reduce the accuracy of estimations of derived absorption coefficients (DeACs). Our quantitative photoacoustic tomography (qPAT) method estimates DeACs using iteratively refined wavefield reconstruction inversion (IR-WRI) which incorporates the alternating direction method of multipliers to solve the cycle skipping challenge associated with full wave inversion algorithms. Our method compensates for SOS inhomogeneity, fluence decay, and acoustic attenuation. We evaluate the performance of our method on a neonatal head digital phantom.

AU Xing, Paul Poree, Jonathan Rauby, Brice Malescot, Antoine Martineau, Eric Perrot, Vincent Rungta, Ravi L. Provost, Jean

Phase Aberration Correction for In Vivo Ultrasound Localization Microscopy Using a Spatiotemporal Complex-Valued Neural Network

Ultrasound Localization Microscopy (ULM) can map microvessels at a resolution of a few micrometers ($\mu$m). Transcranial ULM remains challenging in presence of aberrations caused by the skull, which lead to localization errors. Herein, we propose a deep learning approach based on recently introduced complex-valued convolutional neural networks (CV-CNNs) to retrieve the aberration function, which can then be used to form enhanced images using standard delay-and-sum beamforming. CV-CNNs were selected as they can apply time delays through multiplication with in-phase quadrature input data. Predicting the aberration function rather than corrected images also confers enhanced explainability to the network. In addition, 3D spatiotemporal convolutions were used for the network to leverage entire microbubble tracks. For training and validation, we used an anatomically and hemodynamically realistic mouse brain microvascular network model to simulate the flow of microbubbles in presence of aberration. The proposed CV-CNN performance was compared to the coherence-based method by using microbubble tracks. We then confirmed the capability of the proposed network to generalize to transcranial in vivo data in the mouse brain (n=3). Vascular reconstructions using a locally predicted aberration function included additional and sharper vessels. The CV-CNN was more robust than the coherence-based method and could perform aberration correction in a 6-month-old mouse. After correction, we measured a resolution of 15.6 $\mu$m for younger mice, representing an improvement of 25.8%, while the resolution was improved by 13.9% for the 6-month-old mouse. This work leads to different applications for complex-valued convolutions in biomedical imaging and strategies to perform transcranial ULM.
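
The complex-valued convolution primitive can be sketched from two real convolutions via $(a+ib)(w+iv) = (aw - bv) + i(av + bw)$; channel counts and shapes below are placeholders:

```python
# A complex 2D convolution assembled from two real convolutions, the
# building block that lets a network apply time delays to IQ data.
import torch
import torch.nn as nn

class ComplexConv2d(nn.Module):
    def __init__(self, cin, cout, k):
        super().__init__()
        self.re = nn.Conv2d(cin, cout, k, padding=k // 2, bias=False)
        self.im = nn.Conv2d(cin, cout, k, padding=k // 2, bias=False)

    def forward(self, z):                       # z: complex (B, C, H, W)
        a, b = z.real, z.imag
        return torch.complex(self.re(a) - self.im(b),
                             self.re(b) + self.im(a))

iq = torch.randn(1, 2, 64, 64, dtype=torch.cfloat)   # toy IQ channel data
out = ComplexConv2d(2, 4, 3)(iq)
print(out.shape, out.dtype)
```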

AU Liu, Xuan Xie, Yaoqin Diao, Songhui Tan, Shan Liang, Xiaokun

Unsupervised CT Metal Artifact Reduction by Plugging Diffusion Priors in Dual Domains.

During the process of computed tomography (CT), metallic implants often cause disruptive artifacts in the reconstructed images, impeding accurate diagnosis. Many supervised deep learning-based approaches have been proposed for metal artifact reduction (MAR). However, these methods heavily rely on training with paired simulated data, which are challenging to acquire. This limitation can lead to decreased performance when applying these methods in clinical practice. Existing unsupervised MAR methods, whether based on learning or not, typically work within a single domain, either in the image domain or the sinogram domain. In this paper, we propose an unsupervised MAR method based on the diffusion model, a generative model with a high capacity to represent data distributions. Specifically, we first train a diffusion model using CT images without metal artifacts. Subsequently, we iteratively introduce the diffusion priors in both the sinogram domain and image domain to restore the degraded portions caused by metal artifacts. Besides, we design temporally dynamic weight masks for the image-domain fusion. The dual-domain processing empowers our approach to outperform existing unsupervised MAR methods, including another MAR method based on diffusion model. The effectiveness has been qualitatively and quantitatively validated on synthetic datasets. Moreover, our method demonstrates superior visual results among both supervised and unsupervised methods on clinical datasets. Codes are available at github.com/DeepXuan/DuDoDp-MAR.

AU Beuret, Samuel Thiran, Jean-Philippe

Windowed Radon Transform and Tensor Rank-1 Decomposition for Adaptive Beamforming in Ultrafast Ultrasound

Ultrafast ultrasound has recently emerged as an alternative to traditional focused ultrasound. By virtue of the low number of insonifications it requires, ultrafast ultrasound enables the imaging of the human body at potentially very high frame rates. However, unaccounted for speed-of-sound variations in the insonified medium often result in phase aberrations in the reconstructed images. The diagnosis capability of ultrafast ultrasound is thus ultimately impeded. Therefore, there is a strong need for adaptive beamforming methods that are resilient to speed-of-sound aberrations. Several of such techniques have been proposed recently but they often lack parallelizability or the ability to directly correct both transmit and receive phase aberrations. In this article, we introduce an adaptive beamforming method designed to address these shortcomings. To do so, we compute the windowed Radon transform of several complex radio-frequency images reconstructed using delay-and-sum. Then, we apply to the obtained local sinograms weighted tensor rank-1 decompositions and their results are eventually used to reconstruct a corrected image. We demonstrate using simulated and in-vitro data that our method is able to successfully recover aberration-free images and that it outperforms both coherent compounding and the recently introduced SVD beamformer. Finally, we validate the proposed beamforming technique on in-vivo data, resulting in a significant improvement of image quality compared to the two reference methods.
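
A tiny rank-1 decomposition of a 3-way tensor by alternating (higher-order) power iteration, the generic building block behind the weighted decompositions applied to the local sinograms; the tensor here is synthetic:

```python
# Recover the dominant rank-1 component (a, b, c, sigma) of a noisy
# 3-way tensor by alternating normalized contractions.
import numpy as np

rng = np.random.default_rng(0)
u, v, w = rng.normal(size=5), rng.normal(size=6), rng.normal(size=7)
T = np.einsum('i,j,k->ijk', u, v, w) + 0.01 * rng.normal(size=(5, 6, 7))

a, b, c = np.ones(5), np.ones(6), np.ones(7)
for _ in range(50):
    a = np.einsum('ijk,j,k->i', T, b, c); a /= np.linalg.norm(a)
    b = np.einsum('ijk,i,k->j', T, a, c); b /= np.linalg.norm(b)
    c = np.einsum('ijk,i,j->k', T, a, b); c /= np.linalg.norm(c)
sigma = np.einsum('ijk,i,j,k->', T, a, b, c)   # rank-1 weight
print("recovered weight:", sigma)
```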

AU Nguyen, Huy Hoang Blaschko, Matthew B. Saarakkala, Simo Tiulpin, Aleksei

Clinically-Inspired Multi-Agent Transformers for Disease Trajectory Forecasting From Multimodal Data

Deep neural networks are often applied to medical images to automate the problem of medical diagnosis. However, a more clinically relevant question that practitioners usually face is how to predict the future trajectory of a disease. Current methods for prognosis or disease trajectory forecasting often require domain knowledge and are complicated to apply. In this paper, we formulate the prognosis prediction problem as a one-to-many prediction problem. Inspired by a clinical decision-making process with two agents, a radiologist and a general practitioner, we predict prognosis with two transformer-based components that share information with each other. The first transformer in this framework aims to analyze the imaging data, and the second one leverages its internal states as inputs, also fusing them with auxiliary clinical data. The temporal nature of the problem is modeled within the transformer states, allowing us to treat the forecasting problem as a multi-task classification, for which we propose a novel loss. We show the effectiveness of our approach in predicting the development of structural knee osteoarthritis changes and forecasting Alzheimer's disease clinical status directly from raw multi-modal data. The proposed method outperforms multiple state-of-the-art baselines with respect to performance and calibration, both of which are needed for real-world applications. An open-source implementation of our method is made publicly available at https://github.com/Oulu-IMEDS/CLIMATv2.

AU Wu, Lingyun Gao, Xiang Hu, Zhiqiang Zhang, Shaoting

Pattern-Aware Transformer: Hierarchical Pattern Propagation in Sequential Medical Images

This paper investigates how to effectively mine contextual information among sequential images and jointly model them in medical imaging tasks. Different from state-of-the-art methods that model sequential correlations via point-wise token encoding, this paper develops a novel hierarchical pattern-aware tokenization strategy. It handles distinct visual patterns independently and hierarchically, which not only ensures the full flexibility of attention aggregation under different pattern representations but also preserves both local and global information simultaneously. Based on this strategy, we propose a Pattern-Aware Transformer (PATrans) featuring a global-local dual-path pattern-aware cross-attention mechanism to achieve hierarchical pattern matching and propagation among sequential images. Furthermore, PATrans is plug-and-play and can be seamlessly integrated into various backbone networks for diverse downstream sequence modeling tasks. We demonstrate its general application paradigm across four domains and five benchmarks in video object detection and 3D volumetric semantic segmentation tasks, respectively. Impressively, PATrans sets new state-of-the-art across all these benchmarks, i.e., CVC-Video (92.3% detection F1), ASU-Mayo (99.1% localization F1), Lung Tumor (78.59% DSC), Nasopharynx Tumor (75.50% DSC), and Kidney Tumor (87.53% DSC). Codes and models are available at https://github.com/GGaoxiang/PATrans.

C1 SenseTime Res, Shanghai 200233, Peoples R China C1 Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China C3 SenseTime Res C3 Shanghai Artificial Intelligence Lab SN 0278-0062 EI 1558-254X DA 2024-03-13 UT WOS:001158081600002 PM 37594875 ER

AU Cui, Zhuo-Xu Cao, Chentao Wang, Yue Jia, Sen Cheng, Jing Liu, Xin Zheng, Hairong Liang, Dong Zhu, Yanjie

SPIRiT-Diffusion: Self-Consistency Driven Diffusion Model for Accelerated MRI.

Diffusion models have emerged as a leading methodology for image generation and have proven successful in the realm of magnetic resonance imaging (MRI) reconstruction. However, existing reconstruction methods based on diffusion models are primarily formulated in the image domain, making the reconstruction quality susceptible to inaccuracies in coil sensitivity maps (CSMs). k-space interpolation methods can effectively address this issue, but conventional diffusion models are not readily applicable to k-space interpolation. To overcome this challenge, we introduce a novel approach called SPIRiT-Diffusion, which is a diffusion model for k-space interpolation inspired by the iterative self-consistent SPIRiT method. Specifically, we utilize the iterative solver of the self-consistent term (i.e., the k-space physical prior) in SPIRiT to formulate a novel stochastic differential equation (SDE) governing the diffusion process. Subsequently, k-space data can be interpolated by executing the diffusion process. This innovative approach highlights the optimization model's role in designing the SDE in diffusion models, enabling the diffusion process to align closely with the physics inherent in the optimization model, a concept referred to as model-driven diffusion. We evaluated the proposed SPIRiT-Diffusion method using a 3D joint intracranial and carotid vessel wall imaging dataset. The results convincingly demonstrate its superiority over image-domain reconstruction methods, achieving high reconstruction quality even at a substantial acceleration rate of 10. Our code is available at https://github.com/zhyjSIAT/SPIRiT-Diffusion.

EI 1558-254X DA 2024-10-05 UT MEDLINE:39361455 PM 39361455 ER

AU Gong, Shizhan Long, Yonghao Chen, Kai Liu, Jiaqi Xiao, Yuliang Cheng, Alexis Wang, Zerui Dou, Qi

Self-Supervised Cyclic Diffeomorphic Mapping for Soft Tissue Deformation Recovery in Robotic Surgery Scenes.

The ability to recover tissue deformation from visual features is fundamental for many robotic surgery applications. This has been a long-standing research topic in computer vision, however, is still unsolved due to complex dynamics of soft tissues when being manipulated by surgical instruments. The ambiguous pixel correspondence caused by homogeneous texture makes achieving dense and accurate tissue tracking even more challenging. In this paper, we propose a novel self-supervised framework to recover tissue deformations from stereo surgical videos. Our approach integrates semantics, cross-frame motion flow, and long-range temporal dependencies to enable the recovered deformations to represent actual tissue dynamics. Moreover, we incorporate diffeomorphic mapping to regularize the warping field to be physically realistic. To comprehensively evaluate our method, we collected stereo surgical video clips containing three types of tissue manipulation (i.e., pushing, dissection and retraction) from two different types of surgeries (i.e., hemicolectomy and mesorectal excision). Our method has achieved impressive results in capturing deformation in 3D mesh, and generalized well across manipulations and surgeries. It also outperforms current state-of-the-art methods on non-rigid registration and optical flow estimation. To the best of our knowledge, this is the first work on self-supervised learning for dense tissue deformation modeling from stereo surgical videos. Our code will be released.

AU Rajagopal, Abhejit Westphalen, Antonio C. Velarde, Nathan Simko, Jeffry P. Nguyen, Hao Hope, Thomas A. Larson, Peder E. Z. Magudia, Kirti

Mixed Supervision of Histopathology Improves Prostate Cancer Classification From MRI

Non-invasive prostate cancer classification from MRI has the potential to revolutionize patient care by providing early detection of clinically significant disease, but has thus far shown limited positive predictive value. To address this, we present an image-based deep learning method to predict clinically significant prostate cancer from screening MRI in patients that subsequently underwent biopsy with results ranging from benign pathology to the highest grade tumors. Specifically, we demonstrate that mixed supervision via diverse histopathological ground truth improves classification performance despite the cost of reduced concordance with image-based segmentation. Where prior approaches have utilized pathology results as ground truth derived from targeted biopsies and whole-mount prostatectomy to strongly supervise the localization of clinically significant cancer, our approach also utilizes weak supervision signals extracted from nontargeted systematic biopsies with regional localization to improve overall performance. Our key innovation is performing regression by distribution rather than simply by value, enabling use of additional pathology findings traditionally ignored by deep learning strategies. We evaluated our model on a dataset of 973 (testing n=198) multi-parametric prostate MRI exams collected at UCSF from 2016-2019 followed by MRI/ultrasound fusion (targeted) biopsy and systematic (nontargeted) biopsy of the prostate gland, demonstrating that deep networks trained with mixed supervision of histopathology can feasibly exceed the performance of the Prostate Imaging-Reporting and Data System (PI-RADS) clinical standard for prostate MRI interpretation (71.6% vs 66.7% balanced accuracy and 0.724 vs 0.716 AUC).
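
One way to read 'regression by distribution' is to train against a histogram of pathology findings rather than a single value; in the sketch below the grades, core counts, and KL objective are illustrative assumptions:

```python
# Target is a distribution over grades pooled from several biopsy cores;
# the model's softmax output is matched to it with a KL divergence.
import torch
import torch.nn.functional as F

logits = torch.randn(1, 5, requires_grad=True)      # model output, 5 grades
core_grades = torch.tensor([1, 1, 2, 4])            # grades of 4 biopsy cores
target = torch.bincount(core_grades, minlength=5).float()
target /= target.sum()                              # empirical distribution

loss = F.kl_div(F.log_softmax(logits, dim=1),
                target.unsqueeze(0), reduction='batchmean')
loss.backward()
print("distribution-matching loss:", loss.item())
```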

AU Urban, Theresa Noichl, Wolfgang Engel, Klaus Juergen Koehler, Thomas Pfeiffer, Franz

Correction for X-Ray Scatter and Detector Crosstalk in Dark-Field Radiography

Dark-field radiography, a new X-ray imaging method, has recently been applied to human chest imaging for the first time. It employs conventional X-ray devices in combination with a Talbot-Lau interferometer with a large field of view, providing both attenuation and dark-field radiographs. It is well known that sample scatter creates artifacts in both modalities. Here, we demonstrate that X-ray scatter generated by the interferometer, as well as detector crosstalk, also creates artifacts in the dark-field radiographs, in addition to the expected loss of spatial resolution. We propose deconvolution-based correction methods for the induced artifacts. The kernel for detector crosstalk is measured and fitted to a model, while the kernel for scatter from the analyzer grating is calculated by a Monte-Carlo simulation. To correct for scatter from the sample, we adapt an algorithm used for scatter correction in conventional radiography. We validate the obtained corrections with a water phantom. Finally, we show the impact of detector crosstalk, scatter from the analyzer grating and scatter from the sample, as well as their successful correction, on dark-field images of a human thorax.
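Deconvolution-based corrections of this kind are often implemented as regularized division in Fourier space. The authors' kernels come from measurement (crosstalk) and Monte-Carlo simulation (grating scatter); the sketch below substitutes a stand-in Gaussian kernel and a Wiener-style regularizer purely for illustration:

import numpy as np

def fft_deconvolve(image, kernel, eps=1e-2):
    """Wiener-style deconvolution: divide in Fourier space with a small
    regularizer to avoid noise blow-up. kernel must sum to 1 and be
    centered; both arrays share the same 2-D shape."""
    K = np.fft.fft2(np.fft.ifftshift(kernel))
    I = np.fft.fft2(image)
    corrected = np.fft.ifft2(I * np.conj(K) / (np.abs(K) ** 2 + eps))
    return np.real(corrected)

# stand-in crosstalk kernel: narrow, normalized Gaussian
h, w = 256, 256
y, x = np.mgrid[-h // 2:h // 2, -w // 2:w // 2]
kernel = np.exp(-(x ** 2 + y ** 2) / (2 * 3.0 ** 2))
kernel /= kernel.sum()
measured = np.random.rand(h, w)          # placeholder measured radiograph
restored = fft_deconvolve(measured, kernel)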

AU Wang, Qi Wen, Zhijie Shi, Jun Wang, Qian Shen, Dinggang Ying, Shihui

Spatial and Modal Optimal Transport for Fast Cross-Modal MRI Reconstruction.

Multi-modal magnetic resonance imaging (MRI) plays a crucial role in comprehensive disease diagnosis in clinical medicine. However, acquiring certain modalities, such as T2-weighted images (T2WIs), is time-consuming and prone to motion artifacts, which negatively impacts subsequent multi-modal image analysis. To address this issue, we propose an end-to-end deep learning framework that utilizes T1-weighted images (T1WIs) as auxiliary modalities to expedite T2WI acquisition. While image pre-processing is capable of mitigating misalignment, improper parameter selection leads to adverse pre-processing effects, requiring iterative experimentation and adjustment. To overcome this shortcoming, we employ Optimal Transport (OT) to synthesize T2WIs by aligning T1WIs and performing cross-modal synthesis, effectively mitigating spatial misalignment effects. Furthermore, we adopt an alternating iteration framework between the reconstruction task and the cross-modal synthesis task to optimize the final results. Then, we prove that the reconstructed T2WIs and the synthetic T2WIs become closer on the T2 image manifold as the iterations increase, and further illustrate that the improved reconstruction result enhances the synthesis process, whereas the enhanced synthesis result improves the reconstruction process. Finally, experimental results from FastMRI and internal datasets confirm the effectiveness of our method, demonstrating significant improvements in image reconstruction quality even at low sampling rates.
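The OT alignment step is typically computed with entropic regularization. The paper's formulation is more elaborate, but a minimal Sinkhorn sketch over two sets of patch features conveys the core operation (the uniform marginals, cost choice, and barycentric mapping below are illustrative assumptions):

import torch

def sinkhorn(cost, reg=0.1, iters=100):
    """Entropic OT between two uniform point sets.
    cost: (n, m) pairwise cost matrix; returns the (n, m) transport plan."""
    n, m = cost.shape
    K = torch.exp(-cost / reg)
    a = torch.full((n,), 1.0 / n)       # uniform source marginal
    b = torch.full((m,), 1.0 / m)       # uniform target marginal
    v = torch.ones(m)
    for _ in range(iters):
        u = a / (K @ v)
        v = b / (K.t() @ u)
    return u[:, None] * K * v[None, :]  # diag(u) K diag(v)

# toy: align T1 patch features to T2 patch features (random placeholders)
f_t1 = torch.randn(64, 16)
f_t2 = torch.randn(64, 16)
cost = torch.cdist(f_t1, f_t2) ** 2
cost = cost / cost.max()                # normalize so exp(-cost/reg) stays well-conditioned
plan = sinkhorn(cost)
aligned_t1 = (plan / plan.sum(dim=1, keepdim=True)) @ f_t2   # barycentric map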

AU Zhu, Hui Zeng, Yi Cai, Xiran

Passive Acoustic Mapping for Convex Arrays With the Helical Wave Spectrum Method

Passive acoustic mapping (PAM) has emerged as a valuable imaging modality for monitoring the cavitation activity in focused ultrasound therapies. When it comes to imaging in the human abdomen, convex arrays are preferred due to their large acoustic window. However, existing PAM methods for convex arrays rely on the computationally expensive delay-and-sum (DAS) operation limiting the image reconstruction speed when the field-of-view (FOV) is large. In this work, we propose an efficient and frequency-selective PAM method for convex arrays. This method is based on projecting the helical wave spectrum (HWS) between cylindrical surfaces in the imaging field. Both the in silico and in vitro experiments showed that the HWS method has comparable image quality and similar acoustic cavitation source localization accuracy as the DAS-based methods. Compared to the frequency-domain and time-domain DAS methods, the time-complexity of the HWS method is reduced by one order and two orders of magnitude, respectively. A parallel implementation of the HWS method realized millisecond-level image reconstruction speed. We also show that the HWS method is inherently capable of mapping microbubble (MB) cavitation activity of different status, i.e., no cavitation, stable cavitation, or inertial cavitation. After compensating for the lens effects of the convex array, we further combined PAM formed by the HWS method and B-mode imaging as a real-time dual-mode imaging approach to map the anatomical location where MBs cavitate in a liver phantom experiment. This method may find use in applications where convex arrays are required for cavitation activity monitoring in real time.

C1 ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai 201210, Peoples R China C1 Chinese Acad Sci, Shanghai Adv Res Inst, Shanghai 201210, Peoples R China C1 Univ Chinese Acad Sci, Beijing 100049, Peoples R China C1 ShanghaiTech Univ, Shanghai Engn Res Ctr Intelligent Vis & Imaging, Shanghai 201210, Peoples R China SN 0278-0062 EI 1558-254X DA 2024-05-23 UT WOS:001214547800026 PM 38198274 ER

AU Feng, Ruimin Wu, Qing Feng, Jie She, Huajun Liu, Chunlei Zhang, Yuyao Wei, Hongjiang

IMJENSE: Scan-Specific Implicit Representation for Joint Coil Sensitivity and Image Estimation in Parallel MRI

Parallel imaging is a commonly used technique to accelerate magnetic resonance imaging (MRI) data acquisition. Mathematically, parallel MRI reconstruction can be formulated as an inverse problem relating the sparsely sampled k-space measurements to the desired MRI image. Despite the success of many existing reconstruction algorithms, it remains a challenge to reliably reconstruct a high-quality image from highly reduced k-space measurements. Recently, implicit neural representation has emerged as a powerful paradigm to exploit the internal information and the physics of partially acquired data to generate the desired object. In this study, we introduced IMJENSE, a scan-specific implicit neural representation-based method for improving parallel MRI reconstruction. Specifically, the underlying MRI image and coil sensitivities were modeled as continuous functions of spatial coordinates, parameterized by neural networks and polynomials, respectively. The weights in the networks and coefficients in the polynomials were simultaneously learned directly from sparsely acquired k-space measurements, without fully sampled ground truth data for training. Benefiting from the powerful continuous representation and joint estimation of the MRI image and coil sensitivities, IMJENSE outperforms conventional image or k-space domain reconstruction algorithms. With extremely limited calibration data, IMJENSE is more stable than supervised calibrationless and calibration-based deep-learning methods. Results show that IMJENSE robustly reconstructs the images acquired at 5x and 6x accelerations with only 4 or 8 calibration lines in 2D Cartesian acquisitions, corresponding to 22.0% and 19.5% undersampling rates. The high-quality results and scanning specificity make the proposed method hold the potential for further accelerating the data acquisition of parallel MRI.
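A scan-specific setup of this kind can be sketched compactly: an MLP maps spatial coordinates to a complex image value, a low-order polynomial per coil models sensitivity, and the loss compares predicted k-space to the sampled measurements. The sketch below is a heavily simplified illustration (random placeholder data, no positional encoding, tiny network), not the released IMJENSE code:

import torch
import torch.nn as nn

class ImageINR(nn.Module):
    """MLP mapping (x, y) in [-1, 1]^2 to a complex image value."""
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2))                 # real + imaginary parts
    def forward(self, xy):
        out = self.net(xy)
        return torch.complex(out[..., 0], out[..., 1])

def coil_sensitivity(xy, coeffs):
    """Low-order complex polynomial per coil: sum_{i+j<=2} c_ij x^i y^j.
    coeffs: (n_coils, 6) complex parameters."""
    x, y = xy[..., 0], xy[..., 1]
    basis = torch.stack([torch.ones_like(x), x, y, x * x, x * y, y * y], -1)
    return basis.to(coeffs.dtype) @ coeffs.t()    # (..., n_coils)

H = W = 64
ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                        torch.linspace(-1, 1, W), indexing="ij")
xy = torch.stack((xs, ys), dim=-1).reshape(-1, 2)
model = ImageINR()
coeffs = nn.Parameter(torch.randn(8, 6, 2) * 0.01)    # real view of complex coeffs
opt = torch.optim.Adam(list(model.parameters()) + [coeffs], lr=1e-3)

mask = torch.rand(H, W) < 0.25                         # hypothetical sampling mask
kspace = torch.randn(8, H, W, dtype=torch.complex64)   # placeholder measured data

for _ in range(10):                                    # a few illustrative steps
    img = model(xy).reshape(H, W)
    sens = coil_sensitivity(xy, torch.view_as_complex(coeffs))
    sens = sens.reshape(H, W, 8).permute(2, 0, 1)      # (n_coils, H, W)
    pred_k = torch.fft.fft2(sens * img)
    loss = (pred_k - kspace)[:, mask].abs().pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()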

AU Li, Heng Lin, Ziqin Qiu, Zhongxi Li, Zinan Niu, Ke Guo, Na Fu, Huazhu Hu, Yan Liu, Jiang

Enhancing and Adapting in the Clinic: Source-Free Unsupervised Domain Adaptation for Medical Image Enhancement

Medical imaging provides many valuable clues involving anatomical structure and pathological characteristics. However, image degradation is a common issue in clinical practice, which can adversely impact the observation and diagnosis by physicians and algorithms. Although extensive enhancement models have been developed, these models require thorough pre-training before deployment and fail to take advantage of the potential value of inference data after deployment. In this paper, we propose an algorithm for source-free unsupervised domain-adaptive medical image enhancement (SAME), which adapts and optimizes enhancement models using test data in the inference phase. A structure-preserving enhancement network is first constructed to learn a robust source model from synthesized training data. Then a teacher-student model is initialized with the source model and conducts source-free unsupervised domain adaptation (SFUDA) by knowledge distillation with the test data. Additionally, a pseudo-label picker is developed to boost the knowledge distillation of enhancement tasks. Experiments were implemented on ten datasets from three medical image modalities to validate the advantage of the proposed algorithm, and setting analysis and ablation studies were also carried out to interpret the effectiveness of SAME. The remarkable enhancement performance and benefits for downstream tasks demonstrate the potential and generalizability of SAME. The code is available at https://github.com/liamheng/Annotation-free-Medical-Image-Enhancement.
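The teacher-student SFUDA loop described above can be sketched as follows. The EMA teacher update is standard; the confidence-based picker shown here (agreement between two augmented views) is a toy stand-in for the paper's pseudo-label picker, and all names are illustrative:

import copy
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher, student, momentum=0.999):
    """Teacher weights follow an exponential moving average of the student."""
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.mul_(momentum).add_(ps, alpha=1.0 - momentum)

def adapt_step(student, teacher, test_batch, opt, conf_thresh=0.1):
    """One SFUDA step: distill teacher outputs into the student on test data,
    keeping only pixels where the teacher looks reliable (toy picker: the
    teacher's outputs for two flipped views agree)."""
    with torch.no_grad():
        pseudo = teacher(test_batch)
        pseudo_flip = torch.flip(teacher(torch.flip(test_batch, [-1])), [-1])
        keep = (pseudo - pseudo_flip).abs().mean(1, keepdim=True) < conf_thresh
    pred = student(test_batch)
    loss = (F.l1_loss(pred, pseudo, reduction="none") * keep).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    ema_update(teacher, student)
    return loss.item()

# teacher initialized as a frozen copy of the pre-trained source model
student = torch.nn.Sequential(torch.nn.Conv2d(3, 3, 3, padding=1))
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)
opt = torch.optim.Adam(student.parameters(), lr=1e-4)
adapt_step(student, teacher, torch.rand(2, 3, 64, 64), opt)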

AU Murali, Aditya Alapatt, Deepak Mascagni, Pietro Vardazaryan, Armine Garcia, Alain Okamoto, Nariaki Mutter, Didier Padoy, Nicolas

Latent Graph Representations for Critical View of Safety Assessment

Assessing the critical view of safety (CVS) in laparoscopic cholecystectomy requires accurate identification and localization of key anatomical structures, reasoning about their geometric relationships to one another, and determining the quality of their exposure. Prior works have approached this task by including semantic segmentation as an intermediate step, using predicted segmentation masks to then predict the CVS. While these methods are effective, they rely on extremely expensive ground-truth segmentation annotations and tend to fail when the predicted segmentation is incorrect, limiting generalization. In this work, we propose a method for CVS prediction wherein we first represent a surgical image using a disentangled latent scene graph, then process this representation using a graph neural network. Our graph representations explicitly encode semantic information - object location, class information, geometric relations - to improve anatomy-driven reasoning, as well as visual features to retain differentiability and thereby provide robustness to semantic errors. Finally, to address annotation cost, we propose to train our method using only bounding box annotations, incorporating an auxiliary image reconstruction objective to learn fine-grained object boundaries. We show that our method not only outperforms several baseline methods when trained with bounding box annotations, but also scales effectively when trained with segmentation masks, maintaining state-of-the-art performance.

AU Zhang, Yuanming Li, Zheng Han, Xiangmin Ding, Saisai Li, Juncheng Wang, Jun Ying, Shihui Shi, Jun

Pseudo-Data Based Self-Supervised Federated Learning for Classification of Histopathological Images

Computer-aided diagnosis (CAD) can help pathologists improve diagnostic accuracy together with consistency and repeatability for cancers. However, CAD models trained with histopathological images from only a single center (hospital) generally suffer from a generalization problem due to the staining inconsistencies among different centers. In this work, we propose a pseudo-data based self-supervised federated learning (FL) framework, named SSL-FL-BT, to improve both the diagnostic accuracy and generalization of CAD models. Specifically, pseudo histopathological images are generated from each center, which contain both inherent and specific properties corresponding to the real images in that center, but do not include the privacy information. These pseudo images are then shared on the central server for self-supervised learning (SSL) to pre-train the backbone of the global model. A multi-task SSL is then designed to effectively learn both the center-specific information and the common inherent representation according to the data characteristics. Moreover, a novel Barlow Twins based FL (FL-BT) algorithm is proposed to improve the local training for the CAD models in each center by conducting model contrastive learning, which benefits the optimization of the global model in the FL procedure. The experimental results on four public histopathological image datasets indicate the effectiveness of the proposed SSL-FL-BT on both diagnostic accuracy and generalization.
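The Barlow Twins objective referenced by FL-BT pushes the cross-correlation matrix of two embeddings of the same images toward the identity; a minimal sketch of the standard published loss (not the paper's federated variant):

import torch

def barlow_twins_loss(z1, z2, lam=5e-3):
    """Barlow Twins: decorrelate embedding dimensions while making the two
    views of each image agree. z1, z2: (N, D) projector outputs."""
    N, D = z1.shape
    z1 = (z1 - z1.mean(0)) / (z1.std(0) + 1e-6)
    z2 = (z2 - z2.mean(0)) / (z2.std(0) + 1e-6)
    c = (z1.t() @ z2) / N                      # (D, D) cross-correlation
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()
    return on_diag + lam * off_diag

z1, z2 = torch.randn(32, 128), torch.randn(32, 128)
print(barlow_twins_loss(z1, z2))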

AU Xu, Kai Lu, Shiyu Huang, Bin Wu, Weiwen Liu, Qiegen

Stage-by-stage Wavelet Optimization Refinement Diffusion Model for Sparse-View CT Reconstruction.

Diffusion models have emerged as a potential tool to tackle the challenge of sparse-view CT reconstruction, displaying superior performance compared to conventional methods. Nevertheless, these prevailing diffusion models predominantly focus on the sinogram or image domains, which can lead to instability during model training, potentially culminating in convergence towards local minimal solutions. The wavelet transform serves to disentangle image contents and features into distinct frequency-component bands at varying scales, adeptly capturing diverse directional structures. Employing the wavelet transform as a guiding sparsity prior significantly enhances the robustness of diffusion models. In this study, we present an innovative approach named the Stage-by-stage Wavelet Optimization Refinement Diffusion (SWORD) model for sparse-view CT reconstruction. Specifically, we establish a unified mathematical model integrating low-frequency and high-frequency generative models, achieving the solution with an optimization procedure. Furthermore, we perform the low-frequency and high-frequency generative models on the wavelet's decomposed components rather than the original sinogram, ensuring the stability of model training. Our method is rooted in established optimization theory and comprises three distinct stages: low-frequency generation, high-frequency refinement and domain transform. The experimental results demonstrated that the proposed method outperformed existing state-of-the-art methods both quantitatively and qualitatively.
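The stage-wise decomposition the model operates on can be illustrated with a one-level 2-D wavelet transform; the bands below correspond to the low-frequency and high-frequency components that the two generative models handle (a PyWavelets sketch with placeholder data; the 'db4' wavelet choice is an assumption):

import numpy as np
import pywt

sinogram = np.random.rand(360, 512)            # placeholder sparse-view sinogram
LL, (LH, HL, HH) = pywt.dwt2(sinogram, "db4")  # one-level 2-D wavelet split

# stage 1 would generate/refine the low-frequency band (LL);
# stage 2 the three high-frequency bands (LH, HL, HH);
# stage 3 transforms back to the sinogram/image domain:
recon = pywt.idwt2((LL, (LH, HL, HH)), "db4")  # restores the sinogram shape here
assert recon.shape == sinogram.shape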

EI 1558-254X DA 2024-01-20 UT MEDLINE:38236666 PM 38236666 ER

AU Acciavatti, Raymond J. Choi, Chloe J. Vent, Trevor L. Barufaldi, Bruno Cohen, Eric A. Wileyto, E. Paul Maidment, Andrew D. A.

Non-Isocentric Geometry for Next-Generation Tomosynthesis With Super-Resolution

Our lab at the University of Pennsylvania (UPenn) is investigating novel designs for digital breast tomosynthesis. We built a next-generation tomosynthesis system with a non-isocentric geometry (superior-to-inferior detector motion). This paper examines four metrics of image quality affected by this design. First, aliasing was analyzed in reconstructions prepared with smaller pixelation than the detector. Aliasing was assessed with a theoretical model of r-factor, a metric calculating amplitudes of alias signal relative to input signal in the Fourier transform of the reconstruction of a sinusoidal object. Aliasing was also assessed experimentally with a bar pattern (illustrating spatial variations in aliasing) and a 360° star pattern (illustrating directional anisotropies in aliasing). Second, the point spread function (PSF) was modeled in the direction perpendicular to the detector to assess out-of-plane blurring. Third, power spectra were analyzed in an anthropomorphic phantom developed by UPenn and manufactured by Computerized Imaging Reference Systems (CIRS), Inc. (Norfolk, VA). Finally, calcifications were analyzed in the CIRS Model 020 BR3D Breast Imaging Phantom in terms of signal-to-noise ratio (SNR); i.e., mean calcification signal relative to background-tissue noise. Image quality was generally superior in the non-isocentric geometry: Aliasing artifacts were suppressed in both theoretical and experimental reconstructions prepared with smaller pixelation than the detector. PSF width was also reduced at most positions. Anatomic noise was reduced. Finally, SNR in calcification detection was improved. (A potential trade-off of smaller-pixel reconstructions was reduced SNR; however, SNR was still improved by the detector-motion acquisition.) In conclusion, the non-isocentric geometry improved image quality in several ways.

AU Ban, Yutong Eckhoff, Jennifer A. Ward, Thomas M. Hashimoto, Daniel A. Meireles, Ozanan R. Rus, Daniela Rosman, Guy

Concept Graph Neural Networks for Surgical Video Understanding

Analysis of relations between objects and comprehension of abstract concepts in the surgical video is important in AI-augmented surgery. However, building models that integrate our knowledge and understanding of surgery remains a challenging endeavor. In this paper, we propose a novel way to integrate conceptual knowledge into temporal analysis tasks using temporal concept graph networks. In the proposed networks, a knowledge graph is incorporated into the temporal video analysis of surgical notions, learning the meaning of concepts and relations as they apply to the data. We demonstrate results in surgical video data for tasks such as verification of the critical view of safety, estimation of the Parkland grading scale as well as recognizing instrument-action-tissue triplets. The results show that our method improves the recognition and detection of complex benchmarks as well as enables other analytic applications of interest.

AU Yang, Yanwu Ye, Chenfei Guo, Xutao Wu, Tao Xiang, Yang Ma, Ting

Mapping Multi-Modal Brain Connectome for Brain Disorder Diagnosis via Cross-Modal Mutual Learning

Recently, the study of the multi-modal brain connectome has increased tremendously and has facilitated the diagnosis of brain disorders. In this paradigm, functional and structural networks, e.g., functional and structural connectivity derived from fMRI and DTI, interact in some manner but are not necessarily linearly related. Accordingly, it remains a great challenge to leverage complementary information for brain connectome analysis. Recently, graph neural networks (GNNs) have been widely applied to the fusion of the multi-modal brain connectome. However, most existing GNN methods fail to couple inter-modal relationships. In this regard, we propose a Cross-modal Graph Neural Network (Cross-GNN) that captures inter-modal dependencies through dynamic graph learning and mutual learning. Specifically, the inter-modal representations are attentively coupled into a compositional space for reasoning about inter-modal dependencies. Additionally, we investigate mutual learning in explicit and implicit ways: (1) Cross-modal representations are obtained by explicit cross-embedding based on the inter-modal correspondence matrix. (2) We propose a cross-modal distillation method to implicitly regularize latent representations with cross-modal semantic contexts. We carry out statistical analysis on the attentively learned correspondence matrices to evaluate inter-modal relationships for associating disease biomarkers. Our extensive experiments on three datasets demonstrate the superiority of our proposed method for disease diagnosis, with promising prediction performance and multi-modal connectome biomarker localization.

AU Yue, Zheng Jiang, Jiayao Hou, Wenguang Zhou, Quan David Spence, J Fenster, Aaron Qiu, Wu Ding, Mingyue

Prior-knowledge Embedded U-Net based Fully Automatic Vessel Wall Volume Measurement of the Carotid Artery in 3D Ultrasound Image.

The vessel-wall-volume (VWV) measured from three-dimensional (3D) carotid artery (CA) ultrasound (US) images can help to assess carotid atherosclerosis and manage patients at risk of stroke. Manual measurement is subjective and requires well-trained operators, and fully automatic measurement tools are not yet available. Therefore, we propose a fully automatic VWV measurement framework (Auto-VWV) using a CA prior-knowledge embedded U-Net (CAP-UNet) to measure the VWV from 3D CA US images without manual intervention. The Auto-VWV framework is designed to improve the consistency of repeated VWV measurements, resulting in the first fully automatic framework for VWV measurement. CAP-UNet is developed to improve segmentation accuracy on the whole CA; it is composed of a U-Net type backbone and three additional prior-knowledge learning modules. Specifically, a continuity learning module is used to learn the spatial continuity of the arteries in a sequence of image slices. A voxel evolution learning module was designed to learn the evolution of the artery in adjacent slices, and a topology learning module was used to learn the unique topology of the carotid artery. On two 3D CA US datasets, the CAP-UNet architecture achieved state-of-the-art performance compared to eight competing models. Furthermore, CAP-UNet-based Auto-VWV achieved better accuracy and consistency than Auto-VWV based on the competing models in simulated repeated measurements. Finally, using 10 pairs of real repeatedly scanned samples, Auto-VWV achieved better VWV measurement reproducibility than intra- and inter-operator manual measurements.

AU Yang, Xiaoyu Xu, Lijian Yu, Simon Xia, Qing Li, Hongsheng Zhang, Shaoting

Segmentation and Vascular Vectorization for Coronary Artery by Geometry-based Cascaded Neural Network.

Segmentation of the coronary artery is an important task for the quantitative analysis of coronary computed tomography angiography (CCTA) images and has been greatly stimulated by the field of deep learning. However, the complex structure of the coronary artery, with its tiny and narrow branches, poses a great challenge. Coupled with the low resolution and poor contrast typical of medical images, fragmentation of segmented vessels frequently occurs in predictions. Therefore, a geometry-based cascaded segmentation method is proposed for the coronary artery, with the following innovations: 1) Integrating geometric deformation networks, we design a cascaded network for segmenting the coronary artery and vectorizing the results. The generated meshes of the coronary artery are continuous and accurate for twisted and sophisticated coronary artery structures, without fragmentation. 2) Different from mesh annotations generated by the traditional marching cubes method from voxel-based labels, a finer vectorized mesh of the coronary artery is reconstructed with regularized morphology. The novel mesh annotation benefits the geometry-based segmentation network, avoiding bifurcation adhesion and point cloud dispersion in intricate branches. 3) A dataset named CCA-200 is collected, consisting of 200 CCTA images with coronary artery disease. The ground truth for the 200 cases consists of coronary internal-diameter annotations by professional radiologists. Extensive experiments verify our method on our collected CCA-200 dataset and the public ASOCA dataset, with a Dice of 0.778 on CCA-200 and 0.895 on ASOCA, showing superior results. In particular, our geometry-based model generates an accurate, intact and smooth coronary artery, devoid of any fragmentation of segmented vessels.

AU Li, Zirong Chang, Dingyue Zhang, Zhenxi Luo, Fulin Liu, Qiegen Zhang, Jianjia Yang, Guang Wu, Weiwen

Dual-domain Collaborative Diffusion Sampling for Multi-Source Stationary Computed Tomography Reconstruction.

The multi-source stationary CT, in which both the detector and X-ray sources are fixed, represents a novel imaging system with high temporal resolution that has garnered significant interest. Limited space within the system restricts the number of X-ray sources, leading to sparse-view CT imaging challenges. Recent diffusion models for reconstructing sparse-view CT have generally focused separately on the sinogram or image domains. Sinogram-centric models effectively estimate missing projections but may introduce artifacts, lacking mechanisms to ensure image correctness. Conversely, image-domain models, while capturing detailed image features, often struggle with complex data distributions, leading to inaccuracies in projections. Addressing these issues, the Dual-domain Collaborative Diffusion Sampling (DCDS) model integrates sinogram and image domain diffusion processes for enhanced sparse-view reconstruction. This model combines the strengths of both domains in an optimized mathematical framework. A collaborative diffusion mechanism underpins this model, improving sinogram recovery and image generative capabilities. This mechanism facilitates feedback-driven image generation from the sinogram domain and uses image domain results to complete missing projections. Optimization of the DCDS model is further achieved through the alternating direction iteration method, focusing on data consistency updates. Extensive testing, including numerical simulations, real phantoms, and clinical cardiac datasets, demonstrates the DCDS model's effectiveness. It consistently outperforms various state-of-the-art benchmarks, delivering exceptional reconstruction quality and precise sinograms.

AU Chen, Wenting Liu, Jie Chow, Tommy W S Yuan, Yixuan

STAR-RL: Spatial-temporal Hierarchical Reinforcement Learning for Interpretable Pathology Image Super-Resolution.

Pathology images are essential for accurately interpreting lesion cells in cytopathology screening, but acquiring high-resolution digital slides requires specialized equipment and long scanning times. Though super-resolution (SR) techniques can alleviate this problem, existing deep learning models recover pathology images in a black-box manner, which can lead to untruthful biological details and misdiagnosis. Additionally, current methods allocate the same computational resources to recover each pixel of a pathology image, leading to sub-optimal recovery due to the large variation across pathology images. In this paper, we propose the first hierarchical reinforcement learning framework, named Spatial-Temporal hierARchical Reinforcement Learning (STAR-RL), mainly to address the aforementioned issues in the pathology image super-resolution problem. We reformulate the SR problem as a Markov decision process of interpretable operations and adopt a hierarchical recovery mechanism at the patch level to avoid sub-optimal recovery. Specifically, a higher-level spatial manager is proposed to pick out the most corrupted patch for the lower-level patch worker. Moreover, a higher-level temporal manager is introduced to evaluate the selected patch and determine whether the optimization should be stopped earlier, thereby avoiding the over-processing problem. Under the guidance of the spatial-temporal managers, the lower-level patch worker processes the selected patch with pixel-wise interpretable actions at each time step. Experimental results on medical images degraded by different kernels show the effectiveness of STAR-RL. Furthermore, STAR-RL improves tumor diagnosis by a large margin and generalizes under various degradations. The source code is to be released.

AU Chen, Zhihao Niu, Chuang Gao, Qi Wang, Ge Shan, Hongming

LIT-Former: Linking In-Plane and Through-Plane Transformers for Simultaneous CT Image Denoising and Deblurring

This paper studies 3D low-dose computed tomography (CT) imaging. Although various deep learning methods have been developed in this context, they typically focus on 2D images and perform denoising due to low dose and deblurring for super-resolution separately. To date, little work has been done on simultaneous in-plane denoising and through-plane deblurring, which is important for obtaining high-quality 3D CT images with lower radiation and faster imaging speed. For this task, a straightforward method is to directly train an end-to-end 3D network. However, it demands much more training data and incurs expensive computational costs. Here, we propose to link in-plane and through-plane transformers for simultaneous in-plane denoising and through-plane deblurring, termed LIT-Former, which can efficiently synergize in-plane and through-plane sub-tasks for 3D CT imaging and enjoys the advantages of both convolution and transformer networks. LIT-Former has two novel designs: efficient multi-head self-attention modules (eMSM) and efficient convolutional feed-forward networks (eCFN). First, eMSM integrates in-plane 2D self-attention and through-plane 1D self-attention to efficiently capture global interactions of 3D self-attention, the core unit of transformer networks. Second, eCFN integrates 2D convolution and 1D convolution to extract local information of 3D convolution in the same fashion. As a result, the proposed LIT-Former synergizes these two sub-tasks, significantly reducing the computational complexity as compared to 3D counterparts and enabling rapid convergence. Extensive experimental results on simulated and clinical datasets demonstrate superior performance over state-of-the-art models. The source code is made available at https://github.com/hao1635/LIT-Former.
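The eMSM idea of factorizing 3D attention into in-plane 2D attention plus through-plane 1D attention can be sketched as below; this is a generic factorized-attention module in that spirit, not the authors' exact eMSM:

import torch
import torch.nn as nn

class FactorizedVolumeAttention(nn.Module):
    """Factorize 3-D self-attention into in-plane attention over each slice
    and through-plane attention along the slice axis (illustrative module)."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.inplane = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.through = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                       # x: (B, C, D, H, W)
        B, C, D, H, W = x.shape
        # in-plane: each of the B*D slices attends over its H*W positions
        t = x.permute(0, 2, 3, 4, 1).reshape(B * D, H * W, C)
        t = self.inplane(t, t, t, need_weights=False)[0]
        x = x + t.reshape(B, D, H, W, C).permute(0, 4, 1, 2, 3)
        # through-plane: each of the B*H*W columns attends over its D slices
        t = x.permute(0, 3, 4, 2, 1).reshape(B * H * W, D, C)
        t = self.through(t, t, t, need_weights=False)[0]
        return x + t.reshape(B, H, W, D, C).permute(0, 4, 3, 1, 2)

vol = torch.randn(1, 32, 8, 16, 16)
out = FactorizedVolumeAttention(32)(vol)
assert out.shape == vol.shape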

AU Liu, Pan Huang, Gao Jing, Jing Bian, Suyan Cheng, Liuquan Lu, Xin Yang Rao, Chongyou Liu, Yu Hua, Yun Wang, Yongjun He, Kunlun

An Energy Matching Vessel Segmentation Framework in 3-D Medical Images

Accurate vascular segmentation from High-Resolution 3-Dimensional (HR3D) medical scans is crucial for clinicians to visualize complex vasculature and diagnose related vascular diseases. However, a reliable and scalable vessel segmentation framework for HR3D scans remains a challenge. In this work, we propose a High-resolution Energy-matching Segmentation (HrEmS) framework that utilizes deep learning to directly process the entire HR3D scan and segment the vasculature to the finest level. The HrEmS framework introduces two novel components. First, it uses the real-order total variation operator to construct a new loss function that guides the segmentation network to obtain the correct topological structure by matching the energy of the predicted segment to the energy of the manual label. This is different from traditional loss functions such as Dice loss, which match the pixels between the predicted segment and the manual label. Second, a curvature-based weight-correction module is developed, which directs the network to focus on the crucial and complex structural parts of the vasculature instead of the easy parts. The proposed HrEmS framework was tested on three in-house multi-center datasets and three public datasets, and demonstrated improved results in comparison with the state-of-the-art methods using both topology-relevant and volumetric-relevant metrics. Furthermore, a double-blind assessment by three experienced radiologists on the critical points of the clinical diagnostic processes provided additional evidence of the superiority of the HrEmS framework.
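The energy-matching idea can be illustrated with an integer-order total variation as the energy (the paper's real-order, i.e. fractional, TV operator is more involved); a minimal sketch:

import torch

def tv_energy(x):
    """Anisotropic total variation of a (B, 1, H, W) probability map: the
    'energy' whose value is matched between prediction and label."""
    dh = (x[..., 1:, :] - x[..., :-1, :]).abs().sum(dim=(1, 2, 3))
    dw = (x[..., :, 1:] - x[..., :, :-1]).abs().sum(dim=(1, 2, 3))
    return dh + dw

def energy_matching_loss(pred, label):
    """Match the TV energy of the predicted segmentation to that of the
    manual label (a first-order stand-in for the real-order TV)."""
    return (tv_energy(pred) - tv_energy(label)).abs().mean()

logits = torch.randn(2, 1, 64, 64, requires_grad=True)   # network output
label = (torch.rand(2, 1, 64, 64) > 0.5).float()
loss = energy_matching_loss(torch.sigmoid(logits), label)
loss.backward()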

AU Miao, Juzheng Zhou, Si-Ping Zhou, Guang-Quan Wang, Kai-Ni Yang, Meng Zhou, Shoujun Chen, Yang

SC-SSL: Self-Correcting Collaborative and Contrastive Co-Training Model for Semi-Supervised Medical Image Segmentation

Image segmentation achieves significant improvements with deep neural networks on the premise of large-scale labeled training data, which is laborious to obtain in medical image tasks. Recently, semi-supervised learning (SSL) has shown great potential in medical image segmentation. However, the influence of the learning target quality for unlabeled data is usually neglected in these SSL methods. Therefore, this study proposes a novel self-correcting co-training scheme to learn a better target that is more similar to ground-truth labels from collaborative network outputs. Our work has three highlights. First, we advance the learning target generation as a learning task, improving the learning confidence for unannotated data with a self-correcting module. Second, we impose a structure constraint to further encourage the shape similarity between the improved learning target and the collaborative network outputs. Finally, we propose an innovative pixel-wise contrastive learning loss to boost the representation capacity under the guidance of an improved learning target, thus exploring unlabeled data more efficiently with awareness of the semantic context. We have extensively evaluated our method against state-of-the-art semi-supervised approaches on four publicly available datasets, including the ACDC dataset, M&Ms dataset, Pancreas-CT dataset, and Task_07 CT dataset. The experimental results with different labeled-data ratios show our proposed method's superiority over other existing methods, demonstrating its effectiveness in semi-supervised medical image segmentation.

AU Liu, Jicheng Liu, Hui Fu, Huazhu Ye, Yu Chen, Kun Lu, Yu Mao, Jianbo Xu, Ronald X. Sun, Mingzhai

Edge-Guided Contrastive Adaptation Network for Arteriovenous Nicking Classification Using Synthetic Data

Retinal arteriovenous nicking (AVN) manifests as a reduced venular caliber at an arteriovenous crossing. AVNs are signs of many systemic, particularly cardiovascular, diseases. Studies have shown that people with AVN are twice as likely to have a stroke. However, AVN classification faces two challenges. One is the lack of data, especially of AVNs compared to normal arteriovenous (AV) crossings. The other is the significant intra-class variations and minute inter-class differences. AVNs may look different in shape, scale, pose, and color. On the other hand, an AVN may differ from a normal AV crossing only by a slight thinning of the vein. To address these challenges, first, we develop a data synthesis method to generate AV crossings, including normal ones and AVNs. Second, to mitigate the domain shift between the synthetic and real data, an edge-guided unsupervised domain adaptation network is designed to guide the transfer of domain-invariant information. Third, a semantic contrastive learning branch (SCLB) is introduced, and a set of semantically related images, as a semantic triplet, is input to the network simultaneously to guide the network to focus on the subtle differences in venular width and to ignore differences in appearance. These strategies effectively mitigate the lack of data, the domain shift between synthetic and real data, and the significant intra- but minute inter-class differences. Extensive experiments demonstrate the outstanding performance of the proposed method.

AU Spieker, Veronika Eichhorn, Hannah Hammernik, Kerstin Rueckert, Daniel Preibisch, Christine Karampinos, Dimitrios C. Schnabel, Julia A.

Deep Learning for Retrospective Motion Correction in MRI: A Comprehensive Review

Motion represents one of the major challenges in magnetic resonance imaging (MRI). Since the MR signal is acquired in frequency space, any motion of the imaged object leads to complex artefacts in the reconstructed image in addition to other MR imaging artefacts. Deep learning has been frequently proposed for motion correction at several stages of the reconstruction process. The wide range of MR acquisition sequences, anatomies and pathologies of interest, and motion patterns (rigid vs. deformable and random vs. regular) makes a comprehensive solution unlikely. To facilitate the transfer of ideas between different applications, this review provides a detailed overview of proposed methods for learning-based motion correction in MRI together with their common challenges and potentials. This review identifies differences and synergies in underlying data usage, architectures, training and evaluation strategies. We critically discuss general trends and outline future directions, with the aim to enhance interaction between different application areas and research fields.

AU Cui, Zhuo-Xu Liu, Congcong Fan, Xiaohong Cao, Chentao Cheng, Jing Zhu, Qingyong Liu, Yuanyuan Jia, Sen Wang, Haifeng Zhu, Yanjie Zhou, Yihang Zhang, Jianping Liu, Qiegen Liang, Dong

Physics-Informed DeepMRI: k-Space Interpolation Meets Heat Diffusion.

Recently, diffusion models have shown considerable promise for MRI reconstruction. However, extensive experimentation has revealed that these models are prone to generating artifacts due to the inherent randomness involved in generating images from pure noise. To achieve more controlled image reconstruction, we reexamine the concept of interpolatable physical priors in k-space data, focusing specifically on the interpolation of high-frequency (HF) k-space data from low-frequency (LF) k-space data. Broadly, this insight drives a shift in the generation paradigm from random noise to a more deterministic approach grounded in the existing LF k-space data. Building on this, we first establish a relationship between the interpolation of HF k-space data from LF k-space data and the reverse heat diffusion process, providing a fundamental framework for designing diffusion models that generate missing HF data. To further improve reconstruction accuracy, we integrate a traditional physics-informed k-space interpolation model into our diffusion framework as a data fidelity term. Experimental validation using publicly available datasets demonstrates that our approach significantly surpasses traditional k-space interpolation methods, deep learning-based k-space interpolation techniques, and conventional diffusion models, particularly in HF regions. Finally, we assess the generalization performance of our model across various out-of-distribution datasets. Our code is available at https://github.com/ZhuoxuCui/Heat-Diffusion.
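The forward heat-diffusion view is easy to make concrete: diffusing an image for time t multiplies its k-space by a Gaussian factor, so high frequencies decay first and the LF core survives. A NumPy sketch of this forward map (the generative model is trained to reverse it):

import numpy as np

def heat_diffuse_kspace(kspace, t):
    """Forward heat diffusion acts on k-space as multiplication by
    exp(-t * |2*pi*k|^2): high frequencies decay first, so running it
    forward turns full k-space into (approximately) its LF core."""
    H, W = kspace.shape
    ky = np.fft.fftfreq(H)[:, None]
    kx = np.fft.fftfreq(W)[None, :]
    damp = np.exp(-t * (2 * np.pi) ** 2 * (kx ** 2 + ky ** 2))
    return kspace * damp

img = np.random.rand(128, 128)                 # placeholder image
k_full = np.fft.fft2(img)
k_low = heat_diffuse_kspace(k_full, t=40.0)    # mostly LF survives
# a reverse-diffusion model would be trained to invert this map,
# i.e. to regrow the damped HF data from the LF remainder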

AU Jiang, Yikun Pei, Yuru Xu, Tianmin Yuan, Xiaoru Zha, Hongbin

Towards Semantically-Consistent Deformable 2D-3D Registration for 3D Craniofacial Structure Estimation from A Single-View Lateral Cephalometric Radiograph.

Deep neural networks combined with statistical shape models have enabled efficient deformable 2D-3D registration and recovery of 3D anatomical structures from a single radiograph. However, the recovered volumetric image tends to lack the volumetric fidelity of fine-grained anatomical structures and explicit consideration of cross-dimensional semantic correspondence. In this paper, we introduce a simple but effective solution for semantically-consistent deformable 2D-3D registration and detailed volumetric image recovery by inferring a voxel-wise registration field between a cone-beam computed tomography volume and a single lateral cephalometric radiograph (LC). The key idea is to refine the initial statistical model-based registration field with craniofacial structural details and semantic consistency from the LC. Specifically, our framework employs a self-supervised scheme to learn a voxel-level refiner of registration fields to provide fine-grained craniofacial structural details and volumetric fidelity. We also present a weakly supervised semantic consistency measure for semantic correspondence, relieving the requirement for volumetric image collections and annotations. Experiments showcase that our method achieves deformable 2D-3D registration with performance gains over state-of-the-art registration and radiograph-based volumetric reconstruction methods. The source code is available at https://github.com/Jyk-122/SC-DREG.

AU Zhang, Jianjia Mao, Haiyang Chang, Dingyue Yu, Hengyong Wu, Weiwen Shen, Dinggang

Adaptive and Iterative Learning With Multi-Perspective Regularizations for Metal Artifact Reduction

Metal artifact reduction (MAR) is important for clinical diagnosis with CT images. Existing state-of-the-art deep learning methods usually suppress metal artifacts in the sinogram domain, the image domain, or both. However, their performance is limited by the inherent characteristics of the two domains: errors introduced by local manipulations in the sinogram domain propagate throughout the whole image during backprojection and lead to serious secondary artifacts, while in the image domain it is difficult to distinguish artifacts from actual image features. To alleviate these limitations, this study analyzes the desirable properties of the wavelet transform in depth and proposes to perform MAR in the wavelet domain. First, the wavelet transform yields components that possess spatial correspondence with the image, thereby preventing the spread of local errors and avoiding secondary artifacts. Second, using the wavelet transform facilitates the identification of artifacts in the image, since metal artifacts are mainly high-frequency signals. Exploiting these advantages of the wavelet transform, this paper decomposes an image into multiple wavelet components and introduces multi-perspective regularizations into the proposed MAR model. To improve the transparency and validity of the model, all the modules in the proposed MAR model are designed to reflect their mathematical meanings. In addition, an adaptive wavelet module is utilized to enhance the flexibility of the model. To optimize the model, an iterative algorithm is developed. Evaluation on both synthetic and real clinical datasets consistently confirms the superior performance of the proposed method over the competing methods.

AU Kyung, Sunggu Won, Jongjun Pak, Seongyong Kim, Sunwoo Lee, Sangyoon Park, Kanggil Hong, Gil-Sun Kim, Namkug

Generative Adversarial Network with Robust Discriminator Through Multi-Task Learning for Low-Dose CT Denoising.

Reducing the dose of radiation in computed tomography (CT) is vital to decreasing secondary cancer risk. However, the use of low-dose CT (LDCT) images is accompanied by increased noise that can negatively impact diagnoses. Although numerous deep learning algorithms have been developed for LDCT denoising, several challenges persist, including the visual incongruence experienced by radiologists, unsatisfactory performance across various metrics, and insufficient exploration of the networks' robustness in other CT domains. To address such issues, this study proposes three novel contributions. First, we propose a generative adversarial network (GAN) with a robust discriminator through multi-task learning that simultaneously performs three vision tasks: restoration, image-level, and pixel-level decisions. The more tasks the discriminator performs, the better the denoising performance of the generator, which means multi-task learning enables the discriminator to provide more meaningful feedback to the generator. Second, two regulatory mechanisms, restoration consistency (RC) and non-difference suppression (NDS), are introduced to improve the discriminator's representation capabilities. These mechanisms eliminate irrelevant regions and compare the discriminator's results on the input and the restoration, thus facilitating effective GAN training. Lastly, we incorporate residual fast Fourier transform with convolution (Res-FFT-Conv) blocks into the generator to utilize both frequency and spatial representations. This approach provides mixed receptive fields by using spatial (or local), spectral (or global), and residual connections. Our model was evaluated using various pixel- and feature-space metrics in two denoising tasks. Additionally, we conducted visual scoring with radiologists. The results indicate superior performance in both quantitative and qualitative measures compared to state-of-the-art denoising techniques.
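A Res-FFT-Conv-style block can be sketched as a spatial convolution branch plus a spectral branch that applies 1x1 convolutions to the real/imaginary parts of an rfft2, with a residual connection; the layout below follows the common published pattern, not necessarily the authors' exact block:

import torch
import torch.nn as nn

class ResFFTConv(nn.Module):
    """Residual block with a spatial branch (3x3 convs, local receptive
    field) and a spectral branch (1x1 convs applied in the Fourier domain,
    global receptive field)."""
    def __init__(self, ch):
        super().__init__()
        self.spatial = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1))
        self.spectral = nn.Sequential(
            nn.Conv2d(2 * ch, 2 * ch, 1), nn.ReLU(),
            nn.Conv2d(2 * ch, 2 * ch, 1))

    def forward(self, x):
        B, C, H, W = x.shape
        f = torch.fft.rfft2(x, norm="ortho")              # (B, C, H, W//2+1)
        f = torch.cat((f.real, f.imag), dim=1)            # stack as 2C real channels
        f = self.spectral(f)
        real, imag = f.chunk(2, dim=1)
        spec = torch.fft.irfft2(torch.complex(real, imag),
                                s=(H, W), norm="ortho")
        return x + self.spatial(x) + spec                 # residual fusion

out = ResFFTConv(16)(torch.randn(1, 16, 64, 64))
assert out.shape == (1, 16, 64, 64)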

AU Luo, Mengting Zhou, Nan Wang, Tao He, Linchao Wang, Wang Chen, Hu Liao, Peixi Zhang, Yi

Bi-Constraints Diffusion: A Conditional Diffusion Model with Degradation Guidance for Metal Artifact Reduction.

In recent years, score-based diffusion models have emerged as effective tools for estimating score functions from empirical data distributions, particularly in integrating implicit priors with inverse problems like CT reconstruction. However, score-based diffusion models are rarely explored in challenging tasks such as metal artifact reduction (MAR). In this paper, we introduce the Bi-Constraints Diffusion Model for Metal Artifact Reduction (BCDMAR), an innovative approach that enhances iterative reconstruction with a conditional diffusion model for MAR. This method employs a metal artifact degradation operator in place of the traditional metal-excluded projection operator in the data-fidelity term, thereby preserving structure details around metal regions. However, score-based diffusion models tend to be susceptible to grayscale shifts and unreliable structures, making it challenging to reach an optimal solution. To address this, we utilize a precorrected image as a prior constraint, guiding the generation of the score-based diffusion model. By iteratively applying the score-based diffusion model and the data-fidelity step in each sampling iteration, BCDMAR effectively maintains reliable tissue representation around metal regions and produces highly consistent structures in non-metal regions. Through extensive experiments focused on metal artifact reduction tasks, BCDMAR demonstrates superior performance over other state-of-the-art unsupervised and supervised methods, both quantitatively and in terms of visual results.
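
A toy sketch of the sampling-loop structure described above, interleaving a denoising step, a data-fidelity gradient step through a degradation operator, and a pull toward a precorrected prior image; `A`, `denoise`, `y`, and `x_pre` are stand-ins, not the paper's operators:

```python
import torch

def A(x):                       # hypothetical metal-artifact degradation operator
    return 0.8 * x

def denoise(x, t):              # hypothetical learned score/denoising step
    return x - 0.1 * t * x

y = torch.randn(1, 64, 64)      # measured data
x_pre = torch.zeros_like(y)     # precorrected image used as prior constraint
x = torch.randn_like(y)

lam_data, lam_prior = 0.5, 0.1
for t in torch.linspace(1.0, 0.0, steps=50):
    x = denoise(x, t)                            # generative prior step
    x = x - lam_data * A(A(x) - y)               # data-fidelity gradient step
    x = (1 - lam_prior) * x + lam_prior * x_pre  # bi-constraint toward prior
```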

AU Yan, Siyuan Yu, Zhen Liu, Chi Ju, Lie Mahapatra, Dwarikanath Betz-Stablein, Brigid Mar, Victoria Janda, Monika Soyer, Peter Ge, Zongyuan

Prompt-driven Latent Domain Generalization for Medical Image Classification.

Deep learning models for medical image analysis easily suffer from distribution shifts caused by dataset artifact bias, camera variations, differences in the imaging station, etc., leading to unreliable diagnoses in real-world clinical settings. Domain generalization (DG) methods, which aim to train models on multiple domains to perform well on unseen domains, offer a promising direction to solve the problem. However, existing DG methods assume domain labels of each image are available and accurate, which is typically feasible for only a limited number of medical datasets. To address these challenges, we propose a unified DG framework for medical image classification without relying on domain labels, called Prompt-driven Latent Domain Generalization (PLDG). PLDG consists of unsupervised domain discovery and prompt learning. This framework first discovers pseudo domain labels by clustering the bias-associated style features, then leverages collaborative domain prompts to guide a Vision Transformer to learn knowledge from discovered diverse domains. To facilitate cross-domain knowledge learning between different prompts, we introduce a domain prompt generator that enables knowledge sharing between domain prompts and a shared prompt. A domain mixup strategy is additionally employed for more flexible decision margins and mitigates the risk of incorrect domain assignments. Extensive experiments on three medical image classification tasks and one debiasing task demonstrate that our method can achieve comparable or even superior performance than conventional DG algorithms without relying on domain labels. Our code is publicly available at https://github.com/SiyuanYan1/PLDG/tree/main.
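
A minimal sketch of the unsupervised domain discovery step: cluster bias-associated style statistics (here channel-wise mean and standard deviation of shallow features, a common style proxy) to obtain pseudo domain labels; the features are random stand-ins:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 32, 16, 16))     # (N, C, H, W) shallow features

mu = feats.mean(axis=(2, 3))                   # channel-wise mean
sigma = feats.std(axis=(2, 3))                 # channel-wise std
style = np.concatenate([mu, sigma], axis=1)    # style descriptor per image

pseudo_domains = KMeans(n_clusters=4, n_init=10,
                        random_state=0).fit_predict(style)
print(pseudo_domains[:10])                     # pseudo domain labels
```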

AU Wang, Yuyang Liu, Xiaomo Li, Liang

Metal Artifacts Reducing Method Based on Diffusion Model Using Intraoral Optical Scanning Data for Dental Cone-beam CT.

In dental cone-beam computed tomography (CBCT), metal implants can cause metal artifacts, affecting image quality and the final medical diagnosis. To reduce the impact of metal artifacts, our proposed metal artifact reduction (MAR) method takes a novel approach by integrating CBCT data with intraoral optical scanning data, utilizing information from these two different modalities to correct metal artifacts in the projection domain with a guided-diffusion model. The intraoral optical scanning data provide a more accurate generation domain for the diffusion model. Considering the physical mechanism of CBCT, we propose a multi-channel generation method for the training and generation stages of the diffusion model to ensure the consistency of its outputs. We present experimental results that convincingly demonstrate the feasibility and efficacy of our approach, which for the first time introduces intraoral optical scanning data into the analysis and processing of projection-domain data with a diffusion model, and which modifies the diffusion model to better fit the physical model of CBCT.

AU Jiang, Xiajun Missel, Ryan Toloubidokhti, Maryam Gillette, Karli Prassl, Anton J. Plank, Gernot Horacek, B. Milan Sapp, John L. Wang, Linwei

Hybrid Neural State-Space Modeling for Supervised and Unsupervised Electrocardiographic Imaging

State-space modeling (SSM) provides a general framework for many image reconstruction tasks. Errors in the a priori physiological knowledge of the imaging physics can introduce inaccuracies into the solutions. Modern deep-learning approaches show great promise but lack interpretability and rely on large amounts of labeled data. In this paper, we present a novel hybrid SSM framework for electrocardiographic imaging (ECGI) to leverage the advantage of state-space formulations in data-driven learning. We first leverage the physics-based forward operator to supervise the learning. We then introduce neural modeling of the transition function and the associated Bayesian filtering strategy. We applied the hybrid SSM framework to reconstruct electrical activity on the heart surface from body-surface potentials. In unsupervised settings on both in-silico and in-vivo data, without cardiac electrical activity as the ground truth to supervise the learning, we demonstrated improved ECGI performance of the hybrid SSM framework trained from a small number of ECG observations in comparison to the fixed SSM. We further demonstrated that, when in-silico simulation data become available, mixed supervised and unsupervised training of the hybrid SSM achieved further improvements of 40.6% and 45.6%, respectively, in comparison to traditional ECGI baselines and supervised data-driven ECGI baselines for localizing the origin of ventricular activations in real data.
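
A toy sketch of the hybrid idea: a neural transition function inside a classical Bayesian filter, here an extended Kalman filter that linearizes the learned transition with autograd while a fixed matrix `H` plays the role of the physics-based forward operator; all shapes and operators are stand-ins:

```python
import torch
from torch.autograd.functional import jacobian

n, m = 8, 4                                   # state and measurement sizes
f = torch.nn.Sequential(torch.nn.Linear(n, n), torch.nn.Tanh())  # neural transition
H = torch.randn(m, n)                         # physics-based forward operator
Q, R = 0.01 * torch.eye(n), 0.1 * torch.eye(m)

x, P = torch.zeros(n), torch.eye(n)
for y in torch.randn(20, m):                  # stream of body-surface potentials
    F = jacobian(f, x)                        # linearize the neural transition
    x, P = f(x), F @ P @ F.T + Q              # predict
    S = H @ P @ H.T + R
    K = P @ H.T @ torch.linalg.inv(S)         # Kalman gain
    x = x + K @ (y - H @ x)                   # measurement update
    P = (torch.eye(n) - K @ H) @ P
    x, P = x.detach(), P.detach()             # truncate autograd history
```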

AU Zeng, Qingjie Xie, Yutong Lu, Zilin Lu, Mengkang Zhang, Jingfeng Zhou, Yuyin Xia, Yong

Consistency-guided Differential Decoding for Enhancing Semi-supervised Medical Image Segmentation.

Semi-supervised learning (SSL) has been proven beneficial for mitigating the issue of limited labeled data, especially in volumetric medical image segmentation. Unlike previous SSL methods, which focus on exploring highly confident pseudo-labels or developing consistency regularization schemes, our empirical findings suggest that differential decoder features emerge naturally when two decoders strive to generate consistent predictions. Based on this observation, we first analyze the value of inter-decoder discrepancy in learning towards consistency, under both pseudo-labeling and consistency regularization settings, and subsequently propose a novel SSL method called LeFeD, which learns the feature-level discrepancies obtained from two decoders by feeding such information as feedback signals to the encoder. The core design of LeFeD is to enlarge the discrepancies by training differential decoders and then learn from the differential features iteratively. We evaluate LeFeD against eight state-of-the-art (SOTA) methods on three public datasets. Experiments show that LeFeD surpasses competitors without any bells and whistles, such as uncertainty estimation and strong constraints, and sets a new state of the art for semi-supervised medical image segmentation. Code has been released at https://github.com/maxwell0027/LeFeD.
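
A minimal sketch (not the released code) of the LeFeD loop: two differently designed decoders predict from a shared encoder, and their discrepancy is fed back as an extra input signal over a few iterations:

```python
import torch
import torch.nn as nn

enc = nn.Conv2d(1, 16, 3, padding=1)                   # toy shared encoder
dec_a = nn.Conv2d(16, 1, 3, padding=1)                 # decoder A
dec_b = nn.Sequential(nn.Conv2d(16, 16, 1), nn.ReLU(),
                      nn.Conv2d(16, 1, 3, padding=1))  # decoder B (different design)
fuse = nn.Conv2d(2, 1, 1)                              # folds discrepancy back in

x = torch.randn(2, 1, 64, 64)
inp = x
for _ in range(3):                                     # iterative refinement
    h = enc(inp)
    p_a, p_b = dec_a(h), dec_b(h)
    disc = (p_a - p_b).abs()                           # decoder discrepancy
    inp = fuse(torch.cat([x, disc], dim=1))            # feedback to the encoder
prediction = 0.5 * (p_a + p_b)
```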

AU Li, Jun Su, Tongkun Zhao, Baoliang Lv, Faqin Wang, Qiong Navab, Nassir Hu, Ying Jiang, Zhongliang

Ultrasound Report Generation with Cross-Modality Feature Alignment via Unsupervised Guidance.

Automatic report generation has arisen as a significant research area in computer-aided diagnosis, aiming to alleviate the burden on clinicians by generating reports automatically based on medical images. In this work, we propose a novel framework for automatic ultrasound report generation, leveraging a combination of unsupervised and supervised learning methods to aid the report generation process. Our framework incorporates unsupervised learning methods to extract potential knowledge from ultrasound text reports, serving as prior information to guide the model in aligning visual and textual features, thereby addressing the challenge of feature discrepancy. Additionally, we design a global semantic comparison mechanism to enhance the performance of generating more comprehensive and accurate medical reports. To enable the implementation of ultrasound report generation, we constructed three large-scale ultrasound image-text datasets from different organs for training and validation purposes. Extensive evaluations against other state-of-the-art approaches exhibit its superior performance across all three datasets. Code and dataset are available at this link.

AU Zhang, Dong Liu, Xiujian Wang, Anbang Zhang, Hongwei Yang, Guang Zhang, Heye Gao, Zhifan

Constraint-Aware Learning for Fractional Flow Reserve Pullback Curve Estimation from Invasive Coronary Imaging.

Estimation of the fractional flow reserve (FFR) pullback curve from invasive coronary imaging is important for the intraoperative guidance of coronary intervention. Machine/deep learning has been proven effective in FFR pullback curve estimation. However, the existing methods suffer from inadequate incorporation of intrinsic geometry associations and physics knowledge. In this paper, we propose a constraint-aware learning framework to improve the estimation of the FFR pullback curve from invasive coronary imaging. It incorporates both geometrical and physical constraints to approximate the relationships between the geometric structure and FFR values along the coronary artery centerline. Our method also leverages the power of synthetic data in model training to reduce the collection costs of clinical data. Moreover, to bridge the domain gap between synthetic and real data distributions when testing on real-world imaging data, we also employ a diffusion-driven test-time data adaptation method that preserves the knowledge learned from synthetic data. Specifically, this method learns a diffusion model of the synthetic data distribution and then projects real data onto the synthetic data distribution at test time. Extensive experimental studies on a synthetic dataset and a real-world dataset of 382 patients covering three imaging modalities have shown the better performance of our method for FFR estimation of stenotic coronary arteries, compared with other machine/deep learning-based FFR estimation models and a computational fluid dynamics-based model. The results also show high agreement and correlation between the FFR predictions of our method and the invasively measured FFR values. The plausibility of FFR predictions along the coronary artery centerline is also validated.

AU Shao, Wei Shi, Hang Liu, Jianxin Zuo, Yingli Sun, Liang Xia, Tiansong Chen, Wanyuan Wan, Peng Sheng, Jianpeng Zhu, Qi Zhang, Daoqiang

Multi-Instance Multi-Task Learning for Joint Clinical Outcome and Genomic Profile Predictions From the Histopathological Images

With the remarkable success of digital histopathology and deep learning technology, many whole-slide pathological image (WSI) based deep learning models have been designed to help pathologists diagnose human cancers. Recently, rather than predicting categorical variables as in cancer diagnosis, several deep learning studies have also been proposed to estimate continuous variables such as the patients' survival or their transcriptional profile. However, most of the existing studies focus on conducting these prediction tasks separately, which overlooks the useful intrinsic correlation among them that could boost the prediction performance of each individual task. In addition, it is still challenging to design WSI-based deep learning models, since a WSI is huge in size but annotated only with coarse labels. In this study, we propose a general multi-instance multi-task learning framework (HistMIMT) for multi-purpose prediction from WSIs. Specifically, we first propose a novel multi-instance learning module (TMICS) that considers both common and task-specific information across different tasks to generate a bag representation for each individual task. Then, a soft-mask based fusion module with channel attention (SFCA) is developed to leverage useful information from related tasks to help improve the prediction performance on the target task. We evaluate our method on three cancer cohorts derived from The Cancer Genome Atlas (TCGA). For each cohort, our multi-purpose prediction tasks range from cancer diagnosis and survival prediction to estimating the transcriptional profile of the gene TP53. The experimental results demonstrate that HistMIMT yields better outcomes on all clinical prediction tasks than its competitors.
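
A hedged sketch of the multi-instance pooling at the heart of such frameworks: a WSI is treated as a bag of patch features, and a learned attention weights the instances into one bag representation per task (this follows the common attention-based MIL formulation, not the authors' exact TMICS module):

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, dim=256, hidden=128):
        super().__init__()
        self.att = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 1))

    def forward(self, bag):                      # bag: (num_patches, dim)
        a = torch.softmax(self.att(bag), dim=0)  # instance attention weights
        return (a * bag).sum(dim=0)              # bag representation

pool = AttentionMIL()
wsi_patches = torch.randn(500, 256)              # features of 500 WSI patches
bag_repr = pool(wsi_patches)                     # input to a task head
print(bag_repr.shape)                            # torch.Size([256])
```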

AU Ding, Saisai Li, Juncheng Wang, Jun Ying, Shihui Shi, Jun

Multimodal Co-attention Fusion Network with Online Data Augmentation for Cancer Subtype Classification.

It is an essential task to accurately diagnose cancer subtypes in computational pathology for personalized cancer treatment. Recent studies have indicated that the combination of multimodal data, such as whole slide images (WSIs) and multi-omics data, could achieve more accurate diagnosis. However, robust cancer diagnosis remains challenging due to the heterogeneity among multimodal data, as well as the performance degradation caused by insufficient multimodal patient data. In this work, we propose a novel multimodal co-attention fusion network (MCFN) with online data augmentation (ODA) for cancer subtype classification. Specifically, a multimodal mutual-guided co-attention (MMC) module is proposed to effectively perform dense multimodal interactions. It enables multimodal data to mutually guide and calibrate each other during the integration process to alleviate inter- and intra-modal heterogeneities. Subsequently, a self-normalizing network (SNN)-Mixer is developed to allow information communication among different omics data and alleviate the high-dimensional small-sample size problem in multi-omics data. Most importantly, to compensate for insufficient multimodal samples for model training, we propose an ODA module in MCFN. The ODA module leverages the multimodal knowledge to guide the data augmentations of WSIs and maximize the data diversity during model training. Extensive experiments are conducted on the public TCGA dataset. The experimental results demonstrate that the proposed MCFN outperforms all the compared algorithms, suggesting its effectiveness.
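
A minimal sketch, assuming the SNN-Mixer follows the usual self-normalizing recipe (SELU activations with AlphaDropout) while alternating mixing across omics tokens and across channels; all sizes are illustrative:

```python
import torch
import torch.nn as nn

class SNNMixerBlock(nn.Module):
    def __init__(self, n_omics=3, dim=64):
        super().__init__()
        self.token_mix = nn.Sequential(        # communication across omics
            nn.Linear(n_omics, n_omics), nn.SELU(), nn.AlphaDropout(0.1))
        self.channel_mix = nn.Sequential(      # mixing within each omics profile
            nn.Linear(dim, dim), nn.SELU(), nn.AlphaDropout(0.1))

    def forward(self, x):                      # x: (batch, n_omics, dim)
        x = x + self.token_mix(x.transpose(1, 2)).transpose(1, 2)
        return x + self.channel_mix(x)

block = SNNMixerBlock()
omics = torch.randn(4, 3, 64)                  # e.g., mRNA / CNV / methylation
print(block(omics).shape)                      # torch.Size([4, 3, 64])
```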

AU Li, Yicong Li, Wanhua Chen, Qi Huang, Wei Zou, Yuda Xiao, Xin Shinomiya, Kazunori Gunn, Pat Gupta, Nishika Polilov, Alexey Xu, Yongchao Zhang, Yueyi Xiong, Zhiwei Pfister, Hanspeter Wei, Donglai Wu, Jingpeng

WASPSYN: A Challenge for Domain Adaptive Synapse Detection in Microwasp Brain Connectomes.

The size of image volumes in connectomics studies now reaches terabyte and often petabyte scales with a great diversity of appearance due to different sample preparation procedures. However, manual annotation of neuronal structures (e.g., synapses) in these huge image volumes is time-consuming, leading to limited labeled training data often smaller than 0.001% of the large-scale image volumes in application. Methods that can utilize in-domain labeled data and generalize to out-of-domain unlabeled data are in urgent need. Although many domain adaptation approaches are proposed to address such issues in the natural image domain, few of them have been evaluated on connectomics data due to a lack of domain adaptation benchmarks. Therefore, to enable developments of domain adaptive synapse detection methods for large-scale connectomics applications, we annotated 14 image volumes from a biologically diverse set of Megaphragma viggianii brain regions originating from three different whole-brain datasets and organized the WASPSYN challenge at ISBI 2023. The annotations include coordinates of pre-synapses and post-synapses in the 3D space, together with their one-to-many connectivity information. This paper describes the dataset, the tasks, the proposed baseline, the evaluation method, and the results of the challenge. Limitations of the challenge and the impact on neuroscience research are also discussed. The challenge is and will continue to be available at https://codalab.lisn.upsaclay.fr/competitions/9169. Successful algorithms that emerge from our challenge may potentially revolutionize real-world connectomics research and further the cause that aims to unravel the complexity of brain structure and function.

AU Naughton, Noel Cahoon, Stacey Sutton, Brad Georgiadis, John G

Accelerated, physics-inspired inference of skeletal muscle microstructure from diffusion-weighted MRI.

Muscle health is a critical component of overall health and quality of life. However, current measures of skeletal muscle health take limited account of microstructural variations within muscle, which play a crucial role in mediating muscle function. To address this, we present a physics-inspired, machine learning-based framework for the non-invasive estimation of microstructural organization in skeletal muscle from diffusion-weighted MRI (dMRI) in an uncertainty-aware manner. To reduce the computational expense associated with direct numerical simulations of dMRI physics, a polynomial meta-model is developed that accurately represents the input/output relationships of a high-fidelity numerical model. This meta-model is used to develop a Gaussian process (GP) model that provides voxel-wise estimates and confidence intervals of microstructural organization in skeletal muscle. Given noise-free data, the GP model accurately estimates microstructural parameters. In the presence of noise, the diameter, intracellular diffusion coefficient, and membrane permeability are accurately estimated with narrow confidence intervals, while the volume fraction and extracellular diffusion coefficient are poorly estimated and exhibit wide confidence intervals. A reduced-acquisition GP model, consisting of one-third of the diffusion-encoding measurements, is shown to predict parameters with similar accuracy to the original model. The fiber diameter and volume fraction estimated by the reduced GP model are validated via histology, with both parameters accurately estimated, demonstrating the capability of the proposed framework as a promising non-invasive tool for assessing skeletal muscle health and function.
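
A hedged sketch of the two-stage pipeline with synthetic stand-in data: a cheap polynomial meta-model replaces the expensive dMRI simulator, and a Gaussian process trained on its outputs returns parameter estimates with confidence intervals:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
params = rng.uniform(size=(200, 2))                  # e.g., diameter, permeability
signals = np.sin(3 * params[:, :1]) + params[:, 1:]  # stand-in simulator output

# Stage 1: polynomial meta-model of the forward simulation.
meta = make_pipeline(PolynomialFeatures(3), LinearRegression())
meta.fit(params, signals)

# Stage 2: GP learns the inverse map (signal -> parameter) with uncertainty.
gp = GaussianProcessRegressor(RBF() + WhiteKernel(), normalize_y=True)
gp.fit(meta.predict(params), params[:, 0])
mean, std = gp.predict(meta.predict(params[:5]), return_std=True)
print(mean, 1.96 * std)                              # estimates and ~95% CIs
```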

AU van Herten, Rudolf L. M. Hampe, Nils Takx, Richard A. P. Franssen, Klaas Jan Wang, Yining Sucha, Dominika Henriques, Jose P. Leiner, Tim Planken, R. Nils Isgum, Ivana

Automatic Coronary Artery Plaque Quantification and CAD-RADS Prediction Using Mesh Priors

Coronary artery disease (CAD) remains the leading cause of death worldwide. Patients with suspected CAD undergo coronary CT angiography (CCTA) to evaluate the risk of cardiovascular events and determine the treatment. Clinical analysis of coronary arteries in CCTA comprises the identification of atherosclerotic plaque, as well as the grading of any coronary artery stenosis, typically obtained through the CAD-Reporting and Data System (CAD-RADS). This requires analysis of the coronary lumen and plaque. While voxel-wise segmentation is a commonly used approach in various segmentation tasks, it does not guarantee topologically plausible shapes. To address this, in this work, we propose to directly infer surface meshes for the coronary artery lumen and plaque based on a centerline prior and use them in the downstream task of CAD-RADS scoring. The method is developed and evaluated using a total of 2407 CCTA scans. Our method achieved lesion-wise volume intraclass correlation coefficients of 0.98, 0.79, and 0.85 for calcified, non-calcified, and total plaque volume, respectively. Patient-level CAD-RADS categorization was evaluated on a representative hold-out test set of 300 scans, for which the achieved linearly weighted kappa was 0.75. CAD-RADS categorization on the set of 658 scans from another hospital and scanner led to a kappa of 0.71. The results demonstrate that direct inference of coronary artery meshes for lumen and plaque is feasible, and allows for the automated prediction of routinely performed CAD-RADS categorization.
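
A small worked example of the agreement metric reported above: linearly weighted Cohen's kappa penalizes CAD-RADS disagreements in proportion to their ordinal distance (the labels below are illustrative only):

```python
from sklearn.metrics import cohen_kappa_score

reference = [0, 1, 2, 3, 4, 2, 1, 3]   # expert CAD-RADS categories
predicted = [0, 1, 2, 4, 4, 2, 0, 3]   # automatic categorization
kappa = cohen_kappa_score(reference, predicted, weights='linear')
print(round(kappa, 2))                  # agreement beyond chance
```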

AU Guo, Jia Lu, Shuai Jia, Lize Zhang, Weihang Li, Huiqi

Encoder-Decoder Contrast for Unsupervised Anomaly Detection in Medical Images

Unsupervised anomaly detection (UAD) aims to recognize anomalous images based on the training set that contains only normal images. In medical image analysis, UAD benefits from leveraging the easily obtained normal (healthy) images, avoiding the costly collecting and labeling of anomalous (unhealthy) images. Most advanced UAD methods rely on frozen encoder networks pre-trained using ImageNet for extracting feature representations. However, the features extracted from the frozen encoders that are borrowed from natural image domains coincide little with the features required in the target medical image domain. Moreover, optimizing encoders usually causes pattern collapse in UAD. In this paper, we propose a novel UAD method, namely Encoder-Decoder Contrast (EDC), which optimizes the entire network to reduce biases towards pre-trained image domain and orient the network in the target medical domain. We start from feature reconstruction approach that detects anomalies from reconstruction errors. Essentially, a contrastive learning paradigm is introduced to tackle the problem of pattern collapsing while optimizing the encoder and the reconstruction decoder simultaneously. In addition, to prevent instability and further improve performances, we propose to bring globality into the contrastive objective function. Extensive experiments are conducted across four medical image modalities including optical coherence tomography, color fundus image, brain MRI, and skin lesion image, where our method outperforms all current state-of-the-art UAD methods.

AU Zhu, Jiening Veeraraghavan, Harini Jiang, Jue Oh, Jung Hun Norton, Larry Deasy, Joseph O. Tannenbaum, Allen

Wasserstein HOG: Local Directionality Extraction via Optimal Transport

Directionally sensitive radiomic features, including the histogram of oriented gradients (HOG), have been shown to provide objective and quantitative measures for predicting disease outcomes in multiple cancers. However, radiomic features are sensitive to imaging variabilities, including acquisition differences, imaging artifacts, and noise, making them impractical for use in the clinic to inform patient care. We treat the problem of extracting robust local directionality features by mapping, via optimal transport, a given local image patch to an iso-intense patch of its mean. We decompose the transport map into sub-work costs, each corresponding to transport in a different direction. To test our approach, we evaluated its ability to quantify tumor heterogeneity from magnetic resonance imaging (MRI) scans of brain glioblastoma multiforme, computed tomography (CT) scans of head and neck squamous cell carcinoma, as well as longitudinal CT scans of lung cancer patients treated with immunotherapy. By considering the entropy difference of the extracted local directionality within tumor regions, we found that patients with higher entropy in their images had significantly worse overall survival on all three datasets, which indicates that tumors whose images exhibit flows in many directions may be more malignant, possibly reflecting high tumor histologic grade or disorganization. Furthermore, by comparing the changes in entropy longitudinally using two imaging time points, we found that patients with a reduction in entropy from baseline CT had longer overall survival (hazard ratio = 1.95, 95% confidence interval 1.4-2.8, p = 1.65e-5). The proposed method provides a robust, training-free approach to quantify the local directionality contained in images.
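
A hedged sketch of the core construction on a toy patch, using the POT library's exact solver: transport the patch to an iso-intense patch of its mean, then bin the transport work by displacement direction to obtain a local directionality histogram:

```python
import numpy as np
import ot  # POT: Python Optimal Transport

rng = np.random.default_rng(0)
patch = rng.random((8, 8))
coords = np.stack(np.meshgrid(np.arange(8), np.arange(8), indexing='ij'),
                  axis=-1).reshape(-1, 2).astype(float)

a = (patch / patch.sum()).ravel()        # patch mass
b = np.full(64, 1 / 64)                  # iso-intense target (the mean)
M = ot.dist(coords, coords)              # squared Euclidean ground cost
G = ot.emd(a, b, M)                      # optimal transport plan

# Decompose the work into directional bins (the "sub-work costs").
disp = coords[None, :, :] - coords[:, None, :]
angles = np.arctan2(disp[..., 0], disp[..., 1])
hist, _ = np.histogram(angles, bins=8, range=(-np.pi, np.pi), weights=G * M)
print(hist / hist.sum())                 # local directionality histogram
```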

AU Fu, Suzhong Xu, Jing Chang, Shilong Yang, Luyao Ling, Shuting Cai, Jinghan Chen, Jiayin Yuan, Jiacheng Cai, Ying Zhang, Bei Huang, Zicheng Yang, Kun Sui, Wenhai Xue, Linyan Zhao, Qingliang

Robust Vascular Segmentation for Raw Complex Images of Laser Speckle Contrast Based on Weakly Supervised Learning

Laser speckle contrast imaging (LSCI) is widely used for in vivo real-time detection and analysis of local blood flow microcirculation due to its non-invasive nature and excellent spatial and temporal resolution. However, vascular segmentation of LSCI images still faces many difficulties due to numerous specific noises caused by the complexity of the blood microcirculation's structure and irregular vascular aberrations in diseased regions. In addition, the difficulty of annotating LSCI image data has hindered the application of supervised deep learning methods in the field of LSCI vascular segmentation. To tackle these difficulties, we propose a robust weakly supervised learning method that selects threshold combinations and processing flows instead of labor-intensive annotation work to construct the ground truth of the dataset, and we design a deep neural network, FURNet, based on UNet++ and ResNeXt. The model obtained from training achieves high-quality vascular segmentation and captures multi-scene vascular features on both constructed and unknown datasets with good generalization. Furthermore, we verified the applicability of this method intravitally on a tumor before and after embolization treatment. This work provides a new approach for realizing LSCI vascular segmentation and also makes a new application-level advance in the field of artificial intelligence-assisted disease diagnosis.
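
A minimal sketch of the weak-label construction described above: several automatic thresholding flows are combined (here by majority vote) into a pseudo ground truth, in place of manual annotation; the toy image and the particular thresholds are illustrative:

```python
import numpy as np
from skimage.filters import threshold_otsu, threshold_li, threshold_yen

rng = np.random.default_rng(0)
speckle = rng.random((128, 128))              # stand-in LSCI contrast image

masks = [speckle > t(speckle)
         for t in (threshold_otsu, threshold_li, threshold_yen)]
pseudo_gt = np.mean(masks, axis=0) >= 0.5     # majority vote over flows
print(pseudo_gt.mean())                       # foreground fraction
```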

AU Lobos, Rodrigo A. Chan, Chin-Cheng Haldar, Justin P.

New Theory and Faster Computations for Subspace-Based Sensitivity Map Estimation in Multichannel MRI

Sensitivity map estimation is important in many multichannel MRI applications. Subspace-based sensitivity map estimation methods like ESPIRiT are popular and perform well, though they can be computationally expensive and their theoretical principles can be nontrivial to understand. In the first part of this work, we present a novel theoretical derivation of subspace-based sensitivity map estimation based on a linear-predictability/structured low-rank modeling perspective. This results in an estimation approach that is equivalent to ESPIRiT, but with distinct theory that may be more intuitive for some readers. In the second part of this work, we propose and evaluate a set of computational acceleration approaches (collectively known as PISCO) that enable substantial improvements in computation time (up to ~100x in the examples we show) and memory usage for subspace-based sensitivity map estimation.
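
A compact sketch of the structured low-rank starting point shared by ESPIRiT-style methods (not the PISCO-accelerated implementation): build a block-Hankel matrix from a k-space calibration region, take an SVD, and split signal and null subspaces with a singular-value threshold:

```python
import numpy as np

rng = np.random.default_rng(0)
calib = rng.normal(size=(4, 24, 24)) + 1j * rng.normal(size=(4, 24, 24))
nc, k = calib.shape[0], 6                  # coils, kernel size

rows = []
for i in range(24 - k + 1):
    for j in range(24 - k + 1):
        rows.append(calib[:, i:i + k, j:j + k].ravel())   # sliding windows
H = np.array(rows)                         # (windows, nc*k*k) Hankel-like matrix

_, s, Vh = np.linalg.svd(H, full_matrices=False)
signal_space = Vh[s > 0.05 * s[0]]         # row space of the calibration data
null_space = Vh[s <= 0.05 * s[0]]          # annihilating (nullspace) filters
print(signal_space.shape, null_space.shape)
```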

AU Tang, Xinlu Zhang, Chencheng Guo, Rui Yang, Xinling Qian, Xiaohua

A Causality-Aware Graph Convolutional Network Framework for Rigidity Assessment in Parkinsonians

Rigidity is one of the common motor disorders in Parkinson's disease (PD), which leads to deterioration in quality of life. The widely used rating-scale-based approach for rigidity assessment still depends on the availability of experienced neurologists and is limited by rating subjectivity. Given the recent successful applications of quantitative susceptibility mapping (QSM) in auxiliary PD diagnosis, automated assessment of PD rigidity can essentially be achieved through QSM analysis. However, a major challenge is performance instability due to confounding factors (e.g., noise and distribution shift) that conceal the truly causal features. Therefore, we propose a causality-aware graph convolutional network (GCN) framework, where causal feature selection is combined with causal invariance to ensure that causality-informed model decisions are reached. Firstly, a GCN model that integrates causal feature selection is systematically constructed at three graph levels: node, structure, and representation. In this model, a causal diagram is learned to extract a subgraph with truly causal information. Secondly, a non-causal perturbation strategy is developed along with an invariance constraint to ensure the stability of the assessment results under different distributions, and thus avoid spurious correlations caused by distribution shifts. The superiority of the proposed method is shown by extensive experiments, and its clinical value is revealed by the direct relevance of the selected brain regions to rigidity in PD. Besides, its extensibility is verified on two other tasks: PD bradykinesia and mental state in Alzheimer's disease. Overall, we provide a clinically potential tool for automated and stable assessment of PD rigidity. Our source code will be available at https://github.com/SJTUBME-QianLab/Causality-Aware-Rigidity.
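
A hedged sketch of causal feature selection on a graph: a learnable edge mask gates the adjacency so that, under a sparsity penalty during training, only a (candidate causal) subgraph propagates information through a standard normalized GCN layer; all sizes are toy:

```python
import torch

n, d = 16, 8                                      # brain regions, feature dim
A = (torch.rand(n, n) > 0.7).float()
A = ((A + A.T + torch.eye(n)) > 0).float()        # symmetric, with self-loops
X = torch.randn(n, d)
W = torch.randn(d, d, requires_grad=True)
edge_logits = torch.zeros(n, n, requires_grad=True)

mask = torch.sigmoid(edge_logits) * A             # gated candidate-causal edges
deg = mask.sum(1)
A_hat = mask / torch.sqrt(deg[:, None] * deg[None, :])  # symmetric normalization
H = torch.relu(A_hat @ X @ W)                     # one masked GCN layer
sparsity_penalty = mask.sum()                     # encourages few causal edges
```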

AU Wu, Yongjian Zhou, Yang Saiyin, Jiya Wei, Bingzheng Lai, Maode Shou, Jianzhong Xu, Yan

AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot Nuclei Detection via Visual-Language Pre-trained Models.

Large-scale visual-language pre-trained models (VLPMs) have demonstrated exceptional performance in downstream object detection through text prompts for natural scenes. However, their application to zero-shot nuclei detection on histopathology images remains relatively unexplored, mainly due to the significant gap between the characteristics of medical images and the web-originated text-image pairs used for pre-training. This paper aims to investigate the potential of the object-level VLPM, Grounded Language-Image Pre-training (GLIP), for zero-shot nuclei detection. Specifically, we propose an innovative auto-prompting pipeline, named AttriPrompter, comprising attribute generation, attribute augmentation, and relevance sorting, to avoid subjective manual prompt design. AttriPrompter utilizes VLPMs' text-to-image alignment to create semantically rich text prompts, which are then fed into GLIP for initial zero-shot nuclei detection. Additionally, we propose a self-trained knowledge distillation framework, where GLIP serves as the teacher with its initial predictions used as pseudo labels, to address the challenges posed by high nuclei density, including missed detections, false positives, and overlapping instances. Our method exhibits remarkable performance in label-free nuclei detection, outperforming all existing unsupervised methods and demonstrating excellent generality. Notably, this work highlights the astonishing potential of VLPMs pre-trained on natural image-text pairs for downstream tasks in the medical field as well. Code will be released at github.com/AttriPrompter.

EI 1558-254X DA 2024-10-05 UT MEDLINE:39361456 PM 39361456 ER

AU You, Xin He, Junjun Yang, Jie Gu, Yun

Learning with Explicit Shape Priors for Medical Image Segmentation.

Medical image segmentation is a fundamental task for medical image analysis and surgical planning. In recent years, UNet-based networks have prevailed in the field of medical image segmentation. However, convolutional neural networks (CNNs) suffer from limited receptive fields, which fail to model the long-range dependency of organs or tumors. Besides, these models are heavily dependent on the training of the final segmentation head. Existing methods cannot address the aforementioned limitations simultaneously. Hence, in our work, we propose a novel shape prior module (SPM), which can explicitly introduce shape priors to promote the segmentation performance of UNet-based models. The explicit shape priors consist of global and local shape priors. The former, with coarse shape representations, provides networks with the capability to model global contexts. The latter, with finer shape information, serves as additional guidance to relieve the heavy dependence on the learnable prototype in the segmentation head. To evaluate the effectiveness of SPM, we conduct experiments on three challenging public datasets, and our proposed model achieves state-of-the-art performance. Furthermore, SPM can serve as a plug-and-play structure in classic CNNs and Transformer-based backbones, facilitating the segmentation task on different datasets. Source code is available at https://github.com/AlexYouXin/Explicit-Shape-Priors.

AU Liu, Che Cheng, Sibo Shi, Miaojing Shah, Anand Bai, Wenjia Arcucci, Rossella

IMITATE: Clinical Prior Guided Hierarchical Vision-Language Pre-training.

In the field of medical Vision-Language Pretraining (VLP), significant efforts have been devoted to deriving text and image features from both clinical reports and associated medical images. However, most existing methods may have overlooked the opportunity in leveraging the inherent hierarchical structure of clinical reports, which are generally split into 'findings' for descriptive content and 'impressions' for conclusive observation. Instead of utilizing this rich, structured format, current medical VLP approaches often simplify the report into either a unified entity or fragmented tokens. In this work, we propose a novel clinical prior guided VLP framework named IMITATE to learn the structure information from medical reports with hierarchical vision-language alignment. The framework derives multi-level visual features from the chest X-ray (CXR) images and separately aligns these features with the descriptive and the conclusive text encoded in the hierarchical medical report. Furthermore, a new clinical-informed contrastive loss is introduced for cross-modal learning, which accounts for clinical prior knowledge in formulating sample correlations in contrastive learning. The proposed model, IMITATE, outperforms baseline VLP methods across six different datasets, spanning five medical imaging downstream tasks. Comprehensive experimental results highlight the advantages of integrating the hierarchical structure of medical reports for vision-language alignment.
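
A minimal sketch of hierarchical vision-language alignment with a standard InfoNCE (CLIP-style) loss, assuming mid-level visual features align with the 'findings' embedding and high-level features with the 'impressions' embedding; all encoders are stand-ins:

```python
import torch
import torch.nn.functional as F

def info_nce(img, txt, tau=0.07):
    img, txt = F.normalize(img, dim=-1), F.normalize(txt, dim=-1)
    logits = img @ txt.T / tau                     # pairwise similarities
    target = torch.arange(img.size(0))
    return 0.5 * (F.cross_entropy(logits, target) +
                  F.cross_entropy(logits.T, target))

B, d = 8, 128
mid_visual, high_visual = torch.randn(B, d), torch.randn(B, d)
findings_emb, impressions_emb = torch.randn(B, d), torch.randn(B, d)

loss = info_nce(mid_visual, findings_emb) + info_nce(high_visual, impressions_emb)
print(loss.item())
```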

AU Bian, Wanyu Jang, Albert Zhang, Liping Yang, Xiaonan Stewart, Zachary Liu, Fang

Diffusion Modeling with Domain-conditioned Prior Guidance for Accelerated MRI and qMRI Reconstruction.

This study introduces a novel image reconstruction technique based on a diffusion model that is conditioned on the native data domain. Our method is applied to multi-coil MRI and quantitative MRI (qMRI) reconstruction, leveraging the domain-conditioned diffusion model within the frequency and parameter domains. The prior MRI physics is used as embeddings in the diffusion model, enforcing data consistency to guide the training and sampling process, characterizing MRI k-space encoding in MRI reconstruction, and leveraging MR signal modeling for qMRI reconstruction. Furthermore, a gradient descent optimization is incorporated into the diffusion steps, enhancing feature learning and improving denoising. The proposed method demonstrates significant promise, particularly for reconstructing images at high acceleration factors. Notably, it maintains great reconstruction accuracy for static and quantitative MRI reconstruction across diverse anatomical structures. Beyond its immediate applications, this method provides potential generalization capability, making it adaptable to inverse problems across various domains.

AU Liu, Min Wu, Shuhan Chen, Runze Lin, Zhuangdian Wang, Yaonan Meijering, Erik

Brain Image Segmentation for Ultrascale Neuron Reconstruction via an Adaptive Dual-Task Learning Network

Accurate morphological reconstruction of neurons in whole brain images is critical for brain science research. However, due to the wide range of whole brain imaging, uneven staining, and optical system fluctuations, there are significant differences in image properties between different regions of the ultrascale brain image, such as dramatically varying voxel intensities and inhomogeneous distribution of background noise, posing an enormous challenge to neuron reconstruction from whole brain images. In this paper, we propose an adaptive dual-task learning network (ADTL-Net) to quickly and accurately extract neuronal structures from ultrascale brain images. Specifically, this framework includes an External Features Classifier (EFC) and a Parameter Adaptive Segmentation Decoder (PASD), which share the same Multi-Scale Feature Encoder (MSFE). MSFE introduces an attention module named Channel Space Fusion Module (CSFM) to extract structure and intensity distribution features of neurons at different scales for addressing the problem of anisotropy in 3D space. Then, EFC is designed to classify these feature maps based on external features, such as foreground intensity distributions and image smoothness, and select specific PASD parameters to decode them of different classes to obtain accurate segmentation results. PASD contains multiple sets of parameters trained by different representative complex signal-to-noise distribution image blocks to handle various images more robustly. Experimental results prove that compared with other advanced segmentation methods for neuron reconstruction, the proposed method achieves state-of-the-art results in the task of neuron reconstruction from ultrascale brain images, with an improvement of about 49% in speed and 12% in F1 score.

C1 Hunan Univ, Coll Elect & Informat Engn, Changsha 410082, Peoples R China C1 Hunan Univ, Natl Engn Lab Robot Visual Percept & Control Techn, Changsha 410082, Peoples R China C1 Int Sci & Technol Innovat Cooperat Base Biomed Ima, Changsha 410082, Peoples R China C1 Hunan Univ, Res Inst, Chongqing 401120, Peoples R China C1 Univ New South Wales, Sch Comp Sci & Engn, Sydney, NSW 2052, Australia C3 Int Sci & Technol Innovat Cooperat Base Biomed Ima SN 0278-0062 EI 1558-254X DA 2024-07-22 UT WOS:001263692100011 PM 38373129 ER

AU Thandiackal, Kevin Piccinelli, Luigi Gupta, Rajarsi Pati, Pushpak Goksel, Orcun

Multi-Scale Feature Alignment for Continual Learning of Unlabeled Domains

Methods for unsupervised domain adaptation (UDA) help to improve the performance of deep neural networks on unseen domains without any labeled data. Especially in medical disciplines such as histopathology, this is crucial since large datasets with detailed annotations are scarce. While the majority of existing UDA methods focus on the adaptation from a labeled source to a single unlabeled target domain, many real-world applications with a long life cycle involve more than one target domain. Thus, the ability to sequentially adapt to multiple target domains becomes essential. In settings where the data from previously seen domains cannot be stored, e.g., due to data protection regulations, the above becomes a challenging continual learning problem. To this end, we propose to use generative feature-driven image replay in conjunction with a dual-purpose discriminator that not only enables the generation of images with realistic features for replay, but also promotes feature alignment during domain adaptation. We evaluate our approach extensively on a sequence of three histopathological datasets for tissue-type classification, achieving state-of-the-art results. We present detailed ablation experiments studying our proposed method components and demonstrate a possible use-case of our continual UDA method for an unsupervised patch-based segmentation task given high-resolution tissue images. Our code is available at: https://github.com/histocartography/multi-scale-feature-alignment.

AU Zhu, Qi Li, Shengrong Meng, Xiangshui Xu, Qiang Zhang, Zhiqiang Shao, Wei Zhang, Daoqiang

Spatio-Temporal Graph Hubness Propagation Model for Dynamic Brain Network Classification

Dynamic brain networks have an advantage over static brain networks in characterizing the variation pattern of functional brain connectivity, and they have attracted increasing attention in brain disease diagnosis. However, most existing dynamic brain network analysis methods rely on extracting features from independent brain networks divided by sliding windows, making it hard for them to reveal the high-order dynamic evolution laws of functional brain networks. Additionally, they cannot effectively extract the spatio-temporal topology features in dynamic brain networks. In this paper, we propose to use optimal transport (OT) theory to capture the topology evolution of dynamic brain networks, and we develop a multi-channel spatio-temporal graph convolutional network that collaboratively extracts temporal and spatial features from the evolution networks. Specifically, we first adaptively evaluate the graph hubness of brain regions in the brain network of each time window, which comprehensively models information transmission among multiple brain regions. Second, the hubness propagation information across adjacent time windows is captured by optimal transport, describing the high-order topology evolution of dynamic brain networks. Moreover, we develop a spatio-temporal graph convolutional network with an attention mechanism to collaboratively extract the intrinsic temporal and spatial topology information from the above networks. Finally, a multi-layer perceptron is adopted for classifying the dynamic brain network. Extensive experiments on a collected epilepsy dataset and the public ADNI dataset show that our proposed method not only outperforms several state-of-the-art methods in brain disease diagnosis, but also reveals key dynamic alterations of brain connectivity between patients and healthy controls.
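
A hedged sketch of capturing hubness propagation between adjacent time windows with optimal transport; entropic regularization (plain Sinkhorn iterations) is used here for simplicity, and the hubness scores and ground costs are toy stand-ins:

```python
import numpy as np

def sinkhorn(a, b, M, reg=0.05, iters=200):
    K = np.exp(-M / reg)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]            # transport plan

rng = np.random.default_rng(0)
hub_t = rng.random(90);  hub_t /= hub_t.sum()     # hubness, window t
hub_t1 = rng.random(90); hub_t1 /= hub_t1.sum()   # hubness, window t+1
M = rng.random((90, 90))                          # region dissimilarity cost

plan = sinkhorn(hub_t, hub_t1, M)
print((plan * M).sum())                           # cost = topology evolution
```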

AU Li, Pengcheng Gao, Chenqiang Lian, Chunfeng Meng, Deyu

Spatial Prior-Guided Bi-Directional Cross-Attention Transformers for Tooth Instance Segmentation.

Tooth instance segmentation of dental panoramic X-ray images represents a task of significant clinical importance. Teeth demonstrate symmetry within the upper and lower jawbones and are arranged in a specific order. However, previous studies frequently overlook this crucial spatial prior information, resulting in misidentifications of tooth categories for adjacent or similarly shaped teeth. In this paper, we propose SPGTNet, a spatial prior-guided transformer method, designed to both the extracted tooth positional features from CNNs and the long-range contextual information from vision transformers for dental panoramic X-ray image segmentation. Initially, a center-based spatial prior perception module is employed to identify each tooth's centroid, thereby enhancing the spatial prior information for the CNN sequence features. Subsequently, a bi-directional cross-attention module is designed to facilitate the interaction between the spatial prior information of the CNN sequence features and the long-distance contextual features of the vision transformer sequence features. Finally, an instance identification head is employed to derive the tooth segmentation results. Extensive experiments on three public benchmark datasets have demonstrated the effectiveness and superiority of our proposed method in comparison with other state-of-the-art approaches. The proposed method demonstrates the capability to accurately identify and analyze tooth structures, thereby providing crucial information for dental diagnosis, treatment planning, and research.

AU Chai, Zhizhong Luo, Luyang Lin, Huangjing Heng, Pheng-Ann Chen, Hao

Deep Omni-Supervised Learning for Rib Fracture Detection From Chest Radiology Images

Deep learning (DL)-based rib fracture detection has shown promise of playing an important role in preventing mortality and improving patient outcomes. Normally, developing DL-based object detection models requires a huge amount of bounding box annotation. However, annotating medical data is time-consuming and expertise-demanding, making it extremely infeasible to obtain a large amount of fine-grained annotations. This poses a pressing need for developing label-efficient detection models to alleviate radiologists' labeling burden. To tackle this challenge, the literature on object detection has witnessed an increase in weakly-supervised and semi-supervised approaches, yet it still lacks a unified framework that leverages various forms of fully-labeled, weakly-labeled, and unlabeled data. In this paper, we present a novel omni-supervised object detection network, ORF-Netv2, to leverage as much available supervision as possible. Specifically, a multi-branch omni-supervised detection head is introduced, with each branch trained with a specific type of supervision. A co-training-based dynamic label assignment strategy is then proposed to enable flexible and robust learning from the weakly-labeled and unlabeled data. Extensive evaluation was conducted for the proposed framework with three rib fracture datasets on both chest CT and X-ray. By leveraging all forms of supervision, ORF-Netv2 achieves mAPs of 34.7, 44.7, and 19.4 on the three datasets, respectively, surpassing the baseline detector which uses only box annotations by mAP gains of 3.8, 4.8, and 5.0, respectively. Furthermore, ORF-Netv2 consistently outperforms other competitive label-efficient methods over various scenarios, showing a promising framework for label-efficient fracture detection. The code is available at: https://github.com/zhizhongchai/ORF-Net.
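
The per-supervision routing can be pictured with the toy PyTorch snippet below: each sample contributes a loss only through the branch matching its annotation level. The linear heads and constant targets are placeholders, not the ORF-Netv2 detection heads or its dynamic label assignment.

import torch
import torch.nn as nn
import torch.nn.functional as F

feat = torch.randn(6, 256)                    # pooled features of 6 samples
sup_type = torch.tensor([0, 0, 1, 1, 2, 2])   # 0=fully, 1=weakly, 2=unlabeled
branches = nn.ModuleList([nn.Linear(256, 1) for _ in range(3)])

loss = 0.0
for b, head in enumerate(branches):
    mask = sup_type == b                      # route samples to their branch
    if mask.any():
        logits = head(feat[mask])
        target = torch.ones_like(logits)      # placeholder per-branch target
        loss = loss + F.binary_cross_entropy_with_logits(logits, target)
loss.backward()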

AU Lin, Weiyuan Gao, Zhifan Liu, Hui Zhang, Heye

A Deformable Constraint Transport Network for Optimal Aortic Segmentation From CT Images

Aortic segmentation from computed tomography (CT) is crucial for facilitating aortic intervention, as it enables clinicians to visualize aortic anatomy for diagnosis and measurement. However, aortic segmentation faces the challenge of variable geometry in space, owing to the geometric diversity of different diseases and the geometric transformations that occur between raw and measured images. Existing constraint-based methods can potentially solve the challenge, but they are hindered by two key issues: inaccurate definition of properties and inappropriate topology of transformation in space. In this paper, we propose a deformable constraint transport network (DCTN). The DCTN adaptively extracts aortic features to define intra-image constrained properties and guides topological implementation in space to constrain inter-image geometric transformation between raw and curved planar reformation (CPR) images. The DCTN contains a deformable attention extractor, a geometry-aware decoder and an optimal transport guider. The extractor generates variable patches that preserve semantic integrity and long-range dependency in long-sequence images. The decoder enhances the perception of geometric texture and semantic features, particularly for low-intensity aortic coarctation and false lumen, which removes background interference. The guider explores the geometric discrepancies between raw and CPR images, constructs probability distributions of discrepancies, and matches them with inter-image transformation to guide geometric topology in space. Experimental studies on 267 aortic subjects and four public datasets show the superiority of our DCTN over 23 methods. The results demonstrate DCTN's advantages in aortic segmentation for different types of aortic disease, for different aortic segments, and in the measurement of clinical indexes.

AU Liu, Jinhua Desrosiers, Christian Yu, Dexin Zhou, Yuanfeng

Semi-Supervised Medical Image Segmentation Using Cross-Style Consistency With Shape-Aware and Local Context Constraints

Despite the remarkable progress in semi-supervised medical image segmentation methods based on deep learning, their application to real-life clinical scenarios still faces considerable challenges. For example, insufficient labeled data often makes it difficult for networks to capture the complexity and variability of the anatomical regions to be segmented. To address these problems, we design a new semi-supervised segmentation framework that aspires to produce anatomically plausible predictions. Our framework comprises two parallel networks: shape-agnostic and shape-aware networks. These networks learn from each other, enabling effective utilization of unlabeled data. Our shape-aware network implicitly introduces shape guidance to capture shape fine-grained information. Meanwhile, shape-agnostic networks employ uncertainty estimation to further obtain reliable pseudo-labels for the counterpart. We also employ a cross-style consistency strategy to enhance the network's utilization of unlabeled data. It enriches the dataset to prevent overfitting and further eases the coupling of the two networks that learn from each other. Our proposed architecture also incorporates a novel loss term that facilitates the learning of the local context of segmentation by the network, thereby enhancing the overall accuracy of prediction. Experiments on three different datasets of medical images show that our method outperforms many strong semi-supervised segmentation methods, particularly in perceiving shape. The code can be seen at https://github.com/igip-liu/SLC-Net.

C1 Shandong Univ, Sch Software, Jinan 250101, Peoples R China C1 Ecole Technol Super ETS, Software & IT Dept, Montreal, PQ H3C 1K3, Canada C1 Shandong Univ, Qilu Hosp, Jinan 250012, Peoples R China SN 0278-0062 EI 1558-254X DA 2024-07-02 UT WOS:001196733400012 PM 38032771 ER

AU Stevens, Tristan S W Meral, Faik C Yu, Jason Apostolakis, Iason Z Robert, Jean-Luc Van Sloun, Ruud J G

Dehazing Ultrasound using Diffusion Models.

Echocardiography has been a prominent tool for the diagnosis of cardiac disease. However, these diagnoses can be heavily impeded by poor image quality. Acoustic clutter emerges due to multipath reflections imposed by layers of skin, subcutaneous fat, and intercostal muscle between the transducer and heart. As a result, haze and other noise artifacts pose a real challenge to cardiac ultrasound imaging. In many cases, especially with difficult-to-image patients such as patients with obesity, a diagnosis from B-Mode ultrasound imaging is effectively rendered unusable, forcing sonographers to resort to contrast-enhanced ultrasound examinations or refer patients to other imaging modalities. Tissue harmonic imaging has been a popular approach to combat haze, but in severe cases is still heavily impacted by haze. Alternatively, denoising algorithms are typically unable to remove highly structured and correlated noise, such as haze. It remains a challenge to accurately describe the statistical properties of structured haze, and develop an inference method to subsequently remove it. Diffusion models have emerged as powerful generative models and have shown their effectiveness in a variety of inverse problems. In this work, we present a joint posterior sampling framework that combines two separate diffusion models to model the distribution of both clean ultrasound and haze in an unsupervised manner. Furthermore, we demonstrate techniques for effectively training diffusion models on radio-frequency ultrasound data and highlight the advantages over image data. Experiments on both in-vitro and in-vivo cardiac datasets show that the proposed dehazing method effectively removes haze while preserving signals from weakly reflected tissue.

AU Hu, Wentao Cheng, Lianglun Huang, Guoheng Yuan, Xiaochen Zhong, Guo Pun, Chi-Man Zhou, Jian Cai, Muyan

Learning From Incorrectness: Active Learning With Negative Pre-Training and Curriculum Querying for Histological Tissue Classification

Patch-level histological tissue classification is an effective pre-processing method for histological slide analysis. However, the classification of tissue with deep learning requires expensive annotation costs. To alleviate the limitations of annotation budgets, the application of active learning (AL) to histological tissue classification is a promising solution. Nevertheless, there is a large imbalance in performance between categories during application, and the tissues corresponding to the underperforming categories are equally important for cancer diagnosis. In this paper, we propose an active learning framework called ICAL, which contains Incorrectness Negative Pre-training (INP) and Category-wise Curriculum Querying (CCQ) to address the above problem from the inter-category perspective and from the perspective of the categories themselves, respectively. In particular, INP incorporates the unique mechanism of active learning to treat the incorrect prediction results obtained from CCQ as complementary labels for negative pre-training, in order to better distinguish similar categories during the training process. CCQ adjusts the query weights based on the learning status of each category in the model trained by INP, and utilizes uncertainty to evaluate and compensate for query bias caused by inadequate category performance. Experimental results on two histological tissue classification datasets demonstrate that ICAL achieves performance approaching that of fully supervised learning with less than 16% of the labeled data. In comparison to the state-of-the-art active learning algorithms, ICAL achieved better and more balanced performance in all categories and maintained robustness with extremely low annotation budgets. The source code will be released at https://github.com/LactorHwt/ICAL.

AU van Harten, Louis D. Stoker, Jaap Isgum, Ivana

Robust Deformable Image Registration Using Cycle-Consistent Implicit Representations

Recent works in medical image registration have proposed the use of Implicit Neural Representations, demonstrating performance that rivals state-of-the-art learning-based methods. However, these implicit representations need to be optimized for each new image pair, which is a stochastic process that may fail to converge to a global minimum. To improve robustness, we propose a deformable registration method using pairs of cycle-consistent Implicit Neural Representations: each implicit representation is linked to a second implicit representation that estimates the opposite transformation, causing each network to act as a regularizer for its paired opposite. During inference, we generate multiple deformation estimates by numerically inverting the paired backward transformation and evaluating the consensus of the optimized pair. This consensus improves registration accuracy over using a single representation and results in a robust uncertainty metric that can be used for automatic quality control. We evaluate our method with a 4D lung CT dataset. The proposed cycle-consistent optimization method reduces the optimization failure rate from 2.4% to 0.0% compared to the current state-of-the-art. The proposed inference method improves landmark accuracy by 4.5% and the proposed uncertainty metric detects all instances where the registration method fails to converge to a correct solution. We verify the generalizability of these results to other data using a centerline propagation task in abdominal 4D MRI, where our method achieves a 46% improvement in propagation consistency compared with single-INR registration and demonstrates a strong correlation between the proposed uncertainty metric and registration accuracy.
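
A minimal sketch of the cycle-consistency constraint, assuming two coordinate MLPs that predict forward and backward displacement fields: composing the two transformations should return each sampled point to its origin. The network sizes are arbitrary and the image-similarity term is omitted.

import torch
import torch.nn as nn

def mlp(in_dim=3, hidden=64, out_dim=3):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim))

fwd, bwd = mlp(), mlp()          # forward / backward displacement fields
opt = torch.optim.Adam(list(fwd.parameters()) + list(bwd.parameters()), lr=1e-4)

x = torch.rand(1024, 3)          # sampled coordinates in [0, 1]^3
u = fwd(x)                       # forward displacement at x
y = x + u                        # mapped position in the other image
v = bwd(y)                       # backward displacement at y
cycle_loss = (u + v).pow(2).sum(-1).mean()   # y + v should land back on x
# total loss = image similarity + lambda * cycle_loss (similarity omitted here)
cycle_loss.backward()
opt.step()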

AU Li, Wei Liu, Guang-Hai Fan, Haoyi Li, Zuoyong Zhang, David

Self-Supervised Multi-Scale Cropping and Simple Masked Attentive Predicting for Lung CT-Scan Anomaly Detection

Anomaly detection has been widely explored by training an out-of-distribution detector with only normal data for medical images. However, detecting local and subtle irregularities without prior knowledge of anomaly types brings challenges for lung CT-scan image anomaly detection. In this paper, we propose a self-supervised framework for learning representations of lung CT-scan images via both multi-scale cropping and simple masked attentive predicting, which is capable of constructing a powerful out-of-distribution detector. Firstly, we propose CropMixPaste, a self-supervised augmentation task for generating density shadow-like anomalies that encourage the model to detect local irregularities of lung CT-scan images. Then, we propose a self-supervised reconstruction block, named simple masked attentive predicting block (SMAPB), to better refine local features by predicting masked context information. Finally, the learned representations by self-supervised tasks are used to build an out-of-distribution detector. The results on real lung CT-scan datasets demonstrate the effectiveness and superiority of our proposed method compared with state-of-the-art methods.
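
The flavor of such an augmentation can be shown with the NumPy sketch below, which cuts a patch at a random scale and alpha-blends it at another location to create a local, density-shadow-like irregularity. The scales and blending weight are illustrative guesses, not the published CropMixPaste recipe.

import numpy as np

def crop_mix_paste(img, rng, scales=(0.1, 0.2, 0.3), alpha=0.5):
    # Cut a square patch at a random scale and blend it elsewhere.
    out = img.copy()
    h, w = img.shape
    s = int(min(h, w) * rng.choice(scales))
    y1, x1 = rng.integers(0, h - s), rng.integers(0, w - s)   # source corner
    y2, x2 = rng.integers(0, h - s), rng.integers(0, w - s)   # target corner
    patch = img[y1:y1 + s, x1:x1 + s]
    out[y2:y2 + s, x2:x2 + s] = (1 - alpha) * out[y2:y2 + s, x2:x2 + s] + alpha * patch
    return out

rng = np.random.default_rng(0)
ct_slice = rng.random((128, 128)).astype(np.float32)   # stand-in CT slice
augmented = crop_mix_paste(ct_slice, rng)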

AU Caudoux, Manon Demeulenaere, Oscar Poree, Jonathan Sauvage, Jack Mateo, Philippe Ghaleh, Bijan Flesch, Martin Ferin, Guillaume Tanter, Mickael Deffieux, Thomas Papadacci, Clement Pernot, Mathieu

Curved Toroidal Row Column Addressed Transducer for 3D Ultrafast Ultrasound Imaging

3D Imaging of the human heart at high frame rate is of major interest for various clinical applications. Electronic complexity and cost has prevented the dissemination of 3D ultrafast imaging into the clinic. Row column addressed (RCA) transducers provide volumetric imaging at ultrafast frame rate by using a low electronic channel count, but current models are ill-suited for transthoracic cardiac imaging due to field-of-view limitations. In this study, we proposed a mechanically curved RCA with an aperture adapted for transthoracic cardiac imaging ($24 \times 16$ mm$^2$). The RCA has a toroidal curved surface of 96 elements along columns (curvature radius $r_C = 4.47$ cm) and 64 elements along rows (curvature radius $r_R = 3$ cm). We implemented delay-and-sum beamforming with an analytical calculation of the propagation of a toroidal wave, which was validated using simulations (Field II). The imaging performance was evaluated on a calibrated phantom. Experimental 3D imaging was achieved up to 12 cm deep with a total angular aperture of 30 degrees for both lateral dimensions. The Contrast-to-Noise ratio increased by 12 dB when going from 2 to 128 virtual sources. Then, 3D Ultrasound Localization Microscopy (ULM) was characterized in a tube of sub-wavelength diameter. Finally, 3D ULM was demonstrated on a perfused ex-vivo swine heart to image the coronary microcirculation.
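
For intuition, a toy delay-and-sum step in NumPy is sketched below: for one pixel, a transmit delay from an analytical wavefront model is added to per-element receive delays, and the indexed RF samples are summed coherently. The linear element layout, plane-wave-like transmit delay, and random RF data are stand-ins, not the paper's toroidal-wave model.

import numpy as np

c = 1540.0                       # speed of sound in tissue (m/s)
fs = 40e6                        # sampling frequency (Hz)
n_el, n_t = 64, 4096
rng = np.random.default_rng(0)
rf = rng.standard_normal((n_el, n_t))          # stand-in RF channel data
elems = np.stack([np.linspace(-0.015, 0.015, n_el),
                  np.zeros(n_el)], axis=1)     # element positions (m)

def das_pixel(p, t_tx):
    # Transmit delay t_tx plus per-element receive delay, then coherent sum.
    t_rx = np.linalg.norm(elems - p, axis=1) / c
    idx = np.round((t_tx + t_rx) * fs).astype(int)
    valid = idx < n_t
    return rf[np.arange(n_el)[valid], idx[valid]].sum()

pixel = np.array([0.0, 0.03])                  # point 3 cm deep
value = das_pixel(pixel, t_tx=pixel[1] / c)    # plane-wave-like transmit delay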

AU Shaker, Abdelrahman Maaz, Muhammad Rasheed, Hanoona Khan, Salman Yang, Ming-Hsuan Khan, Fahad Shahbaz

UNETR++: Delving Into Efficient and Accurate 3D Medical Image Segmentation

Owing to the success of transformer models, recent works study their applicability in 3D medical segmentation tasks. Within the transformer models, the self-attention mechanism is one of the main building blocks that strives to capture long-range dependencies, compared to the local convolutional-based design. However, the self-attention operation has quadratic complexity, which proves to be a computational bottleneck, especially in volumetric medical imaging, where the inputs are 3D with numerous slices. In this paper, we propose a 3D medical image segmentation approach, named UNETR++, that offers both high-quality segmentation masks as well as efficiency in terms of parameters, compute cost, and inference speed. The core of our design is the introduction of a novel efficient paired attention (EPA) block that efficiently learns spatial and channel-wise discriminative features using a pair of inter-dependent branches based on spatial and channel attention. Our spatial attention formulation is efficient and has linear complexity with respect to the input. To enable communication between spatial and channel-focused branches, we share the weights of query and key mapping functions that provide a complementary benefit (paired attention), while also reducing the complexity. Our extensive evaluations on five benchmarks, Synapse, BTCV, ACDC, BraTS, and Decathlon-Lung, reveal the effectiveness of our contributions in terms of both efficiency and accuracy. On Synapse, our UNETR++ sets a new state-of-the-art with a Dice Score of 87.2%, while significantly reducing parameters and FLOPs by over 71%, compared to the best method in the literature. Our code and models are available at: https://tinyurl.com/2p87x5xn.
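
The shared query/key idea can be sketched as below: one pair of Q/K projections feeds both a channel branch (a C x C affinity, linear in the token count) and a spatial branch (kept as a dense N x N map here for clarity, unlike the paper's linear-complexity formulation). This is a simplified reading, not the exact EPA block.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PairedAttention(nn.Module):
    # Spatial and channel attention sharing the same Q/K projections.
    def __init__(self, dim):
        super().__init__()
        self.q, self.k = nn.Linear(dim, dim), nn.Linear(dim, dim)  # shared
        self.v_spa, self.v_chn = nn.Linear(dim, dim), nn.Linear(dim, dim)

    def forward(self, x):              # x: B x N x C
        q, k = self.q(x), self.k(x)
        chn = F.softmax(q.transpose(1, 2) @ k / q.shape[1] ** 0.5, dim=-1)   # C x C
        out_chn = self.v_chn(x) @ chn
        spa = F.softmax(q @ k.transpose(1, 2) / q.shape[-1] ** 0.5, dim=-1)  # N x N
        out_spa = spa @ self.v_spa(x)
        return out_spa + out_chn

y = PairedAttention(64)(torch.randn(2, 100, 64))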

AU Gao, Jun Lao, Qicheng Kang, Qingbo Liu, Paul Du, Chenlin Li, Kang Zhang, Le

Boosting Your Context by Dual Similarity Checkup for In-Context Learning Medical Image Segmentation.

The recent advent of in-context learning (ICL) capabilities in large pre-trained models has yielded significant advancements in the generalization of segmentation models. By supplying domain-specific image-mask pairs, the ICL model can be effectively guided to produce optimal segmentation outcomes, eliminating the necessity for model fine-tuning or interactive prompting. However, current existing ICL-based segmentation models exhibit significant limitations when applied to medical segmentation datasets with substantial diversity. To address this issue, we propose a dual similarity checkup approach to guarantee the effectiveness of selected in-context samples so that their guidance can be maximally leveraged during inference. We first employ large pre-trained vision models for extracting strong semantic representations from input images and constructing a feature embedding memory bank for semantic similarity checkup during inference. Assuring the similarity in the input semantic space, we then minimize the discrepancy in the mask appearance distribution between the support set and the estimated mask appearance prior through similarity-weighted sampling and augmentation. We validate our proposed dual similarity checkup approach on eight publicly available medical segmentation datasets, and extensive experimental results demonstrate that our proposed method significantly improves the performance metrics of existing ICL-based segmentation models, particularly when applied to medical image datasets characterized by substantial diversity.
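
At its core, the semantic similarity checkup amounts to nearest-neighbor retrieval in an embedding space, as in the PyTorch sketch below; the bank size, feature dimension, and top-k choice are placeholders.

import torch
import torch.nn.functional as F

def select_context(query_feat, bank_feats, k=4):
    # Pick the k support embeddings most similar to the query embedding.
    q = F.normalize(query_feat, dim=-1)          # D
    b = F.normalize(bank_feats, dim=-1)          # M x D
    sims = b @ q                                 # cosine similarities, M
    return sims.topk(k).indices                  # indices into the bank

bank = torch.randn(500, 768)       # embeddings of candidate support images
query = torch.randn(768)           # embedding of the test image
idx = select_context(query, bank)  # in-context pairs to feed the ICL model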

AU Zhang, Yikun Hu, Dianlin Li, Wangyao Zhang, Weijie Chen, Gaoyu Chen, Ronald C Chen, Yang Gao, Hao

2V-CBCT: Two-Orthogonal-Projection based CBCT Reconstruction and Dose Calculation for Radiation Therapy using Real Projection Data.

This work demonstrates the feasibility of two-orthogonal-projection-based CBCT (2V-CBCT) reconstruction and dose calculation for radiation therapy (RT) using real projection data, which, to the best of our knowledge, is the first 2V-CBCT feasibility study with real projection data. RT treatments are often delivered in multiple fractions, for which on-board CBCT is desirable to calculate the delivered dose per fraction for the purpose of RT delivery quality assurance and adaptive RT. However, not all RT treatments/fractions have CBCT acquired, but two orthogonal projections are always available. The question to be addressed in this work is the feasibility of 2V-CBCT for the purpose of RT dose calculation. 2V-CBCT is a severely ill-posed inverse problem for which we propose a coarse-to-fine learning strategy. First, a 3D deep neural network that can extract and exploit the inter-slice and intra-slice information is adopted to predict the initial 3D volumes. Then, a 2D deep neural network is utilized to fine-tune the initial 3D volumes slice-by-slice. During the fine-tuning stage, a perceptual loss based on multi-frequency features is employed to enhance the image reconstruction. Dose calculation results from both photon and proton RT demonstrate that 2V-CBCT provides comparable accuracy with full-view CBCT based on real projection data.

AU Hou, Qingshan Wang, Yaqi Cao, Peng Cheng, Shuai Lan, Linqi Yang, Jinzhu Liu, Xiaoli Zaiane, Osmar R.

A Collaborative Self-Supervised Domain Adaptation for Low-Quality Medical Image Enhancement

Medical image analysis techniques have been employed in diagnosing and screening clinical diseases. However, both poor medical image quality and illumination style inconsistency increase uncertainty in clinical decision-making, potentially resulting in clinician misdiagnosis. The majority of current image enhancement methods primarily concentrate on enhancing medical image quality by leveraging high-quality reference images, which are challenging to collect in clinical applications. In this study, we address image quality enhancement within a fully self-supervised learning setting, wherein neither high-quality images nor paired images are required. To achieve this goal, we investigate the potential of self-supervised learning combined with domain adaptation to enhance the quality of medical images without the guidance of high-quality medical images. We design a Domain Adaptation Self-supervised Quality Enhancement framework, called DASQE. More specifically, we establish multiple domains at the patch level through a designed rule-based quality assessment scheme and style clustering. To achieve image quality enhancement and maintain style consistency, we formulate the image quality enhancement as a collaborative self-supervised domain adaptation task for disentangling the low-quality factors, medical image content, and illumination style characteristics by exploring intrinsic supervision in the low-quality medical images. Finally, we perform extensive experiments on six benchmark datasets of medical images, and the experimental results demonstrate that DASQE attains state-of-the-art performance. Furthermore, we explore the impact of the proposed method on various clinical tasks, such as retinal fundus vessel/lesion segmentation, nerve fiber segmentation, polyp segmentation, skin lesion segmentation, and disease classification. The results demonstrate that DASQE is advantageous for diverse downstream image analysis tasks.

AU Li, Lei Camps, Julia Wang, Zhinuo (Jenny) Beetz, Marcel Banerjee, Abhirup Rodriguez, Blanca Grau, Vicente

Toward Enabling Cardiac Digital Twins of Myocardial Infarction Using Deep Computational Models for Inverse Inference

Cardiac digital twins (CDTs) have the potential to offer individualized evaluation of cardiac function in a non-invasive manner, making them a promising approach for personalized diagnosis and treatment planning of myocardial infarction (MI). The inference of accurate myocardial tissue properties is crucial in creating a reliable CDT of MI. In this work, we investigate the feasibility of inferring myocardial tissue properties from the electrocardiogram (ECG) within a CDT platform. The platform integrates multi-modal data, such as cardiac MRI and ECG, to enhance the accuracy and reliability of the inferred tissue properties. We perform a sensitivity analysis based on computer simulations, systematically exploring the effects of infarct location, size, degree of transmurality, and electrical activity alteration on the simulated QRS complex of ECG, to establish the limits of the approach. We subsequently present a novel deep computational model, comprising a dual-branch variational autoencoder and an inference model, to infer infarct location and distribution from the simulated QRS. The proposed model achieves mean Dice scores of $0.457 \pm 0.317$ and $0.302 \pm 0.273$ for the inference of left ventricle scars and border zone, respectively. The sensitivity analysis enhances our understanding of the complex relationship between infarct characteristics and electrophysiological features. The in silico experimental results show that the model can effectively capture the relationship for the inverse inference, with promising potential for clinical application in the future. The code is available at https://github.com/lileitech/MI_inverse_inference.
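
For reference, the reported Dice score is the standard overlap measure between a predicted and a ground-truth mask; a minimal NumPy version on toy binary masks:

import numpy as np

def dice(pred, gt, eps=1e-8):
    # Dice overlap: 2|A ∩ B| / (|A| + |B|).
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)

pred = np.zeros((64, 64), bool); pred[10:40, 10:40] = True
gt = np.zeros((64, 64), bool); gt[15:45, 15:45] = True
print(round(dice(pred, gt), 3))   # 0.694 for this toy overlap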

AU Schmidt, Adam Mohareri, Omid DiMaio, Simon P. Salcudean, Septimiu E.

Surgical Tattoos in Infrared: A Dataset for Quantifying Tissue Tracking and Mapping

Quantifying performance of methods for tracking and mapping tissue in endoscopic environments is essential for enabling image guidance and automation of medical interventions and surgery. Datasets developed so far either use rigid environments, visible markers, or require annotators to label salient points in videos after collection. These are respectively: not general, visible to algorithms, or costly and error-prone. We introduce a novel labeling methodology along with a dataset that uses said methodology, Surgical Tattoos in Infrared (STIR). STIR has labels that are persistent but invisible to visible spectrum algorithms. This is done by labelling tissue points with IR-fluorescent dye, indocyanine green (ICG), and then collecting visible light video clips. STIR comprises hundreds of stereo video clips in both in vivo and ex vivo scenes with start and end points labelled in the IR spectrum. With over 3,000 labelled points, STIR will help to quantify and enable better analysis of tracking and mapping methods. After introducing STIR, we analyze multiple different frame-based tracking methods on STIR using both 3D and 2D endpoint error and accuracy metrics. STIR is available at https://dx.doi.org/10.21227/w8g4-g548
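
The endpoint error used for benchmarking is, in essence, the mean Euclidean distance between tracked and labeled points; a minimal NumPy version with made-up coordinates:

import numpy as np

def endpoint_error(pred_pts, gt_pts):
    # Mean distance between N x D arrays of predicted and labeled points
    # (D = 2 for pixel coordinates, D = 3 for metric coordinates).
    return np.linalg.norm(pred_pts - gt_pts, axis=1).mean()

gt = np.array([[10.0, 12.0], [40.0, 8.0]])
pred = np.array([[11.0, 12.5], [38.5, 9.0]])
print(endpoint_error(pred, gt))   # average 2D endpoint error in pixels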

AU Shen, Chengkang Zhu, Hao Zhou, You Liu, Yu Yi, Si Dong, Lili Zhao, Weipeng Brady, David J Cao, Xun Ma, Zhan Lin, Yi

Continuous 3D Myocardial Motion Tracking via Echocardiography.

Myocardial motion tracking stands as an essential clinical tool in the prevention and detection of cardiovascular diseases (CVDs), the foremost cause of death globally. However, current techniques suffer from incomplete and inaccurate motion estimation of the myocardium in both spatial and temporal dimensions, hindering the early identification of myocardial dysfunction. To address these challenges, this paper introduces the Neural Cardiac Motion Field (NeuralCMF). NeuralCMF leverages implicit neural representation (INR) to model the 3D structure and the comprehensive 6D forward/backward motion of the heart. This method surpasses pixel-wise limitations by offering the capability to continuously query the precise shape and motion of the myocardium at any specific point throughout the cardiac cycle, enhancing the detailed analysis of cardiac dynamics beyond traditional speckle tracking. Notably, NeuralCMF operates without the need for paired datasets, and its optimization is self-supervised through the physics knowledge priors in both space and time dimensions, ensuring compatibility with both 2D and 3D echocardiogram video inputs. Experimental validations across three representative datasets support the robustness and innovative nature of the NeuralCMF, marking significant advantages over existing state-of-the-art methods in cardiac imaging and motion tracking. Code is available at: https://njuvision.github.io/NeuralCMF.

AU Yang, Wenhui Gao, Shuo Zhang, Hao Yu, Hong Xu, Menglei Chong, Puimun Zhang, Weijie Wang, Hong Zhang, Wenjuan Qian, Airong

PtbNet: Based on Local Few-Shot Classes and Small Objects to Accurately Detect PTB.

Pulmonary Tuberculosis (PTB) is one of the world's most infectious illnesses, and its early detection is critical for preventing PTB. Digital Radiography (DR) has been the most common and effective technique to examine PTB. However, due to the variety and weak specificity of phenotypes on DR chest X-ray (DCR), it is difficult for radiologists to make reliable diagnoses. Although artificial intelligence technology has made considerable gains in assisting the diagnosis of PTB, it lacks methods to identify PTB lesions with few-shot classes and small objects. To solve these problems, geometric data augmentation was used to enlarge the DCR dataset. For this purpose, a diffusion probability model was implemented for six few-shot classes. Importantly, we propose a new multi-lesion detector, PtbNet, based on RetinaNet, which was constructed to detect small objects of PTB lesions. The results showed that by two data augmentations, the number of DCRs increased by 80% from 570 to 2,859. In the pre-evaluation experiments with the baseline, RetinaNet, the AP improved by 9.9 for the six few-shot classes. Our extensive empirical evaluation showed that the AP of PtbNet achieved 28.2, outperforming the other 9 state-of-the-art methods. In the ablation study, combined with BiFPN+ and PSPD-Conv, the AP increased by 2.1, APs increased by 5.0, and APm and APl grew by an average of 9.8. In summary, PtbNet not only improves the detection of small-object lesions but also enhances the ability to detect different types of PTB uniformly, which helps physicians diagnose PTB lesions accurately. The code is available at https://github.com/Wenhui-person/PtbNet/tree/master.

EI 1558-254X DA 2024-06-29 UT MEDLINE:38923480 PM 38923480 ER

AU Ching-Roa, Vincent D. Huang, Chi Z. Giacomelli, Michael G.

Suppression of Subpixel Jitter in Resonant Scanning Systems With Phase-locked Sampling

Resonant scanning is critical to high speed and in vivo imaging in many applications of laser scanning microscopy. However, resonant scanning suffers from well-known image artifacts due to scanner jitter, limiting adoption of high-speed imaging technologies. Here, we introduce a real-time, inexpensive and all electrical method to suppress jitter more than an order of magnitude below the diffraction limit that can be applied to most existing microscope systems with no software changes. By phase-locking imaging to the resonant scanner period, we demonstrate an 86% reduction in pixel jitter, a 15% improvement in point spread function with resonant scanning and show that this approach enables two widely used models of resonant scanners to achieve comparable accuracy to galvanometer scanners running two orders of magnitude slower. Finally, we demonstrate the versatility of this method by retrofitting a commercial two photon microscope and show that this approach enables significant quantitative and qualitative improvements in biological imaging.

AU Wang, Pengyu Zhang, Huaqi Zhu, Meilu Jiang, Xi Qin, Jing Yuan, Yixuan

MGIML: Cancer Grading With Incomplete Radiology-Pathology Data via Memory Learning and Gradient Homogenization

Taking advantage of multi-modal radiology-pathology data with complementary clinical information for cancer grading is helpful for doctors to improve diagnosis efficiency and accuracy. However, radiology and pathology data have distinct acquisition difficulties and costs, which leads to incomplete-modality data being common in applications. In this work, we propose a Memory- and Gradient-guided Incomplete Multi-modal Learning (MGIML) framework for cancer grading with incomplete radiology-pathology data. Firstly, to remedy missing-modality information, we propose a Memory-driven Hetero-modality Complement (MH-Complete) scheme, which constructs modal-specific memory banks constrained by a coarse-grained memory boosting (CMB) loss to record generic radiology and pathology feature patterns, and develops a cross-modal memory reading strategy enhanced by a fine-grained memory consistency (FMC) loss to take missing-modality information from well-stored memories. Secondly, as gradient conflicts exist between missing-modality situations, we propose a Rotation-driven Gradient Homogenization (RG-Homogenize) scheme, which estimates instance-specific rotation matrices to smoothly change the feature-level gradient directions, and computes confidence-guided homogenization weights to dynamically balance gradient magnitudes. By simultaneously mitigating gradient direction and magnitude conflicts, this scheme well avoids the negative transfer and optimization imbalance problems. Extensive experiments on CPTAC-UCEC and CPTAC-PDA datasets show that the proposed MGIML framework performs favorably against state-of-the-art multi-modal methods on missing-modality situations.
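
The paper resolves conflicts with learned rotation matrices and confidence-guided magnitude weights; as a simpler stand-in for the direction part, the PCGrad-style projection below drops the conflicting component whenever two task gradients have a negative inner product. It illustrates the general idea of gradient-conflict mitigation only, not RG-Homogenize itself.

import torch

def deconflict(g1, g2):
    # If the gradients conflict (negative inner product), project each
    # onto the normal plane of the other to remove the conflicting part.
    if torch.dot(g1, g2) < 0:
        g1 = g1 - torch.dot(g1, g2) / g2.norm().pow(2) * g2
        g2 = g2 - torch.dot(g2, g1) / g1.norm().pow(2) * g1
    return g1, g2

g_missing_radiology = torch.tensor([1.0, -0.5])   # gradient, radiology missing
g_missing_pathology = torch.tensor([-0.8, 1.0])   # gradient, pathology missing
print(deconflict(g_missing_radiology, g_missing_pathology))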

AU Huang, Wei Zhang, Lei Wang, Zizhou Wang, Lituan

Exploring Inherent Consistency for Semi-supervised Anatomical Structure Segmentation in Medical Imaging.

Due to the exorbitant expense of obtaining labeled data in the field of medical image analysis, semi-supervised learning has emerged as a favorable method for the segmentation of anatomical structures. Although semi-supervised learning techniques have shown great potential in this field, existing methods only utilize image-level spatial consistency to impose unsupervised regularization on data in label space. Considering that anatomical structures often possess inherent anatomical properties that have not been focused on in previous works, this study introduces the inherent consistency into semi-supervised anatomical structure segmentation. First, the prediction and the ground-truth are projected into an embedding space to obtain latent representations that encapsulate the inherent anatomical properties of the structures. Then, two inherent consistency constraints are designed to leverage these inherent properties by aligning these latent representations. The proposed method is plug-and-play and can be seamlessly integrated with existing methods, thereby collaborating to improve segmentation performance and enhance the anatomical plausibility of the results. To evaluate the effectiveness of the proposed method, experiments are conducted on three public datasets (ACDC, LA, and Pancreas). Extensive experimental results demonstrate that the proposed method exhibits good generalizability and outperforms several state-of-the-art methods.

AU Fu, Wenli Hu, Huijun Li, Xinyue Guo, Rui Chen, Tao Qian, Xiaohua

A Generalizable Causal-Invariance-Driven Segmentation Model for Peripancreatic Vessels.

Segmenting peripancreatic vessels in CT, including the superior mesenteric artery (SMA), the coeliac artery (CA), and the partial portal venous system (PPVS), is crucial for preoperative resectability analysis in pancreatic cancer. However, the clinical applicability of vessel segmentation methods is impeded by the low generalizability on multi-center data, mainly attributed to the wide variations in image appearance, namely the spurious correlation factor. Therefore, we propose a causal-invariance-driven generalizable segmentation model for peripancreatic vessels. It incorporates interventions at both image and feature levels to guide the model to capture causal information by enforcing consistency across datasets, thus enhancing the generalization performance. Specifically, firstly, a contrast-driven image intervention strategy is proposed to construct image-level interventions by generating images with various contrast-related appearances and seeking invariant causal features. Secondly, the feature intervention strategy is designed, where various patterns of feature bias across different centers are simulated to pursue invariant prediction. The proposed model achieved high DSC scores (79.69%, 82.62%, and 83.10%) for the three vessels on a cross-validation set containing 134 cases. Its generalizability was further confirmed on three independent test sets of 233 cases. Overall, the proposed method provides an accurate and generalizable segmentation model for peripancreatic vessels and offers a promising paradigm for increasing the generalizability of segmentation models from a causality perspective. Our source codes will be released at https://github.com/SJTUBME-QianLab/PC_VesselSeg.
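
A minimal sketch of a contrast-driven image-level intervention: two gamma-perturbed views of the same patch are fed through a model, and predictions are penalized for changing with appearance. The one-layer stand-in model, gamma range, and MSE consistency term are assumptions for illustration, not the paper's construction.

import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Conv2d(1, 2, 3, padding=1)            # stand-in segmentation head

def random_contrast(x):
    # Gamma-style contrast perturbation as an image-level intervention.
    gamma = torch.empty(1).uniform_(0.5, 2.0).item()
    return x.clamp(0, 1).pow(gamma)

img = torch.rand(4, 1, 96, 96)                   # stand-in CT patches
p_a = model(random_contrast(img)).softmax(1)     # prediction on view A
p_b = model(random_contrast(img)).softmax(1)     # prediction on view B
invariance_loss = F.mse_loss(p_a, p_b)           # penalize appearance-driven change
invariance_loss.backward()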

AU Wang, Jinhong Xu, Zhe Zheng, Wenhao Ying, Haochao Chen, Tingting Liu, Zuozhu Chen, Danny Z. Yao, Ke Wu, Jian

A Transformer-Based Knowledge Distillation Network for Cortical Cataract Grading

Cortical cataract, a common type of cataract, is particularly difficult to be diagnosed automatically due to the complex features of the lesions. Recently, many methods based on edge detection or deep learning were proposed for automatic cataract grading. However, these methods suffer a large performance drop in cortical cataract grading due to the more complex cortical opacities and uncertain data. In this paper, we propose a novel Transformer-based Knowledge Distillation Network, called TKD-Net, for cortical cataract grading. To tackle the complex opacity problem, we first devise a zone decomposition strategy to extract more refined features and introduce special sub-scores to consider critical factors of clinical cortical opacity assessment (location, area, density) for comprehensive quantification. Next, we develop a multi-modal mix-attention Transformer to efficiently fuse sub-scores and image modality for complex feature learning. However, obtaining the sub-score modality is a challenge in the clinic, which could cause the modality missing problem instead. To simultaneously alleviate the issues of modality missing and uncertain data, we further design a Transformer-based knowledge distillation method, which uses a teacher model with perfect data to guide a student model with modality-missing and uncertain data. We conduct extensive experiments on a dataset of commonly-used slit-lamp images annotated by the LOCS III grading system to demonstrate that our TKD-Net outperforms state-of-the-art methods, as well as the effectiveness of its key components.

AU Vu, Tri Klippel, Paul Canning, Aidan J. Ma, Chenshuo Zhang, Huijuan Kasatkina, Ludmila A. Tang, Yuqi Xia, Jun Verkhusha, Vladislav V. Tuan Vo-Dinh Jing, Yun Yao, Junjie

On the Importance of Low-Frequency Signals in Functional and Molecular Photoacoustic Computed Tomography

In photoacoustic computed tomography (PACT) with short-pulsed laser excitation, wideband acoustic signals are generated in biological tissues with frequencies related to the effective shapes and sizes of the optically absorbing targets. Low-frequency photoacoustic signal components correspond to slowly varying spatial features and are often omitted during imaging due to the limited detection bandwidth of the ultrasound transducer, or during image reconstruction as undesired background that degrades image contrast. Here we demonstrate that low-frequency photoacoustic signals, in fact, contain functional and molecular information, and can be used to enhance structural visibility, improve quantitative accuracy, and reduce sparse-sampling artifacts. We provide an in-depth theoretical analysis of low-frequency signals in PACT, and experimentally evaluate their impact on several representative PACT applications, such as mapping temperature in photothermal treatment, measuring blood oxygenation in a hypoxia challenge, and detecting photoswitchable molecular probes in deep organs. Our results strongly suggest that low-frequency signals are important for functional and molecular PACT.

AU Xie, Jiaming Zhang, Qing Cui, Zhiming Ma, Chong Zhou, Yan Wang, Wenping Shen, Dinggang

Integrating Eye Tracking with Grouped Fusion Networks for Semantic Segmentation on Mammogram Images.

Medical image segmentation has seen great progress in recent years, largely due to the development of deep neural networks. However, unlike in computer vision, high-quality clinical data is relatively scarce, and the annotation process is often a burden for clinicians. As a result, the scarcity of medical data limits the performance of existing medical image segmentation models. In this paper, we propose a novel framework that integrates eye tracking information from experienced radiologists during the screening process to improve the performance of deep neural networks with limited data. Our approach, a grouped hierarchical network, guides the network to learn from its faults by using gaze information as weak supervision. We demonstrate the effectiveness of our framework on mammogram images, particularly for handling segmentation classes with large scale differences. We evaluate the impact of gaze information on medical image segmentation tasks and show that our method achieves better segmentation performance compared to state-of-the-art models. A robustness study is conducted to investigate the influence of distraction or inaccuracies in gaze collection. We also develop a convenient system for collecting gaze data without interrupting the normal clinical workflow. Our work offers novel insights into the potential benefits of integrating gaze information into medical image segmentation tasks.

AU Gao, Bin Yu, Aiju Qiao, Chen Calhoun, Vince D Stephen, Julia M Wilson, Tony W Wang, Yu-Ping

An Explainable Unified Framework of Spatio-Temporal Coupling Learning with Application to Dynamic Brain Functional Connectivity Analysis.

Time-series data such as fMRI and MEG carry a wealth of inherent spatio-temporal coupling relationships, and their modeling via deep learning is essential for uncovering biological mechanisms. However, current machine learning models for mining spatio-temporal information usually overlook this intrinsic coupling association and also suffer from poor explainability. In this paper, we present an explainable learning framework for spatio-temporal coupling. Specifically, this framework constructs a deep learning network based on spatio-temporal correlation, which can well integrate the time-varying coupled relationships between node representation and inter-node connectivity. Furthermore, it explores spatio-temporal evolution at each time step, providing a better explainability of the analysis results. Finally, we apply the proposed framework to brain dynamic functional connectivity (dFC) analysis. Experimental results demonstrate that it can effectively capture the variations in dFC during brain development and the evolution of spatio-temporal information at the resting state. Two distinct developmental functional connectivity (FC) patterns are identified. Specifically, the connectivity among regions related to emotional regulation decreases, while the connectivity associated with cognitive activities increases. In addition, children and young adults display notable cyclic fluctuations in resting-state brain dFC.

AU Huang, Wenhao Gong, Haifan Zhang, Huan Wang, Yu Wan, Xiang Li, Guanbin Li, Haofeng Shen, Hong

BCNet: Bronchus Classification via Structure Guided Representation Learning.

CT-based bronchial tree analysis is a key step for the diagnosis of lung and airway diseases. However, the topology of bronchial trees varies across individuals, which presents a challenge to automatic bronchus classification. To solve this issue, we propose the Bronchus Classification Network (BCNet), a structure-guided framework that exploits the segment-level topological information using point clouds to learn the voxel-level features. BCNet has two branches, a Point-Voxel Graph Neural Network (PV-GNN) for segment classification, and a Convolutional Neural Network (CNN) for voxel labeling. The two branches are simultaneously trained to learn topology-aware features for their shared backbone, while it is feasible to run only the CNN branch for the inference. Therefore, BCNet maintains the same inference efficiency as its CNN baseline. Experimental results show that BCNet significantly exceeds state-of-the-art methods by over 8.0% in F1-score for bronchus classification. Furthermore, we contribute BronAtlas: an open-access benchmark of bronchus imaging analysis with high-quality voxel-wise annotations of both anatomical and abnormal bronchial segments. The benchmark is available at link1.

AU Caravaca, Javier Bobba, Kondapa Naidu Du, Shixian Peter, Robin Gullberg, Grant T. Bidkar, Anil P. Flavell, Robert R. Seo, Youngho

A Technique to Quantify Very Low Activities in Regions of Interest With a Collimatorless Detector

We present a new method to measure sub-microcurie activities of photon-emitting radionuclides in organs and lesions of small animals in vivo. Our technique, named the collimator-less likelihood fit, combines a very high sensitivity collimatorless detector with a Monte Carlo-based likelihood fit in order to estimate the activities in previously segmented regions of interest along with their uncertainties. This is done directly from the photon projections in our collimatorless detector and from the region-of-interest segmentation provided by an x-ray computed tomography scan. We have extensively validated our approach with Ac-225 experimentally in spherical phantoms and mouse phantoms, and also numerically with simulations of a realistic mouse anatomy. Our method yields statistically unbiased results with uncertainties smaller than 20% for activities as low as ~111 Bq (3 nCi) and for exposures under 30 minutes. We demonstrate that our method yields more robust recovery coefficients than SPECT imaging with a commercial pre-clinical scanner, especially at very low activities. Thus, our technique is complementary to traditional SPECT/CT imaging since it provides more accurate and precise organ and tumor dosimetry, albeit with more limited spatial information. Finally, our technique is especially significant in extremely low-activity scenarios when SPECT/CT imaging is simply not viable.
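The Monte Carlo detector model is the heart of the method and cannot be reproduced here, but the estimation step it feeds, maximizing a Poisson likelihood over ROI activities given a system response matrix, can be sketched with scipy (the response matrix H below is random stand-in data, not a physical model):

```python
import numpy as np
from scipy.optimize import minimize

def fit_roi_activities(counts, H):
    """Maximum-likelihood ROI activities from photon counts.

    counts: (M,) detector-bin counts; H: (M, K) expected counts per unit
    activity of each of K regions of interest (from a detector model).
    """
    def neg_log_lik(a):
        lam = H @ a + 1e-12                        # expected counts per bin
        return np.sum(lam - counts * np.log(lam))  # Poisson NLL (up to const.)

    a0 = np.full(H.shape[1], counts.sum() / H.sum())   # crude initial guess
    res = minimize(neg_log_lik, a0, bounds=[(0, None)] * H.shape[1])
    return res.x

# toy usage with a random stand-in response matrix
H = np.random.rand(64, 3) * 100.0
true_a = np.array([5.0, 1.0, 0.2])
counts = np.random.poisson(H @ true_a)
print(fit_roi_activities(counts, H))   # should be close to true_a
```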

AU Cui, Hengfei Li, Yan Wang, Yifan Xu, Di Wu, Lian-Ming Xia, Yong

Toward Accurate Cardiac MRI Segmentation With Variational Autoencoder-Based Unsupervised Domain Adaptation

Accurate myocardial segmentation is crucial in the diagnosis and treatment of myocardial infarction (MI), especially in Late Gadolinium Enhancement (LGE) cardiac magnetic resonance (CMR) images, where the infarcted myocardium exhibits a greater brightness. However, segmentation annotations for LGE images are usually not available. Although knowledge gained from CMR images of other modalities with ample annotations, such as balanced-Steady State Free Precession (bSSFP), can be transferred to the LGE images, the difference in image distribution between the two modalities (i.e., domain shift) usually results in a significant degradation in model performance. To alleviate this, an end-to-end Variational autoencoder based feature Alignment Module Combining Explicit and Implicit features (VAMCEI) is proposed. We first re-derive the Kullback-Leibler (KL) divergence between the posterior distributions of the two domains as a measure of the global distribution distance. Second, we calculate the prototype contrastive loss between the two domains, bringing closer the prototypes of the same category across domains and pushing away the prototypes of different categories within or across domains. Finally, a domain discriminator is added to the output space, which indirectly aligns the feature distribution and forces the extracted features to be more favorable for segmentation. In addition, by combining CycleGAN and VAMCEI, we propose a more refined multi-stage unsupervised domain adaptation (UDA) framework for myocardial structure segmentation. We conduct extensive experiments on the MSCMRSeg 2019, MyoPS 2020 and MM-WHS 2017 datasets. The experimental results demonstrate that our framework achieves superior performances than state-of-the-art methods.
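The full alignment module is in the paper; its first ingredient, a KL divergence between the two domains' Gaussian VAE posteriors, has a standard closed form for diagonal covariances. A minimal numpy sketch (shapes and toy values are illustrative):

```python
import numpy as np

def kl_diag_gaussians(mu1, logvar1, mu2, logvar2):
    """KL( N(mu1, diag(exp(logvar1))) || N(mu2, diag(exp(logvar2))) ),
    summed over latent dimensions and averaged over the batch."""
    v1, v2 = np.exp(logvar1), np.exp(logvar2)
    kl = 0.5 * (logvar2 - logvar1 + (v1 + (mu1 - mu2) ** 2) / v2 - 1.0)
    return kl.sum(axis=-1).mean()

# toy usage: batch of 8 latent codes, 16 dims, one per domain
mu_s, lv_s = np.zeros((8, 16)), np.zeros((8, 16))
mu_t, lv_t = np.full((8, 16), 0.5), np.zeros((8, 16))
print(kl_diag_gaussians(mu_s, lv_s, mu_t, lv_t))  # 0.5 * 0.25 * 16 = 2.0
```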

AU Wang, Zihao Yang, Yingyu Chen, Yuzhou Yuan, Tingting Sermesant, Maxime Delingette, Herve Wu, Ona

Mutual Information Guided Diffusion for Zero-Shot Cross-Modality Medical Image Translation

Cross-modality data translation has attracted great interest in medical image computing. Deep generative models show performance improvement in addressing related challenges. Nevertheless, as a fundamental challenge in image translation, the problem of zero-shot learning cross-modality image translation with fidelity remains unanswered. To bridge this gap, we propose a novel unsupervised zero-shot learning method called the Mutual Information guided Diffusion Model (MIDiffusion), which learns to translate an unseen source image to the target modality by leveraging the inherent statistical consistency of Mutual Information between different modalities. To overcome the prohibitively high-dimensional Mutual Information calculation, we propose a differentiable local-wise mutual information layer for conditioning the iterative denoising process. This layer captures identical cross-modality features in the statistical domain, offering diffusion guidance without relying on direct mappings between the source and target domains. This advantage allows our method to adapt to changing source domains without the need for retraining, making it highly practical when sufficient labeled source domain data is not available. We demonstrate the superior performance of MIDiffusion in zero-shot cross-modality translation tasks through empirical comparisons with other generative models, including adversarial-based and diffusion-based models. Finally, we showcase the real-world application of MIDiffusion in 3D zero-shot learning-based cross-modality image segmentation tasks.

AU Yang, Kun Li, Qiang Xu, Jiahong Tang, Meng-Xing Wang, Zhibiao Tsui, Po-Hsiang Zhou, Xiaowei

Frequency-Domain Robust PCA for Real-Time Monitoring of HIFU Treatment

High intensity focused ultrasound (HIFU) is a thriving non-invasive technique for thermal ablation of tumors, but significant challenges remain in its real-time monitoring with medical imaging. Ultrasound imaging is one of the main imaging modalities for monitoring HIFU surgery in organs other than the brain, mainly due to its good temporal resolution. However, strong acoustic interference from HIFU irradiation severely obscures the B-mode images and compromises the monitoring. To address this problem, we propose a frequency-domain robust principal component analysis (FRPCA) method to separate the HIFU interference from the contaminated B-mode images. Ex-vivo and in-vivo experiments were conducted to validate the proposed method on a clinical HIFU therapy system combined with an ultrasound imaging platform. The performance of the FRPCA method was compared with the conventional notch filtering method. Results demonstrated that the FRPCA method can effectively remove HIFU interference from the B-mode images, which allowed HIFU-induced grayscale changes at the focal region to be recovered. Compared to notch-filtered images, the FRPCA-processed images showed an 8.9% improvement in the structural similarity (SSIM) index relative to the uncontaminated B-mode images. These findings demonstrate that the FRPCA method presents an effective signal processing framework to remove strong HIFU acoustic interference, obtains better dynamic visualization in monitoring the HIFU irradiation process, and offers great potential to improve the efficacy and safety of HIFU treatment and other focused ultrasound-related applications.
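The exact FRPCA formulation is in the paper; the robust-PCA core it relies on, splitting a matrix of (here, frequency-domain) frames into a low-rank component and a sparse interference component via singular-value and soft thresholding, can be sketched with a basic principal-component-pursuit iteration (penalty heuristics follow common RPCA practice, not necessarily the paper's choices):

```python
import numpy as np

def soft(x, t):                          # element-wise soft threshold
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def rpca(M, lam=None, mu=None, iters=100):
    """Split M into low-rank L plus sparse S (inexact-ALM sketch)."""
    m, n = M.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))
    mu = mu or 0.25 * m * n / (np.abs(M).sum() + 1e-12)
    L, S, Y = np.zeros_like(M), np.zeros_like(M), np.zeros_like(M)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * soft(s, 1.0 / mu)) @ Vt       # singular-value thresholding
        S = soft(M - L + Y / mu, lam / mu)     # sparse interference update
        Y += mu * (M - L - S)                  # dual ascent on the residual
    return L, S

# toy data: each column is the FFT magnitude of one B-mode frame
frames = np.random.rand(256, 40)
L, S = rpca(frames)
```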

AU Mahapatra, Dwarikanath Yepes, Antonio Jimeno Bozorgtabar, Behzad Roy, Sudipta Ge, Zongyuan Reyes, Mauricio

Multi-Label Generalized Zero Shot Chest Xray Classification By Combining Image-Text Information With Feature Disentanglement.

In fully supervised learning-based medical image classification, the robustness of a trained model is influenced by its exposure to the range of candidate disease classes. Generalized Zero Shot Learning (GZSL) aims to correctly predict seen and novel unseen classes. Current GZSL approaches have focused mostly on the single-label case. However, it is common for chest X-rays to be labelled with multiple disease classes. We propose a novel multi-modal multi-label GZSL approach that leverages feature disentanglement and multi-modal information to synthesize features of unseen classes. Disease labels are processed through a pre-trained BioBert model to obtain text embeddings that are used to create a dictionary encoding similarity among different labels. We then use disentangled features and graph aggregation to learn a second dictionary of inter-label similarities. A subsequent clustering step helps to identify representative vectors for each class. The multi-modal multi-label dictionaries and the class representative vectors are used to guide the feature synthesis step, which is the most important component of our pipeline, for generating realistic multi-label disease samples of seen and unseen classes. Our method is benchmarked against multiple competing methods and we outperform all of them based on experiments conducted on the publicly available NIH and CheXpert chest X-ray datasets.
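The BioBert encoder is not reproduced here; the step it feeds, a dictionary encoding pairwise label similarity from text embeddings, reduces to a cosine-similarity matrix. A sketch with random placeholder embeddings standing in for BioBert outputs:

```python
import numpy as np

def label_similarity_dictionary(emb):
    """Cosine-similarity matrix over disease-label embeddings.

    emb: (C, D) array, one text embedding per disease label
    (in the paper these come from a pre-trained BioBert model).
    """
    norm = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)
    return norm @ norm.T   # (C, C); entry [i, j] = similarity of labels i, j

emb = np.random.randn(14, 768)   # e.g. 14 chest X-ray labels, 768-dim features
D = label_similarity_dictionary(emb)
print(D.shape, D[0, 0])          # (14, 14), 1.0 on the diagonal
```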

AU Yin, Yi Clark, Alys R. Collins, Sally L.

3D Single Vessel Fractional Moving Blood Volume (3D-svFMBV): Fully Automated Tissue Perfusion Estimation Using Ultrasound

Power Doppler ultrasound (PD-US) is the ideal modality to assess tissue perfusion as it is cheap, patient-friendly and does not require ionizing radiation. However, meaningful inter-patient comparison only occurs if differences in tissue-attenuation are corrected for. This can be done by standardizing the PD-US signal to a blood vessel assumed to have 100% vascularity. The original method to do this is called fractional moving blood volume (FMBV). We describe a novel, fully-automated method combining image processing, numerical modelling, and deep learning to estimate three-dimensional single vessel fractional moving blood volume (3D-svFMBV). We map the PD signals to a characteristic intensity profile within a single large vessel to define the standardization value at the high shear vessel margins. This removes the need for mathematical correction for background signal which can introduce error. The 3D-svFMBV was first tested on synthetic images generated using the characteristics of uterine artery and physiological ultrasound noise levels, demonstrating prediction of standardization value close to the theoretical ideal. Clinical utility was explored using 143 first-trimester placental ultrasound volumes. More biologically plausible perfusion estimates were obtained, showing improved prediction of pre-eclampsia compared with those generated with the semi-automated original 3D-FMBV technique. The proposed 3D-svFMBV method overcomes the limitations of the original technique to provide accurate and robust placental perfusion estimation. This not only has the potential to provide an early pregnancy screening tool but may also be used to assess perfusion of different organs and tumors.
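The full 3D-svFMBV pipeline (vessel profile modelling, deep learning) is beyond a snippet, but the FMBV standardization idea, scaling power Doppler intensities by a value taken to represent 100% vascularity, is simple to state. A deliberately simplified numpy sketch in which the standardization value is supplied by the caller rather than derived from the vessel profile:

```python
import numpy as np

def fmbv(pd_volume, std_value):
    """Fractional moving blood volume estimate for a power Doppler ROI.

    pd_volume: 3D array of PD intensities inside a tissue ROI.
    std_value: standardization intensity assumed to correspond to 100%
               vascularity (in 3D-svFMBV it is derived from the
               characteristic profile at the margins of a large vessel).
    """
    frac = np.clip(pd_volume / std_value, 0.0, 1.0)  # per-voxel vascularity
    return float(frac.mean())                        # mean fractional volume

vol = np.random.rand(32, 32, 32) * 80.0
print(fmbv(vol, std_value=100.0))   # ~0.4 for this toy volume
```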

AU Wang, Pengyu Zhang, Huaqi Yuan, Yixuan

MCPL: Multi-modal Collaborative Prompt Learning for Medical Vision-Language Model.

Multi-modal prompt learning is a high-performance and cost-effective learning paradigm, which learns text as well as image prompts to tune pre-trained vision-language (V-L) models like CLIP for adaptation to multiple downstream tasks. However, recent methods typically treat text and image prompts as independent components without considering the dependency between prompts. Moreover, extending multi-modal prompt learning into the medical field poses challenges due to a significant gap between general- and medical-domain data. To this end, we propose a Multi-modal Collaborative Prompt Learning (MCPL) pipeline to tune a frozen V-L model for aligning medical text-image representations, thereby achieving medical downstream tasks. We first construct the anatomy-pathology (AP) prompt for multi-modal prompting jointly with text and image prompts. The AP prompt introduces instance-level anatomy and pathology information, thereby making a V-L model better comprehend medical reports and images. Next, we propose the graph-guided prompt collaboration module (GPCM), which explicitly establishes multi-way couplings between the AP, text, and image prompts, enabling collaborative multi-modal prompt producing and updating for more effective prompting. Finally, we develop a novel prompt configuration scheme, which attaches the AP prompt to the query and key, and the text/image prompt to the value in self-attention layers for improving the interpretability of multi-modal prompts. Extensive experiments on numerous medical classification and object detection datasets show that the proposed pipeline achieves excellent effectiveness and generalization. Compared with state-of-the-art prompt learning methods, MCPL provides a more reliable multi-modal prompt paradigm for reducing tuning costs of V-L models on medical downstream tasks. Our code: https://github.com/CUHK-AIM-Group/MCPL.

AU Mei, Xin Yang, Libin Gao, Denghong Cai, Xiaoyan Han, Junwei Liu, Tianming

PhraseAug: An Augmented Medical Report Generation Model with Phrasebook.

Medical report generation is a valuable and challenging task, which automatically generates accurate and fluent diagnostic reports for medical images, reducing workload of radiologists and improving efficiency of disease diagnosis. Fine-grained alignment of medical images and reports facilitates the exploration of close correlations between images and texts, which is crucial for cross-modal generation. However, visual and linguistic biases caused by radiologists' writing styles make cross-modal image-text alignment difficult. To alleviate visual-linguistic bias, this paper discretizes medical reports and introduces an intermediate modality, i.e. phrasebook, consisting of key noun phrases. As discretized representation of medical reports, phrasebook contains both disease-related medical terms, and synonymous phrases representing different writing styles which can identify synonymous sentences, thereby promoting fine-grained alignment between images and reports. In this paper, an augmented two-stage medical report generation model with phrasebook (PhraseAug) is developed, which combines medical images, clinical histories and writing styles to generate diagnostic reports. In the first stage, phrasebook is used to extract semantically relevant important features and predict key phrases contained in the report. In the second stage, medical reports are generated according to the predicted key phrases which contain synonymous phrases, promoting our model to adapt to different writing styles and generating diverse medical reports. Experimental results on two public datasets, IU-Xray and MIMIC-CXR, demonstrate that our proposed PhraseAug outperforms state-of-the-art baselines.

AU Wang, Enpeng Liu, Yueang Tu, Puxun Taylor, Zeike A Chen, Xiaojun

Video-based Soft Tissue Deformation Tracking for Laparoscopic Augmented Reality-based Navigation in Kidney Surgery.

Minimally invasive surgery (MIS) remains technically demanding due to the difficulty of tracking hidden critical structures within the moving anatomy of the patient. In this study, we propose a soft tissue deformation tracking augmented reality (AR) navigation pipeline for laparoscopic surgery of the kidneys. The proposed navigation pipeline addresses two main sub-problems: the initial registration and deformation tracking. Our method utilizes preoperative MR or CT data and binocular laparoscopes without any additional interventional hardware. The initial registration is resolved through a probabilistic rigid registration algorithm and elastic compensation based on dense point cloud reconstruction. For deformation tracking, the sparse feature point displacement vector field continuously provides temporal boundary conditions for the biomechanical model. To enhance the accuracy of the displacement vector field, a novel feature point selection strategy based on deep learning is proposed. Moreover, an ex-vivo experimental method for internal structure error assessment is presented. The ex-vivo experiments indicate an external surface reprojection error of 4.07 ± 2.17 mm and a maximum mean absolute error for internal structures of 2.98 mm. In-vivo experiments indicate mean absolute errors of 3.28 ± 0.40 mm and 1.90 ± 0.24 mm, respectively. The combined qualitative and quantitative findings indicate the potential of our AR-assisted navigation system in improving the clinical application of laparoscopic kidney surgery.

AU Bi, Xia-An Yang, Zicheng Huang, Yangjun Chen, Ke Xing, Zhaoxu Xu, Luyun Wu, Zihao Liu, Zhengliang Li, Xiang Liu, Tianming

CE-GAN: Community Evolutionary Generative Adversarial Network for Alzheimer's Disease Risk Prediction.

In the studies of neurodegenerative diseases such as Alzheimer's Disease (AD), researchers often focus on the associations among multi-omics pathogeny based on imaging genetics data. However, current studies overlook the communities in brain networks, leading to inaccurate models of disease development. This paper explores the developmental patterns of AD from the perspective of community evolution. We first establish a mathematical model to describe functional degeneration in the brain as the community evolution driven by entropy information propagation. Next, we propose an interpretable Community Evolutionary Generative Adversarial Network (CE-GAN) to predict disease risk. In the generator of CE-GAN, community evolutionary convolutions are designed to capture the evolutionary patterns of AD. The experiments are conducted using functional magnetic resonance imaging (fMRI) data and single nucleotide polymorphism (SNP) data. CE-GAN achieves 91.67% accuracy and 91.83% area under curve (AUC) in AD risk prediction tasks, surpassing advanced methods on the same dataset. In addition, we validated the effectiveness of CE-GAN for pathogeny extraction. The source code of this work is available at https://github.com/fmri123456/CE-GAN.

AU Kang, Eunsong Heo, Da-Woon Lee, Jiwon Suk, Heung-, II

A Learnable Counter-Condition Analysis Framework for Functional Connectivity-Based Neurological Disorder Diagnosis

To understand the biological characteristics of neurological disorders with functional connectivity (FC), recent studies have widely utilized deep learning-based models to identify the disease and conducted post-hoc analyses via explainable models to discover disease-related biomarkers. Most existing frameworks consist of three stages, namely, feature selection, feature extraction for classification, and analysis, where each stage is implemented separately. However, if the results at each stage lack reliability, it can cause misdiagnosis and incorrect analysis in afterward stages. In this study, we propose a novel unified framework that systemically integrates diagnoses (i.e., feature selection and feature extraction) and explanations. Notably, we devised an adaptive attention network as a feature selection approach to identify individual-specific disease-related connections. We also propose a functional network relational encoder that summarizes the global topological properties of FC by learning the inter-network relations without pre-defined edges between functional networks. Last but not least, our framework provides a novel explanatory power for neuroscientific interpretation, also termed counter-condition analysis. We simulated the FC that reverses the diagnostic information (i.e., counter-condition FC): converting a normal brain to be abnormal and vice versa. We validated the effectiveness of our framework by using two large resting-state functional magnetic resonance imaging (fMRI) datasets, Autism Brain Imaging Data Exchange (ABIDE) and REST-meta-MDD, and demonstrated that our framework outperforms other competing methods for disease identification. Furthermore, we analyzed the disease-related neurological patterns based on counter-condition analysis.

AU Gao, Cong Feng, Anqi Liu, Xingtong Taylor, Russell H. Armand, Mehran Unberath, Mathias

A Fully Differentiable Framework for 2D/3D Registration and the Projective Spatial Transformers

Image-based 2D/3D registration is a critical technique for fluoroscopic guided surgical interventions. Conventional intensity-based 2D/3D registration approaches suffer from a limited capture range due to the presence of local minima in hand-crafted image similarity functions. In this work, we aim to extend the 2D/3D registration capture range with a fully differentiable deep network framework that learns to approximate a convex-shape similarity function. The network uses a novel Projective Spatial Transformer (ProST) module that has unique differentiability with respect to 3D pose parameters, and is trained using an innovative double backward gradient-driven loss function. We compare the most popular learning-based pose regression methods in the literature and use the well-established CMAES intensity-based registration as a benchmark. We report registration pose error, target registration error (TRE) and success rate (SR) with a threshold of 10mm for mean TRE. For the pelvis anatomy, the median TRE of ProST followed by CMAES is 4.4mm with a SR of 65.6% in simulation, and 2.2mm with a SR of 73.2% in real data. The CMAES SRs without using ProST registration are 28.5% and 36.0% in simulation and real data, respectively. Our results suggest that the proposed ProST network learns a practical similarity function, which vastly extends the capture range of conventional intensity-based 2D/3D registration. We believe that the unique differentiable property of ProST has the potential to benefit related 3D medical imaging research applications. The source code is available at https://github.com/gaocong13/Projective-Spatial-Transformers.
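The evaluation metrics reported above are straightforward to reproduce. A small sketch of mean target registration error and success rate for landmark sets under estimated versus ground-truth rigid poses (4x4 homogeneous transforms assumed; the 10 mm threshold mirrors the protocol described above, and the function names are ours):

```python
import numpy as np

def tre(points, T_est, T_gt):
    """Mean target registration error for (N, 3) landmark points."""
    hom = np.c_[points, np.ones(len(points))]        # homogeneous coordinates
    p_est = (T_est @ hom.T).T[:, :3]
    p_gt = (T_gt @ hom.T).T[:, :3]
    return np.linalg.norm(p_est - p_gt, axis=1).mean()

def success_rate(tres, thresh=10.0):
    """Fraction of cases whose mean TRE falls below a threshold (mm)."""
    return float(np.mean(np.asarray(tres) < thresh))

# toy usage: identity vs a small translation offset
T_gt, T_est = np.eye(4), np.eye(4)
T_est[:3, 3] = [1.0, 2.0, 2.0]
print(tre(np.random.rand(50, 3) * 100, T_est, T_gt))  # 3.0 (= |[1, 2, 2]|)
```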

AU Sun, Kaicong Wang, Qian Shen, Dinggang

Joint Cross-Attention Network With Deep Modality Prior for Fast MRI Reconstruction

Current deep learning-based reconstruction models for accelerated multi-coil magnetic resonance imaging (MRI) mainly focus on subsampled k-space data of single modality using convolutional neural network (CNN). Although dual-domain information and data consistency constraint are commonly adopted in fast MRI reconstruction, the performance of existing models is still limited mainly by three factors: inaccurate estimation of coil sensitivity, inadequate utilization of structural prior, and inductive bias of CNN. To tackle these challenges, we propose an unrolling-based joint Cross-Attention Network, dubbed as jCAN, using deep guidance of the already acquired intra-subject data. Particularly, to improve the performance of coil sensitivity estimation, we simultaneously optimize the latent MR image and sensitivity map (SM). Besides, we introduce Gating layer and Gaussian layer into SM estimation to alleviate the "defocus" and "over-coupling" effects and further ameliorate the SM estimation. To enhance the representation ability of the proposed model, we deploy Vision Transformer (ViT) and CNN in the image and k-space domains, respectively. Moreover, we exploit pre-acquired intra-subject scan as reference modality to guide the reconstruction of subsampled target modality by resorting to the self- and cross-attention scheme. Experimental results on public knee and in-house brain datasets demonstrate that the proposed jCAN outperforms the state-of-the-art methods by a large margin in terms of SSIM and PSNR for different acceleration factors and sampling masks.
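jCAN's architecture is beyond a snippet, but the data-consistency constraint mentioned above is a standard fast-MRI ingredient: re-insert the acquired k-space samples into the current reconstruction. A single-coil numpy sketch (the paper's multi-coil setting would additionally fold in the estimated sensitivity maps):

```python
import numpy as np

def data_consistency(img, k_acquired, mask):
    """Replace reconstructed k-space values with acquired samples.

    img: complex 2D image estimate; k_acquired: measured k-space
    (zero-filled); mask: boolean sampling pattern, True where acquired.
    """
    k = np.fft.fft2(img)
    k = np.where(mask, k_acquired, k)   # trust measurements where available
    return np.fft.ifft2(k)

# toy usage: 30% random sampling of a synthetic complex image
img = np.random.randn(128, 128) + 1j * np.random.randn(128, 128)
mask = np.random.rand(128, 128) < 0.3
k_meas = np.fft.fft2(img) * mask        # zero-filled "acquired" data
out = data_consistency(np.zeros_like(img), k_meas, mask)
```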

AU Chi, Jianning Sun, Zhiyi Meng, Liuyi Wang, Siqi Yu, Xiaosheng Wei, Xiaolin Yang, Bin

Low-dose CT image super-resolution with noise suppression based on prior degradation estimator and self-guidance mechanism.

The anatomies in low-dose computed tomography (LDCT) are usually distorted during the zooming-in observation process due to the small number of photon quanta. Super-resolution (SR) methods have been proposed to enhance the quality of LDCT images as post-processing approaches without increasing radiation damage to patients, but they suffer from incorrect prediction of degradation information and incomplete exploitation of internal connections within the 3D CT volume, resulting in an imbalance between noise removal and detail sharpening in the super-resolution results. In this paper, we propose a novel LDCT SR network where the degradation information self-parsed from the LDCT slice and the 3D anatomical information captured from the LDCT volume are integrated to guide the backbone network. The prior degradation estimator (PDE) is proposed following the contrastive learning strategy to estimate the degradation features in the LDCT images without paired low-normal dose CT images. The self-guidance fusion module (SGFM) is designed to capture anatomical features with internal 3D consistencies between the squashed images along the coronal, sagittal, and axial views of the CT volume. Finally, the features representing degradation and anatomical structures are integrated to recover the CT images with higher resolutions. We apply the proposed method to the 2016 NIH-AAPM Mayo Clinic LDCT Grand Challenge dataset and our collected LDCT dataset to evaluate its ability to recover LDCT images. Experimental results illustrate the superiority of our network concerning quantitative metrics and qualitative observations, demonstrating its potential in recovering detail-sharp and noise-free CT images with higher resolutions from the practical LDCT images.

EI 1558-254X DA 2024-09-06 UT MEDLINE:39231060 PM 39231060 ER

AU Yan, Renao Sun, Qiehe Jin, Cheng Liu, Yiqing He, Yonghong Guan, Tian Chen, Hao

Shapley Values-enabled Progressive Pseudo Bag Augmentation for Whole-Slide Image Classification.

In computational pathology, whole-slide image (WSI) classification presents a formidable challenge due to its gigapixel resolution and limited fine-grained annotations. Multiple-instance learning (MIL) offers a weakly supervised solution, yet refining instance-level information from bag-level labels remains challenging. While most of the conventional MIL methods use attention scores to estimate instance importance scores (IIS) which contribute to the prediction of the slide labels, these often lead to skewed attention distributions and inaccuracies in identifying crucial instances. To address these issues, we propose a new approach inspired by cooperative game theory: employing Shapley values to assess each instance's contribution, thereby improving IIS estimation. The computation of the Shapley value is then accelerated using attention, meanwhile retaining the enhanced instance identification and prioritization. We further introduce a framework for the progressive assignment of pseudo bags based on estimated IIS, encouraging more balanced attention distributions in MIL models. Our extensive experiments on CAMELYON-16, BRACS, TCGA-LUNG, and TCGA-BRCA datasets show our method's superiority over existing state-of-the-art approaches, offering enhanced interpretability and class-wise insights. We will release the code upon acceptance.
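The paper accelerates Shapley computation with attention; the plain Monte Carlo estimator it builds on, averaging each instance's marginal contribution to the bag-level score over random orderings, can be sketched as follows (the bag-level value function here is a caller-supplied stand-in for a trained MIL model):

```python
import numpy as np

def shapley_iis(instances, value_fn, n_perm=200, rng=None):
    """Monte Carlo Shapley estimate of instance importance scores.

    instances: (N, D) bag of instance features.
    value_fn: callable mapping a subset (K, D) -> scalar bag score
              (e.g. a trained MIL model's positive-class probability).
    """
    rng = rng or np.random.default_rng(0)
    N = len(instances)
    phi = np.zeros(N)
    for _ in range(n_perm):
        order = rng.permutation(N)
        prev = value_fn(instances[[]])           # empty-subset baseline
        for i, idx in enumerate(order):
            cur = value_fn(instances[order[:i + 1]])
            phi[idx] += cur - prev               # marginal contribution
            prev = cur
    return phi / n_perm

# toy value function: mean of the first feature, thresholded
v = lambda X: float(X[:, 0].mean() > 0) if len(X) else 0.0
print(shapley_iis(np.random.randn(12, 4), v, n_perm=50))
```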

AU Borazjani, Kasra Khosravan, Naji Ying, Leslie Hosseinalipour, Seyyedali

Multi-Modal Federated Learning for Cancer Staging over Non-IID Datasets with Unbalanced Modalities.

The use of machine learning (ML) for cancer staging through medical image analysis has gained substantial interest across medical disciplines. When accompanied by the innovative federated learning (FL) framework, ML techniques can further overcome privacy concerns related to patient data exposure. Given the frequent presence of diverse data modalities within patient records, leveraging FL in a multi-modal learning framework holds considerable promise for cancer staging. However, existing works on multi-modal FL often presume that all data-collecting institutions have access to all data modalities. This oversimplified approach neglects institutions that have access to only a portion of data modalities within the system. In this work, we introduce a novel FL architecture designed to accommodate not only the heterogeneity of data samples, but also the inherent heterogeneity/non-uniformity of data modalities across institutions. We shed light on the challenges associated with varying convergence speeds observed across different data modalities within our FL system. Subsequently, we propose a solution to tackle these challenges by devising a distributed gradient blending and proximity-aware client weighting strategy tailored for multi-modal FL. To show the superiority of our method, we conduct experiments using The Cancer Genome Atlas program (TCGA) datalake considering different cancer types and three modalities of data: mRNA sequences, histopathological image data, and clinical information. Our results further unveil the impact and severity of class-based vs type-based heterogeneity across institutions on the model performance, which widens the perspective to the notion of data heterogeneity in multi-modal FL literature.
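The proximity-aware weighting itself is specific to the paper, but the aggregation step it plugs into, a weighted average of client parameters in the FedAvg style, is a few lines. A sketch with placeholder weights standing in for the proximity-aware scores:

```python
import numpy as np

def aggregate(client_params, weights):
    """Weighted average of client parameter sets (FedAvg-style).

    client_params: list of dicts mapping layer name -> np.ndarray.
    weights: per-client scalars (e.g. data size or, in the paper,
             proximity-aware scores); normalized internally.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    keys = client_params[0].keys()
    return {k: sum(wi * p[k] for wi, p in zip(w, client_params)) for k in keys}

# toy usage: three clients sharing one layer
clients = [{"fc": np.random.randn(4, 4)} for _ in range(3)]
global_params = aggregate(clients, weights=[100, 50, 25])
print(global_params["fc"].shape)  # (4, 4)
```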

AU Feng, Yidan Deng, Sen Lyu, Jun Cai, Jing Wei, Mingqiang Qin, Jing

Bridging MRI Cross-Modality Synthesis and Multi-Contrast Super-Resolution by Fine-Grained Difference Learning.

In multi-modal magnetic resonance imaging (MRI), the tasks of imputing or reconstructing the target modality share a common obstacle: the accurate modeling of fine-grained inter-modal differences, which has received only sparse attention in the current literature. These differences stem from two sources: 1) spatial misalignment remaining after coarse registration and 2) structural distinction arising from modality-specific signal manifestations. This paper integrates the previously separate research trajectories of cross-modality synthesis (CMS) and multi-contrast super-resolution (MCSR) to address this pervasive challenge within a unified framework. Connected through generalized down-sampling ratios, this unification not only emphasizes their common goal of reducing structural differences, but also identifies the key task distinguishing MCSR from CMS: modeling the structural distinctions using the limited information from the misaligned target input. Specifically, we propose a composite network architecture with several key components: a label correction module to align the coordinates of multi-modal training pairs, a CMS module serving as the base model, an SR branch to handle target inputs, and a difference projection discriminator for structural distinction-centered adversarial training. When training the SR branch as the generator, the adversarial learning is enhanced with distinction-aware incremental modulation to ensure better-controlled generation. Moreover, the SR branch integrates deformable convolutions to address cross-modal spatial misalignment at the feature level. Experiments conducted on three public datasets demonstrate that our approach effectively balances structural accuracy and realism, exhibiting overall superiority in comprehensive evaluations for both tasks over current state-of-the-art approaches. The code is available at https://github.com/papshare/FGDL.

EI 1558-254X DA 2024-08-21 UT MEDLINE:39159018 PM 39159018 ER

AU Cheng, Ziming Wang, Shidong Xin, Tong Zhou, Tao Zhang, Haofeng Shao, Ling

Few-Shot Medical Image Segmentation via Generating Multiple Representative Descriptors

Automatic medical image segmentation has witnessed significant development with the success of large models on massive datasets. However, acquiring and annotating vast medical image datasets often proves to be impractical due to the time consumption, specialized expertise requirements, and compliance with patient privacy standards, etc. As a result, Few-shot Medical Image Segmentation (FSMIS) has become an increasingly compelling research direction. Conventional FSMIS methods usually learn prototypes from support images and apply nearest-neighbor searching to segment the query images. However, only a single prototype cannot well represent the distribution of each class, thus leading to restricted performance. To address this problem, we propose to Generate Multiple Representative Descriptors (GMRD), which can comprehensively represent the commonality within the corresponding class distribution. In addition, we design a Multiple Affinity Maps based Prediction (MAMP) module to fuse the multiple affinity maps generated by the aforementioned descriptors. Furthermore, to address intra-class variation and enhance the representativeness of descriptors, we introduce two novel losses. Notably, our model is structured as a dual-path design to achieve a balance between foreground and background differences in medical images. Extensive experiments on four publicly available medical image datasets demonstrate that our method outperforms the state-of-the-art methods, and the detailed analysis also verifies the effectiveness of our designed module.
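GMRD's descriptors are learned, but the matching rule they feed, labeling each query feature by its best-scoring foreground or background descriptor under cosine similarity, is easy to sketch (single-scale features and random toy descriptors assumed):

```python
import numpy as np

def segment_by_descriptors(query_feats, fg_desc, bg_desc):
    """Label each query feature by its closest representative descriptor.

    query_feats: (H*W, D) features; fg_desc/bg_desc: (Kf, D)/(Kb, D)
    descriptor sets for foreground and background.
    """
    def best_sim(F, P):
        F = F / (np.linalg.norm(F, axis=1, keepdims=True) + 1e-12)
        P = P / (np.linalg.norm(P, axis=1, keepdims=True) + 1e-12)
        return (F @ P.T).max(axis=1)        # best match among descriptors

    fg = best_sim(query_feats, fg_desc)
    bg = best_sim(query_feats, bg_desc)
    return (fg > bg).astype(np.uint8)       # 1 = foreground, 0 = background

feats = np.random.randn(64 * 64, 32)
mask = segment_by_descriptors(feats, np.random.randn(5, 32), np.random.randn(5, 32))
print(mask.reshape(64, 64).shape)
```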

AU Wang, Hongyu He, Jiang Cui, Hengfei Yuan, Bo Xia, Yong

Robust Stochastic Neural Ensemble Learning With Noisy Labels for Thoracic Disease Classification

Chest radiography is the most common radiology examination for thoracic disease diagnosis, such as pneumonia. A tremendous number of chest X-rays prompt data-driven deep learning models in constructing computer-aided diagnosis systems for thoracic diseases. However, in realistic radiology practice, a deep learning-based model often suffers from performance degradation when trained on data with noisy labels possibly caused by different types of annotation biases. To this end, we present a novel stochastic neural ensemble learning (SNEL) framework for robust thoracic disease diagnosis using chest X-rays. The core idea of our method is to learn from noisy labels by constructing model ensembles and designing noise-robust loss functions. Specifically, we propose a fast neural ensemble method that collects parameters simultaneously across model instances and along optimization trajectories. Moreover, we propose a loss function that both optimizes a robust measure and characterizes a diversity measure of ensembles. We evaluated our proposed SNEL method on three publicly available hospital-scale chest X-ray datasets. The experimental results indicate that our method outperforms competing methods and demonstrate the effectiveness and robustness of our method in learning from noisy labels. Our code is available at https://github.com/hywang01/SNEL.
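The ensemble construction along optimization trajectories resembles stochastic weight averaging: keep a running mean of parameter snapshots collected during training. A minimal sketch of that half of the idea (the cross-instance collection and the SNEL loss are not reproduced):

```python
import numpy as np

class TrajectoryAverage:
    """Running average of model weights sampled along training (SWA-like)."""

    def __init__(self):
        self.avg, self.n = None, 0

    def update(self, params):               # params: dict name -> np.ndarray
        self.n += 1
        if self.avg is None:
            self.avg = {k: v.copy() for k, v in params.items()}
        else:
            for k in self.avg:               # incremental mean update
                self.avg[k] += (params[k] - self.avg[k]) / self.n

# toy usage: average five pretend checkpoints of one layer
ens = TrajectoryAverage()
for step in range(5):
    ens.update({"w": np.random.randn(3, 3)})
print(ens.avg["w"].shape)  # (3, 3)
```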

AU Zhou, Quan Yu, Bin Xiao, Feng Ding, Mingyue Wang, Zhiwei Zhang, Xuming

Robust Semi-Supervised 3D Medical Image Segmentation With Diverse Joint-Task Learning and Decoupled Inter-Student Learning

Semi-supervised segmentation is highly significant in 3D medical image segmentation. The typical solutions adopt a teacher-student dual-model architecture, and they constrain the two models' decision consistency on the same segmentation task. However, the scarcity of medical samples can lower the diversity of tasks, reducing the effectiveness of the consistency constraint. The issue can further worsen as the weights of the models gradually become synchronized. In this work, we propose to construct diverse joint-tasks using masked image modelling for enhancing the reliability of the consistency constraint, and develop a novel architecture consisting of a single teacher but multiple students to enjoy the additional knowledge decoupled from the synchronized weights. Specifically, the teacher and student models 'see' varied randomly-masked versions of an input, and are trained to segment the same targets but reconstruct different missing regions concurrently. Such a joint task of segmentation and reconstruction lets the two learners capture related but complementary features to derive instructive knowledge when constraining their consistency. Moreover, two extra students join the original one to perform inter-student learning. The three students share the same encoding but different decoding designs, and learn decoupled knowledge by constraining their mutual consistencies, preventing themselves from suboptimally converging to the biased predictions of the dictatorial teacher. Experiments on four medical datasets show that our approach performs better than six mainstream semi-supervised methods. Particularly, our approach achieves at least 0.61% and 0.36% higher Dice and Jaccard values, respectively, than the most competitive approach on our in-house dataset. The code will be released at https://github.com/zxmboshi/DDL.

AU Ma, Yulan Cui, Weigang Liu, Jingyu Guo, Yuzhu Chen, Huiling Li, Yang

A Multi-Graph Cross-Attention-Based Region-Aware Feature Fusion Network Using Multi-Template for Brain Disorder Diagnosis

Functional connectivity (FC) networks based on resting-state functional magnetic imaging (rs-fMRI) are reliable and sensitive for brain disorder diagnosis. However, most existing methods are limited by using a single template, which may be insufficient to reveal complex brain connectivities. Furthermore, these methods usually neglect the complementary information between static and dynamic brain networks, and the functional divergence among different brain regions, leading to suboptimal diagnosis performance. To address these limitations, we propose a novel multi-graph cross-attention based region-aware feature fusion network (MGCA-RAFFNet) by using multi-template for brain disorder diagnosis. Specifically, we first employ multi-template to parcellate the brain space into different regions of interest (ROIs). Then, a multi-graph cross-attention network (MGCAN), including static and dynamic graph convolutions, is developed to explore the deep features contained in multi-template data, which can effectively analyze complex interaction patterns of brain networks for each template, and further adopt a dual-view cross-attention (DVCA) to acquire complementary information. Finally, to efficiently fuse multiple static-dynamic features, we design a region-aware feature fusion network (RAFFNet), which is beneficial to improve the feature discrimination by considering the underlying relations among static-dynamic features in different brain regions. Our proposed method is evaluated on both public ADNI-2 and ABIDE-I datasets for diagnosing mild cognitive impairment (MCI) and autism spectrum disorder (ASD). Extensive experiments demonstrate that the proposed method outperforms the state-of-the-art methods.

AU Li, Fangda Hu, Zhiqiang Chen, Wen Kak, Avinash

A Laplacian Pyramid Based Generative H&E Stain Augmentation Network

Hematoxylin and Eosin (H&E) staining is a widely used sample preparation procedure for enhancing the saturation of tissue sections and the contrast between nuclei and cytoplasm in histology images for medical diagnostics. However, various factors, such as the differences in the reagents used, result in high variability in the colors of the stains actually recorded. This variability poses a challenge in achieving generalization for machine-learning based computer-aided diagnostic tools. To desensitize the learned models to stain variations, we propose the Generative Stain Augmentation Network (G-SAN) - a GAN-based framework that augments a collection of cell images with simulated yet realistic stain variations. At its core, G-SAN uses a novel and highly computationally efficient Laplacian Pyramid (LP) based generator architecture, that is capable of disentangling stain from cell morphology. Through the task of patch classification and nucleus segmentation, we show that using G-SAN-augmented training data provides on average 15.7% improvement in F1 score and 7.3% improvement in panoptic quality, respectively. Our code is available at https://github.com/lifangda01/GSAN-Demo.
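G-SAN's generator is built on the Laplacian pyramid; the decomposition itself is classical and exactly invertible. A short OpenCV/numpy sketch of building and reconstructing such a pyramid (three levels chosen arbitrarily):

```python
import cv2
import numpy as np

def laplacian_pyramid(img, levels=3):
    """Decompose an image into Laplacian band-pass levels plus a residual."""
    pyr, cur = [], img.astype(np.float32)
    for _ in range(levels):
        down = cv2.pyrDown(cur)
        up = cv2.pyrUp(down, dstsize=(cur.shape[1], cur.shape[0]))
        pyr.append(cur - up)                 # high-frequency band at this scale
        cur = down
    pyr.append(cur)                          # low-frequency residual
    return pyr

def reconstruct(pyr):
    cur = pyr[-1]
    for band in reversed(pyr[:-1]):
        cur = cv2.pyrUp(cur, dstsize=(band.shape[1], band.shape[0])) + band
    return cur

img = np.random.rand(256, 256).astype(np.float32)
pyr = laplacian_pyramid(img)
print(np.abs(reconstruct(pyr) - img).max())  # ~0: decomposition is invertible
```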

AU Chen, Haobo Cai, Yehua Wang, Changyan Chen, Lin Zhang, Bo Han, Hong Guo, Yuqing Ding, Hong Zhang, Qi

Multi-Organ Foundation Model for Universal Ultrasound Image Segmentation with Task Prompt and Anatomical Prior.

Semantic segmentation of ultrasound (US) images with deep learning has played a crucial role in computer-aided disease screening, diagnosis and prognosis. However, due to the scarcity of US images and small field of view, resulting segmentation models are tailored for a specific single organ and may lack robustness, overlooking correlations among anatomical structures of multiple organs. To address these challenges, we propose the Multi-Organ FOundation (MOFO) model for universal US image segmentation. The MOFO is optimized jointly from multiple organs across various anatomical regions to overcome the data scarcity and explore correlations between multiple organs. The MOFO extracts organ-invariant representations from US images. Simultaneously, the task prompt is employed to refine organ-specific representations for segmentation predictions. Moreover, the anatomical prior is incorporated to enhance the consistency of the anatomical structures. A multi-organ US database, comprising 7039 images from 10 organs across various regions of the human body, has been established to evaluate our model. Results demonstrate that the MOFO outperforms single-organ methods in terms of the Dice coefficient, 95% Hausdorff distance and average symmetric surface distance with statistically sufficient margins. Our experiments in multi-organ universal segmentation for US images serve as a pioneering exploration of improving segmentation performance by leveraging semantic and anatomical relationships within US images of multiple organs.

AU Liu, Yuyuan Tian, Yu Wang, Chong Chen, Yuanhong Liu, Fengbei Belagiannis, Vasileios Carneiro, Gustavo

Translation Consistent Semi-supervised Segmentation for 3D Medical Images.

3D medical image segmentation methods have been successful, but their dependence on large amounts of voxel-level annotated data is a disadvantage that needs to be addressed given the high cost to obtain such annotation. Semi-supervised learning (SSL) solves this issue by training models with a large unlabelled and a small labelled dataset. The most successful SSL approaches are based on consistency learning that minimises the distance between model responses obtained from perturbed views of the unlabelled data. These perturbations usually keep the spatial input context between views fairly consistent, which may cause the model to learn segmentation patterns from the spatial input contexts instead of the foreground objects. In this paper, we introduce the Translation Consistent Co-training (TraCoCo) which is a consistency learning SSL method that perturbs the input data views by varying their spatial input context, allowing the model to learn segmentation patterns from foreground objects. Furthermore, we propose a new Confident Regional Cross entropy (CRC) loss, which improves training convergence and keeps the robustness to co-training pseudo-labelling mistakes. Our method yields state-of-the-art (SOTA) results for several 3D data benchmarks, such as the Left Atrium (LA), Pancreas-CT (Pancreas), and Brain Tumor Segmentation (BraTS19). Our method also attains best results on a 2D-slice benchmark, namely the Automated Cardiac Diagnosis Challenge (ACDC), further demonstrating its effectiveness. Our code, training logs and checkpoints are available at https://github.com/yyliu01/TraCoCo.
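TraCoCo's training loop is not reproduced here, but its central perturbation, taking two crops with different spatial contexts and penalizing disagreement on the overlapping region, can be sketched in numpy (2D crops of equal size for brevity; the paper operates on 3D volumes):

```python
import numpy as np

def overlap_consistency(pred_a, box_a, pred_b, box_b):
    """Mean squared disagreement of two crop predictions on their overlap.

    pred_*: (H, W) predictions for two equally-sized crops of one image;
    box_*: (y0, x0) top-left corner of each crop in image coordinates.
    """
    (ya, xa), (yb, xb) = box_a, box_b
    H, W = pred_a.shape
    y0, x0 = max(ya, yb), max(xa, xb)          # overlap in image coordinates
    y1, x1 = min(ya, yb) + H, min(xa, xb) + W
    if y1 <= y0 or x1 <= x0:
        return 0.0                             # crops do not overlap
    a = pred_a[y0 - ya:y1 - ya, x0 - xa:x1 - xa]
    b = pred_b[y0 - yb:y1 - yb, x0 - xb:x1 - xb]
    return float(((a - b) ** 2).mean())

# toy check: two shifted crops of the same "prediction map" agree exactly
img = np.random.rand(64, 96)
print(overlap_consistency(img[:, :64], (0, 0), img[:, 16:80], (0, 16)))  # 0.0
```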

AU Huang, Wendong Hu, Jinwu Xiao, Junhao Wei, Yang Bi, Xiuli Xiao, Bin

Prototype-Guided Graph Reasoning Network for Few-Shot Medical Image Segmentation.

Few-shot semantic segmentation (FSS) is of tremendous potential for data-scarce scenarios, particularly in medical segmentation tasks with merely a few labeled data. Most of the existing FSS methods typically distinguish query objects with the guidance of support prototypes. However, the variances in appearance and scale between support and query objects from the same anatomical class are often exceedingly considerable in practical clinical scenarios, thus resulting in undesirable query segmentation masks. To tackle the aforementioned challenge, we propose a novel prototype-guided graph reasoning network (PGRNet) to explicitly explore potential contextual relationships in structured query images. Specifically, a prototype-guided graph reasoning module is proposed to perform information interaction on the query graph under the guidance of support prototypes to fully exploit the structural properties of query images to overcome intra-class variances. Moreover, instead of fixed support prototypes, a dynamic prototype generation mechanism is devised to yield a collection of dynamic support prototypes by mining rich contextual information from support images to further boost the efficiency of information interaction between support and query branches. Equipped with the proposed two components, PGRNet can learn abundant contextual representations for query images and is therefore more resilient to object variations. We validate our method on three publicly available medical segmentation datasets, namely CHAOS-T2, MS-CMRSeg, and Synapse. Experiments indicate that the proposed PGRNet outperforms previous FSS methods by a considerable margin and establishes a new state-of-the-art performance.
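For orientation, a sketch of the standard prototype baseline that PGRNet builds on, i.e., masked average pooling of support features plus cosine-similarity matching (the graph-reasoning and dynamic-prototype modules themselves are not reproduced here):

import torch.nn.functional as F

def masked_prototype(support_feats, support_mask):
    # support_feats: (B, C, H, W); support_mask: (B, 1, H', W') in {0, 1}
    mask = F.interpolate(support_mask, size=support_feats.shape[-2:], mode="nearest")
    return (support_feats * mask).sum((2, 3)) / mask.sum((2, 3)).clamp(min=1e-6)

def prototype_logits(query_feats, proto, tau=20.0):
    # cosine similarity between every query location and the prototype
    q = F.normalize(query_feats, dim=1)
    p = F.normalize(proto, dim=1)[..., None, None]
    return tau * (q * p).sum(1, keepdim=True)    # (B, 1, H, W) foreground score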

AU Luo, Yilin Huang, Hsuan-Kai Sastry, Karteekeya Hu, Peng Tong, Xin Kuo, Joseph Aborahama, Yousuf Na, Shuai Villa, Umberto Anastasio, Mark A Wang, Lihong V

Full-wave Image Reconstruction in Transcranial Photoacoustic Computed Tomography using a Finite Element Method.

Transcranial photoacoustic computed tomography presents challenges in human brain imaging due to skull-induced acoustic aberration. Existing full-wave image reconstruction methods rely on a unified elastic wave equation for skull shear and longitudinal wave propagation, therefore demanding substantial computational resources. We propose an efficient discrete imaging model based on finite element discretization. The elastic wave equation for solids is solely applied to the hard-tissue skull region, while the soft-tissue or coupling-medium region that dominates the simulation domain is modeled with the simpler acoustic wave equation for liquids. The solid-liquid interfaces are explicitly modeled with elastic-acoustic coupling. Furthermore, finite element discretization allows coarser, irregular meshes to conform to object geometry. These factors significantly reduce the linear system size by 20 times to facilitate accurate whole-brain simulations with improved speed. We derive a matched forward-adjoint operator pair based on the model to enable integration with various optimization algorithms. We validate the reconstruction framework through numerical simulations and phantom experiments.

AU Tang, Kunming Jiang, Zhiguo Wu, Kun Shi, Jun Xie, Fengying Wang, Wei Wu, Haibo Zheng, Yushan

Self-Supervised Representation Distribution Learning for Reliable Data Augmentation in Histopathology WSI Classification.

Multiple instance learning (MIL) based whole slide image (WSI) classification is often carried out on the representations of patches extracted from WSI with a pre-trained patch encoder. The performance of classification relies on both patch-level representation learning and MIL classifier training. Most MIL methods utilize a frozen model pre-trained on ImageNet or a model trained with self-supervised learning on histopathology image dataset to extract patch image representations and then fix these representations in the training of the MIL classifiers for efficiency consideration. However, the invariance of representations cannot meet the diversity requirement for training a robust MIL classifier, which has significantly limited the performance of the WSI classification. In this paper, we propose a Self-Supervised Representation Distribution Learning framework (SSRDL) for patch-level representation learning with an online representation sampling strategy (ORS) for both patch feature extraction and WSI-level data augmentation. The proposed method was evaluated on three datasets under three MIL frameworks. The experimental results have demonstrated that the proposed method achieves the best performance in histopathology image representation learning and data augmentation and outperforms state-of-the-art methods under different WSI classification frameworks. The code is available at https://github.com/lazytkm/SSRDL.
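A minimal sketch of the online representation sampling idea, under the assumption (for illustration only) that the self-supervised encoder predicts a Gaussian (mean, log-variance) per patch, so augmented embeddings can be drawn by reparameterisation:

import torch

def sample_representations(mu, logvar, n=4):
    # mu, logvar: (N_patches, D) predicted per patch by the SSL encoder
    std = (0.5 * logvar).exp()
    eps = torch.randn(n, *mu.shape)
    return mu + eps * std      # (n, N_patches, D): n augmented bags per WSI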

AU Amaan Valiuddin, M M Viviers, Christiaan G A Van Sloun, Ruud J G De With, Peter H N Sommen, Fons van der

Investigating and Improving Latent Density Segmentation Models for Aleatoric Uncertainty Quantification in Medical Imaging.

Data uncertainties, such as sensor noise, occlusions or limitations in the acquisition method can introduce irreducible ambiguities in images, which result in varying, yet plausible, semantic hypotheses. In Machine Learning, this ambiguity is commonly referred to as aleatoric uncertainty. In image segmentation, latent density models can be utilized to address this problem. The most popular approach is the Probabilistic U-Net (PU-Net), which uses latent Normal densities to optimize the conditional data log-likelihood Evidence Lower Bound. In this work, we demonstrate that the PU-Net latent space is severely sparse and heavily under-utilized. To address this, we introduce mutual information maximization and entropy-regularized Sinkhorn Divergence in the latent space to promote homogeneity across all latent dimensions, effectively improving gradient-descent updates and latent space informativeness. Our results show that, when applied to public datasets of various clinical segmentation problems, our proposed methodology achieves up to 11% performance gains over preceding latent variable models for probabilistic segmentation, measured by the Hungarian-Matched Intersection over Union. The results indicate that encouraging a homogeneous latent space significantly improves latent density modeling for medical image segmentation.
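For concreteness, a minimal sketch of the entropy-regularised Sinkhorn cost between two batches of latent samples with uniform weights (the debiased divergence used in practice would additionally subtract the self-terms S(x,x) and S(y,y), omitted here for brevity):

import torch

def sinkhorn_cost(x, y, eps=0.1, iters=200):
    # x: (n, d), y: (m, d) latent samples with uniform weights
    C = torch.cdist(x, y) ** 2                    # pairwise squared distances
    K = torch.exp(-C / eps)
    a = torch.full((x.shape[0],), 1.0 / x.shape[0])
    b = torch.full((y.shape[0],), 1.0 / y.shape[0])
    u = torch.ones_like(a)
    for _ in range(iters):                        # Sinkhorn fixed-point updates
        v = b / (K.t() @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]               # entropic transport plan
    return (P * C).sum()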

AU Xu, Jing Huang, Kai Zhong, Lianzhen Gao, Yuan Sun, Kai Liu, Wei Zhou, Yanjie Guo, Wenchao Guo, Yuan Zou, Yuanqiang Duan, Yuping Lu, Le Wang, Yu Chen, Xiang Zhao, Shuang

RemixFormer++: A Multi-modal Transformer Model for Precision Skin Tumor Differential Diagnosis with Memory-efficient Attention.

Diagnosing malignant skin tumors accurately at an early stage can be challenging due to ambiguous and even confusing visual characteristics displayed by various categories of skin tumors. To improve diagnosis precision, all available clinical data from multiple sources, particularly clinical images, dermoscopy images, and medical history, could be considered. Aligning with clinical practice, we propose a novel Transformer model, named RemixFormer++, which consists of a clinical image branch, a dermoscopy image branch, and a metadata branch. Given the unique characteristics inherent in clinical and dermoscopy images, specialized attention strategies are adopted for each type. Clinical images are processed through a top-down architecture, capturing both localized lesion details and global contextual information. Conversely, dermoscopy images undergo a bottom-up processing with two-level hierarchical encoders, designed to pinpoint fine-grained structural and textural features. A dedicated metadata branch seamlessly integrates non-visual information by encoding relevant patient data. Fusing features from three branches substantially boosts disease classification accuracy. RemixFormer++ demonstrates exceptional performance on four single-modality datasets (PAD-UFES-20, ISIC 2017/2018/2019). Compared with the previous best method using a public multi-modal Derm7pt dataset, we achieved an absolute 5.3% increase in averaged F1 and 1.2% in accuracy for the classification of five skin tumors. Furthermore, using a large-scale in-house dataset of 10,351 patients with the twelve most common skin tumors, our method obtained an overall classification accuracy of 92.6%. These promising results, on par with or better than the performance of 191 dermatologists in a comprehensive reader study, clearly indicate the potential clinical usability of our method.

EI 1558-254X DA 2024-08-11 UT MEDLINE:39120989 PM 39120989 ER

AU Quan, Quan Yao, Qingsong Zhu, Heqin Kevin Zhou, S

IGU-Aug: Information-guided unsupervised augmentation and pixel-wise contrastive learning for medical image analysis.

Contrastive learning (CL) is a form of self-supervised learning and has been widely used for various tasks. Different from widely studied instance-level contrastive learning, pixel-wise contrastive learning mainly helps with pixel-wise dense prediction tasks. The counter-part to an instance in instance-level CL is a pixel, along with its neighboring context, in pixel-wise CL. Aiming to build better feature representation, there is a vast literature about designing instance augmentation strategies for instance-level CL; but there is little similar work on pixel augmentation for pixel-wise CL with a pixel granularity. In this paper, we attempt to bridge this gap. We first classify a pixel into three categories, namely low-, medium-, and high-informative, based on the information quantity the pixel contains. We then adaptively design separate augmentation strategies for each category in terms of augmentation intensity and sampling ratio. Extensive experiments validate that our information-guided pixel augmentation strategy succeeds in encoding more discriminative representations and surpassing other competitive approaches in unsupervised local feature matching. Furthermore, our pretrained model improves the performance of both one-shot and fully supervised models. To the best of our knowledge, we are the first to propose a pixel augmentation method with a pixel granularity for enhancing unsupervised pixel-wise contrastive learning. Code is available at https://github.com/Curli-quan/IGU-Aug.
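A plausible sketch of the information-quantity step, using local grey-level histogram entropy and quantile thresholds to split pixels into low-, medium-, and high-informative groups (the entropy measure and the thresholds are assumptions; the image is assumed normalised to [0, 1]):

import numpy as np

def pixel_information(img, patch=9, bins=32):
    # img: 2D array in [0, 1]; returns a 0/1/2 label per pixel
    r = patch // 2
    pad = np.pad(img, r, mode="reflect")
    H = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            win = pad[i:i + patch, j:j + patch]
            p, _ = np.histogram(win, bins=bins, range=(0.0, 1.0))
            p = p / p.sum()
            H[i, j] = -(p[p > 0] * np.log2(p[p > 0])).sum()   # local entropy
    lo, hi = np.quantile(H, [0.33, 0.66])                     # assumed split
    return np.digitize(H, [lo, hi])   # 0: low-, 1: medium-, 2: high-informative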

AU Daneshmand, Parisa Ghaderi Rabbani, Hossein

Tensor Ring Decomposition Guided Dictionary Learning for OCT Image Denoising

Optical coherence tomography (OCT) is a non-invasive and effective tool for the imaging of retinal tissue. However, the heavy speckle noise, resulting from multiple scattering of the light waves, obscures important morphological structures and impairs the clinical diagnosis of ocular diseases. In this paper, we propose a novel and powerful model known as tensor ring decomposition-guided dictionary learning (TRGDL) for OCT image denoising, which can simultaneously utilize two useful complementary priors, i.e., three-dimensional low-rank and sparsity priors, under a unified framework. Specifically, to effectively use the strong correlation between nearby OCT frames, we construct the OCT group tensors by extracting cubic patches from OCT images and clustering similar patches. Then, since each created OCT group tensor has a low-rank structure, to exploit spatial, non-local, and its temporal correlations in a balanced way, we enforce the TR decomposition model on each OCT group tensor. Next, to use the beneficial three-dimensional inter-group sparsity, we learn shared dictionaries in both spatial and temporal dimensions from all of the stacked OCT group tensors. Furthermore, we develop an effective algorithm to solve the resulting optimization problem by using two efficient optimization approaches, including proximal alternating minimization and the alternative direction method of multipliers. Finally, extensive experiments on OCT datasets from various imaging devices are conducted to prove the generality and usefulness of the proposed TRGDL model. Experimental simulation results show that the suggested TRGDL model outperforms state-of-the-art approaches for OCT image denoising both qualitatively and quantitatively.

AU Liu, Mengjun Zhang, Huifeng Liu, Mianxin Chen, Dongdong Zhuang, Zixu Wang, Xin Zhang, Lichi Peng, Daihui Wang, Qian

Randomizing Human Brain Function Representation for Brain Disease Diagnosis

Resting-state fMRI (rs-fMRI) is an effective tool for quantifying functional connectivity (FC), which plays a crucial role in exploring various brain diseases. Due to the high dimensionality of fMRI data, FC is typically computed based on the region of interest (ROI), whose parcellation relies on a pre-defined atlas. However, utilizing the brain atlas poses several challenges including 1) subjective selection bias in choosing from various brain atlases, 2) parcellation of each subject's brain with the same atlas yet disregarding individual specificity; 3) lack of interaction between brain region parcellation and downstream ROI-based FC analysis. To address these limitations, we propose a novel randomizing strategy for generating brain function representation to facilitate neural disease diagnosis. Specifically, we randomly sample brain patches, thus avoiding ROI parcellations of the brain atlas. Then, we introduce a new brain function representation framework for the sampled patches. Each patch has its function description by referring to anchor patches, as well as the position description. Furthermore, we design an adaptive-selection-assisted Transformer network to optimize and integrate the function representations of all sampled patches within each brain for neural disease diagnosis. To validate our framework, we conduct extensive evaluations on three datasets, and the experimental results establish the effectiveness and generality of our proposed method, offering a promising avenue for advancing neural disease diagnosis beyond the confines of traditional atlas-based methods. Our code is available at https://github.com/mjliu2020/RandomFR.
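A minimal sketch of the atlas-free sampling idea: draw random cubic patches from the 4D rs-fMRI volume, average the BOLD signal within each patch, and correlate the resulting time series (patch count and size are illustrative assumptions, not the paper's settings):

import numpy as np

def random_patch_fc(bold, n_patches=128, patch=5, seed=0):
    # bold: (X, Y, Z, T) rs-fMRI volume
    rng = np.random.default_rng(seed)
    X, Y, Z, T = bold.shape
    series = np.empty((n_patches, T))
    for k in range(n_patches):
        x, y, z = (rng.integers(0, d - patch) for d in (X, Y, Z))
        series[k] = bold[x:x + patch, y:y + patch, z:z + patch].mean(axis=(0, 1, 2))
    return np.corrcoef(series)     # (n_patches, n_patches) connectivity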

AU Pan, Jiazhen Huang, Wenqi Rueckert, Daniel Kustner, Thomas Hammernik, Kerstin

Motion-Compensated MR CINE Reconstruction With Reconstruction-Driven Motion Estimation

In cardiac CINE, motion-compensated MR reconstruction (MCMR) is an effective approach to address highly undersampled acquisitions by incorporating motion information between frames. In this work, we propose a novel perspective for addressing the MCMR problem and a more integrated and efficient solution to the MCMR field. Contrary to state-of-the-art (SOTA) MCMR methods which break the original problem into two sub-optimization problems, i.e. motion estimation and reconstruction, we formulate this problem as a single entity with one single optimization. Our approach is unique in that the motion estimation is directly driven by the ultimate goal, reconstruction, but not by the canonical motion-warping loss (similarity measurement between motion-warped images and target images). We align the objectives of motion estimation and reconstruction, eliminating the drawbacks of artifacts-affected motion estimation and therefore error-propagated reconstruction. Further, we can deliver high-quality reconstruction and realistic motion without applying any regularization/smoothness loss terms, circumventing the non-trivial weighting factor tuning. We evaluate our method on two datasets: 1) an in-house acquired 2D CINE dataset for the retrospective study and 2) the public OCMR cardiac dataset for the prospective study. The conducted experiments indicate that the proposed MCMR framework can deliver artifact-free motion estimation and high-quality MR images even for imaging accelerations up to 20x, outperforming SOTA non-MCMR and MCMR methods in both qualitative and quantitative evaluation across all experiments. The code is available at https://github.com/JZPeterPan/MCMR-Recon-Driven-Motion.
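A toy sketch of the single-objective formulation: the motion fields receive gradients from the same k-space data-consistency loss that drives reconstruction, with no separate motion-warping similarity term (2D, with an undersampled-Fourier forward operator assumed for illustration; opt would be, e.g., torch.optim.Adam([x_ref, flows]) with both tensors requiring grad):

import torch
import torch.nn.functional as F

def joint_mcmr_step(x_ref, flows, ys, masks, opt):
    # x_ref: (1, 1, H, W) learnable image; flows: (T, 2, H, W) learnable motion;
    # ys: (T, 1, H, W) complex k-space frames; masks: (T, 1, H, W) sampling masks.
    T, _, H, W = flows.shape
    theta = torch.eye(2, 3).unsqueeze(0).repeat(T, 1, 1)
    grid = F.affine_grid(theta, (T, 1, H, W), align_corners=False)
    warped = F.grid_sample(x_ref.expand(T, -1, -1, -1),
                           grid + flows.permute(0, 2, 3, 1), align_corners=False)
    loss = (masks * (torch.fft.fft2(warped) - ys)).abs().pow(2).mean()
    opt.zero_grad(); loss.backward(); opt.step()   # one step on image and motion
    return loss.item()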

AU Han, Kangfu Li, Gang Fang, Zhiwen Yang, Feng

Multi-Template Meta-Information Regularized Network for Alzheimer's Disease Diagnosis Using Structural MRI

Structural magnetic resonance imaging (sMRI) has been widely applied in computer-aided Alzheimer's disease (AD) diagnosis, owing to its capabilities in providing detailed brain morphometric patterns and anatomical features in vivo. Although previous works have validated the effectiveness of incorporating metadata (e.g., age, gender, and educational years) for sMRI-based AD diagnosis, existing methods solely paid attention to metadata-associated correlation to AD (e.g., gender bias in AD prevalence) or confounding effects (e.g., the issue of normal aging and metadata-related heterogeneity). Hence, it is difficult to fully excavate the influence of metadata on AD diagnosis. To address these issues, we constructed a novel Multi-template Meta-information Regularized Network (MMRN) for AD diagnosis. Specifically, considering diagnostic variation resulting from different spatial transformations onto different brain templates, we first regarded different transformations as data augmentation for self-supervised learning after template selection. Since the confounding effects may arise from excessive attention to meta-information owing to its correlation with AD, we then designed the modules of weakly supervised meta-information learning and mutual information minimization to learn and disentangle meta-information from learned class-related representations, which accounts for meta-information regularization for disease diagnosis. We have evaluated our proposed MMRN on two public multi-center cohorts, including the Alzheimer's Disease Neuroimaging Initiative (ADNI) with 1,950 subjects and the National Alzheimer's Coordinating Center (NACC) with 1,163 subjects. The experimental results have shown that our proposed method outperformed the state-of-the-art approaches in both tasks of AD diagnosis, mild cognitive impairment (MCI) conversion prediction, and normal control (NC) vs. MCI vs. AD classification.

AU Liang, Quanmin Ma, Junji Chen, Xitian Lin, Qixiang Shu, Ni Dai, Zhengjia Lin, Ying

A Hybrid Routing Pattern in Human Brain Structural Network Revealed By Evolutionary Computation

The human brain functional connectivity network (FCN) is constrained and shaped by the communication processes in the structural connectivity network (SCN). The underlying communication mechanism thus becomes a critical issue for understanding the formation and organization of the FCN. A number of communication models supported by different routing strategies have been proposed, with shortest path (SP), random diffusion (DIF), and spatial navigation (NAV) as the most typical, respectively requiring network global knowledge, local knowledge, and both for path seeking. Yet these models all assumed every brain region to use one routing strategy uniformly, ignoring convergent evidence that supports the regional heterogeneity in both terms of biological substrates and functional roles. In this regard, the current study developed a hybrid communication model that allowed each brain region to choose a routing strategy from SP, DIF, and NAV independently. A genetic algorithm was designed to uncover the underlying region-wise hybrid routing strategy (namely HYB). The HYB was found to outperform the three typical routing strategies in predicting FCN and facilitating robust communication. Analyses on HYB further revealed that brain regions in lower-order functional modules inclined to route signals using global knowledge, while those in higher-order functional modules preferred DIF that requires only local knowledge. Compared to regions that used global knowledge for routing, regions using DIF had denser structural connections, participated in more functional modules, but played a less dominant role within modules. Together, our findings further evidenced that hybrid routing underpins efficient SCN communication and locally heterogeneous structure-function coupling.
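A minimal genetic-algorithm sketch over region-wise routing assignments, where each chromosome assigns one of {SP, DIF, NAV} to every region and a user-supplied fitness scores it, e.g., by how well the implied communication model predicts the FCN (the paper's exact objective and operators may differ):

import numpy as np

def evolve_routing(fitness, n_regions, n_pop=50, n_gen=200, p_mut=0.02, seed=0):
    # chromosome: one strategy per region (0 = SP, 1 = DIF, 2 = NAV)
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, 3, size=(n_pop, n_regions))
    for _ in range(n_gen):
        scores = np.array([fitness(c) for c in pop])
        parents = pop[np.argsort(scores)[-(n_pop // 2):]]   # keep the fitter half
        pairs = rng.integers(0, len(parents), size=(n_pop, 2))
        cut = rng.integers(1, n_regions, size=n_pop)        # one-point crossover
        pop = np.where(np.arange(n_regions) < cut[:, None],
                       parents[pairs[:, 0]], parents[pairs[:, 1]])
        mut = rng.random(pop.shape) < p_mut                 # random resetting
        pop[mut] = rng.integers(0, 3, size=mut.sum())
    return max(pop, key=fitness)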

AU Yao, Qingsong He, Zecheng Li, Yuexiang Lin, Yi Ma, Kai Zheng, Yefeng Zhou, S. Kevin

Adversarial Medical Image With Hierarchical Feature Hiding

Deep learning based methods for medical images can be easily compromised by adversarial examples (AEs), posing a great security flaw in clinical decision-making. It has been discovered that conventional adversarial attacks, like PGD, which optimize the classification logits, are easy to distinguish in the feature space, resulting in accurate reactive defenses. To better understand this phenomenon and reassess the reliability of the reactive defenses for medical AEs, we thoroughly investigate the characteristic of conventional medical AEs. Specifically, we first theoretically prove that conventional adversarial attacks change the outputs by continuously optimizing vulnerable features in a fixed direction, thereby leading to outlier representations in the feature space. Then, a stress test is conducted to reveal the vulnerability of medical images, by comparing with natural images. Interestingly, this vulnerability is a double-edged sword, which can be exploited to hide AEs. We then propose a simple-yet-effective hierarchical feature constraint (HFC), a novel add-on to conventional white-box attacks, which assists to hide the adversarial feature in the target feature distribution. The proposed method is evaluated on three medical datasets, both 2D and 3D, with different modalities. The experimental results demonstrate the superiority of HFC, i.e., it bypasses an array of state-of-the-art adversarial medical AE detectors more efficiently than competing adaptive attacks, which reveals the deficiencies of medical reactive defense and allows more robust defenses to be developed in the future.
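A hedged sketch of the feature-hiding idea grafted onto a PGD-style attack: besides the usual misclassification term, the intermediate feature of the adversarial example is pulled toward a mean clean feature so it does not stand out in feature space (feats_hook, target_mu and the perturbation budget are assumptions for illustration, not the paper's exact constraint):

import torch
import torch.nn.functional as F

def hfc_attack(feats_hook, x, y, target_mu, alpha=1.0, eps=8 / 255,
               step=2 / 255, iters=10):
    # feats_hook(x) -> (logits, intermediate feature); target_mu: mean clean feature
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(iters):
        logits, feat = feats_hook(x + delta)
        loss = -F.cross_entropy(logits, y) + alpha * (feat - target_mu).pow(2).mean()
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta - step * grad.sign()).clamp(-eps, eps)   # descend: mis-classify
        delta = delta.detach().requires_grad_(True)             # ...and hide the feature
    return (x + delta).clamp(0, 1).detach()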

AU Zhang, Fan Cho, Kang Ik Kevin Seitz-Holland, Johanna Ning, Lipeng Legarreta, Jon Haitz Rathi, Yogesh Westin, Carl-Fredrik O'Donnell, Lauren J. Pasternak, Ofer

DDParcel: Deep Learning Anatomical Brain Parcellation From Diffusion MRI

Parcellation of anatomically segregated cortical and subcortical brain regions is required in diffusion MRI (dMRI) analysis for region-specific quantification and better anatomical specificity of tractography. Most current dMRI parcellation approaches compute the parcellation from anatomical MRI (T1- or T2-weighted) data, using tools such as FreeSurfer or CAT12, and then register it to the diffusion space. However, the registration is challenging due to image distortions and low resolution of dMRI data, often resulting in mislabeling in the derived brain parcellation. Furthermore, these approaches are not applicable when anatomical MRI data is unavailable. As an alternative we developed the Deep Diffusion Parcellation (DDParcel), a deep learning method for fast and accurate parcellation of brain anatomical regions directly from dMRI data. The input to DDParcel are dMRI parameter maps and the output are labels for 101 anatomical regions corresponding to the FreeSurfer Desikan-Killiany (DK) parcellation. A multi-level fusion network leverages complementary information in the different input maps, at three network levels: input, intermediate layer, and output. DDParcel learns the registration of diffusion features to anatomical MRI from the high-quality Human Connectome Project data. Then, to predict brain parcellation for a new subject, the DDParcel network no longer requires anatomical MRI data but only the dMRI data. Comparing DDParcel's parcellation with T1w-based parcellation shows higher test-retest reproducibility and a higher regional homogeneity, while requiring much less computational time. Generalizability is demonstrated on a range of populations and dMRI acquisition protocols. Utility of DDParcel's parcellation is demonstrated on tractography analysis for fiber tract identification.

AU Hashemi, Ali Cai, Chang Gao, Yijing Ghosh, Sanjay Mueller, Klaus-Robert Nagarajan, Srikantan S. Haufe, Stefan

Joint Learning of Full-Structure Noise in Hierarchical Bayesian Regression Models

We consider the reconstruction of brain activity from electroencephalography (EEG). This inverse problem can be formulated as a linear regression with independent Gaussian scale mixture priors for both the source and noise components. Crucial factors influencing the accuracy of the source estimation are not only the noise level but also its correlation structure, but existing approaches have not addressed the estimation of noise covariance matrices with full structure. To address this shortcoming, we develop hierarchical Bayesian (type-II maximum likelihood) models for observations with latent variables for source and noise, which are estimated jointly from data. As an extension to classical sparse Bayesian learning (SBL), where across-sensor observations are assumed to be independent and identically distributed, we consider Gaussian noise with full covariance structure. Using the majorization-maximization framework and Riemannian geometry, we derive an efficient algorithm for updating the noise covariance along the manifold of positive definite matrices. We demonstrate that our algorithm has guaranteed and fast convergence and validate it in simulations and with real MEG data. Our results demonstrate that the novel framework significantly improves upon state-of-the-art techniques in the real-world scenario where the noise is indeed non-diagonal and full-structured. Our method has applications in many domains beyond biomagnetic inverse problems.

AU Lian, Jie Liu, Jingyu Zhang, Shu Gao, Kai Liu, Xiaoqing Zhang, Dingwen Yu, Yizhou

A Structure-Aware Relation Network for Thoracic Diseases Detection and Segmentation (vol 40, pg 2042, 2021)

C1 Deepwise Artificial Intelligence Lab, Beijing 100080, Peoples R China C1 Peking Univ, Sch Elect Engn & Comp Sci, Beijing 100871, Peoples R China C1 Northwestern Polytech Univ, Sch Automat, Brain & Artificial Intelligence Lab, Xian 710072, Peoples R China C1 Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China C3 Deepwise Artificial Intelligence Lab SN 0278-0062 EI 1558-254X DA 2024-05-25 UT WOS:001203303400013 ER

AU Wang, Jiacheng Jin, Yueming Stoyanov, Danail Wang, Liansheng

FedDP: Dual Personalization in Federated Medical Image Segmentation

Personalized federated learning (PFL) addresses the data heterogeneity challenge faced by general federated learning (GFL). Rather than learning a single global model, with PFL a collection of models is adapted to the unique feature distribution of each site. However, current PFL methods rarely consider self-attention networks which can handle data heterogeneity by long-range dependency modeling and they do not utilize prediction inconsistencies in local models as an indicator of site uniqueness. In this paper, we propose FedDP, a novel federated learning scheme with dual personalization, which improves model personalization from both feature and prediction aspects to boost image segmentation results. We leverage long-range dependencies by designing a local query (LQ) that decouples the query embedding layer out of each local model, whose parameters are trained privately to better adapt to the respective feature distribution of the site. We then propose inconsistency-guided calibration (IGC), which exploits the inter-site prediction inconsistencies to accommodate the model learning concentration. By encouraging a model to penalize pixels with larger inconsistencies, we better tailor prediction-level patterns to each local site. Experimentally, we compare FedDP with the state-of-the-art PFL methods on two popular medical image segmentation tasks with different modalities, where our results consistently outperform others on both tasks. Our code and models are available at https://github.com/jcwang123/PFL-Seg-Trans.
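A minimal sketch of the aggregation side of such dual personalisation: parameters whose names mark them as client-private (here, the local query embedding) are excluded from federated averaging (the key-naming convention is an assumption, not the released code):

import copy
import torch

def aggregate(global_model, client_states, personal_key="lq"):
    # client_states: list of state_dicts; keys containing personal_key stay local
    new_state = copy.deepcopy(client_states[0])
    for name in new_state:
        if personal_key in name:
            continue                      # local query stays client-specific
        new_state[name] = torch.stack(
            [s[name].float() for s in client_states]).mean(0)
    global_model.load_state_dict(new_state, strict=False)
    return global_model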

AU Kadry, Karim Olender, Max L Schuh, Andreas Karmakar, Abhishek Petersen, Kersten Schaap, Michiel Marlevi, David UpdePac, Adam Mizukami, Takuya Taylor, Charles Edelman, Elazer R Nezami, Farhad R

Morphology-based non-rigid registration of coronary computed tomography and intravascular images through virtual catheter path optimization.

Coronary computed tomography angiography (CCTA) provides 3D information on obstructive coronary artery disease, but cannot fully visualize high-resolution features within the vessel wall. Intravascular imaging, in contrast, can spatially resolve atherosclerotic plaque in cross-sectional slices, but is limited in capturing 3D relationships between slices. Co-registering CCTA and intravascular images enables a variety of clinical research applications but is time consuming and user-dependent. This is due to intravascular images suffering from non-rigid distortions arising from irregularities in the imaging catheter path. To address these issues, we present a morphology-based framework for the rigid and non-rigid matching of intravascular images to CCTA images. To do this, we find the optimal virtual catheter path that samples the coronary artery in CCTA image space to recapitulate the coronary artery morphology observed in the intravascular image. We validate our framework on a multi-center cohort of 40 patients using bifurcation landmarks as ground truth for longitudinal and rotational registration. Our registration approach significantly outperforms other approaches for bifurcation alignment. By providing a differentiable framework for multi-modal vascular co-registration, our framework reduces the manual effort required to conduct large-scale multi-modal clinical studies and enables the development of machine learning-based co-registration approaches.

EI 1558-254X DA 2024-10-09 UT MEDLINE:39374277 PM 39374277 ER

AU Leconte, Alexis Poree, Jonathan Rauby, Brice Wu, Alice Ghigo, Nin Xing, Paul Lee, Stephen Bourquin, Chloe Ramos-Palacios, Gerardo Sadikot, Abbas F Provost, Jean

A Tracking prior to Localization workflow for Ultrasound Localization Microscopy.

Ultrasound Localization Microscopy (ULM) has proven effective in resolving microvascular structures and local mean velocities at sub-diffraction-limited scales, offering high-resolution imaging capabilities. Dynamic ULM (DULM) enables the creation of angiography or velocity movies throughout cardiac cycles. Currently, these techniques rely on a Localization-and-Tracking (LAT) workflow consisting of detecting microbubbles (MB) in the frames before pairing them to generate tracks. While conventional LAT methods perform well at low concentrations, they suffer from longer acquisition times and degraded localization and tracking accuracy at higher concentrations, leading to biased angiogram reconstruction and velocity estimation. In this study, we propose a novel approach to address these challenges by reversing the current workflow. The proposed method, Tracking-and-Localization (TAL), relies on first tracking the MB and then performing localization. Through comprehensive benchmarking using both in silico and in vivo experiments and employing various metrics to quantify ULM angiography and velocity maps, we demonstrate that the TAL method consistently outperforms the reference LAT workflow. Moreover, when applied to DULM, TAL successfully extracts velocity variations along the cardiac cycle with improved repeatability. The findings of this work highlight the effectiveness of the TAL approach in overcoming the limitations of conventional LAT methods, providing enhanced ULM angiography and velocity imaging.

AU Song, Xuegang Shu, Kaixiang Yang, Peng Zhao, Cheng Zhou, Feng Frangi, Alejandro F Xiao, Xiaohua Dong, Lei Wang, Tianfu Wang, Shuqiang Lei, Baiying

Knowledge-aware Multisite Adaptive Graph Transformer for Brain Disorder Diagnosis.

Brain disorder diagnosis via resting-state functional magnetic resonance imaging (rs-fMRI) is usually limited due to the complex imaging features and sample size. For brain disorder diagnosis, the graph convolutional network (GCN) has achieved remarkable success by capturing interactions between individuals and the population. However, there are mainly three limitations: 1) The previous GCN approaches consider the non-imaging information in edge construction but ignore the sensitivity differences of features to non-imaging information. 2) The previous GCN approaches solely focus on establishing interactions between subjects (i.e., individuals and the population), disregarding the essential relationship between features. 3) Multisite data increase the sample size to help classifier training, but the inter-site heterogeneity limits the performance to some extent. This paper proposes a knowledge-aware multisite adaptive graph Transformer to address the above problems. First, we evaluate the sensitivity of features to each piece of non-imaging information, and then construct feature-sensitive and feature-insensitive subgraphs. Second, after fusing the above subgraphs, we integrate a Transformer module to capture the intrinsic relationship between features. Third, we design a domain adaptive GCN using multiple loss function terms to relieve data heterogeneity and to produce the final classification results. Last, the proposed framework is validated on two brain disorder diagnostic tasks. Experimental results show that the proposed framework can achieve state-of-the-art performance.

AU Chakravarty, Arunava Emre, Taha Leingang, Oliver Riedl, Sophie Mai, Julia Scholl, Hendrik P. N. Sivaprasad, Sobha Rueckert, Daniel Lotery, Andrew Schmidt-Erfurth, Ursula Bogunovic, Hrvoje CA PINNACLE Consortium

Morph-SSL: Self-Supervision With Longitudinal Morphing for Forecasting AMD Progression From OCT Volumes

The lack of reliable biomarkers makes predicting the conversion from intermediate to neovascular age-related macular degeneration (iAMD, nAMD) a challenging task. We develop a Deep Learning (DL) model to predict the future risk of conversion of an eye from iAMD to nAMD from its current OCT scan. Although eye clinics generate vast amounts of longitudinal OCT scans to monitor AMD progression, only a small subset can be manually labeled for supervised DL. To address this issue, we propose Morph-SSL, a novel Self-supervised Learning (SSL) method for longitudinal data. It uses pairs of unlabelled OCT scans from different visits and involves morphing the scan from the previous visit to the next. The Decoder predicts the transformation for morphing and ensures a smooth feature manifold that can generate intermediate scans between visits through linear interpolation. Next, the Morph-SSL trained features are input to a Classifier which is trained in a supervised manner to model the cumulative probability distribution of the time to conversion with a sigmoidal function. Morph-SSL was trained on unlabelled scans of 399 eyes (3570 visits). The Classifier was evaluated with a five-fold cross-validation on 2418 scans from 343 eyes with clinical labels of the conversion date. The Morph-SSL features achieved an AUC of 0.779 in predicting the conversion to nAMD within the next 6 months, outperforming the same network when trained end-to-end from scratch or pre-trained with popular SSL methods. Automated prediction of the future risk of nAMD onset can enable timely treatment and individualized AMD management.
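The time-to-conversion model reduces to a one-liner; a sketch, assuming the classifier predicts a per-eye location mu and scale from the Morph-SSL features (parameter names are assumptions):

import torch

def conversion_cdf(t, mu, scale):
    # P(conversion time <= t) modelled as sigmoid((t - mu) / scale)
    return torch.sigmoid((t - mu) / scale)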

AU Noelke, Jan-Hinrich Adler, Tim J. Schellenberg, Melanie Dreher, Kris K. Holzwarth, Niklas Bender, Christoph J. Tizabi, Minu D. Seitel, Alexander Maier-Hein, Lena

Photoacoustic Quantification of Tissue Oxygenation Using Conditional Invertible Neural Networks

Intelligent systems in interventional healthcare depend on the reliable perception of the environment. In this context, photoacoustic tomography (PAT) has emerged as a non-invasive, functional imaging modality with great clinical potential. Current research focuses on converting the high-dimensional, not human-interpretable spectral data into the underlying functional information, specifically the blood oxygenation. One of the largely unexplored issues stalling clinical advances is the fact that the quantification problem is ambiguous, i.e. that radically different tissue parameter configurations could lead to almost identical photoacoustic spectra. In the present work, we tackle this problem with conditional Invertible Neural Networks (cINNs). Going beyond traditional point estimates, our network is used to compute an approximation of the conditional posterior density of tissue parameters given the photoacoustic spectrum. To this end, an automatic mode detection algorithm extracts the plausible solution from the sample-based posterior. According to a comprehensive validation study based on both synthetic and real images, our approach is well-suited for exploring ambiguity in quantitative PAT.

AU Ye, Shuquan Xu, Yan Chen, Dongdong Han, Songfang Liao, Jing

Learning a Single Network for Robust Medical Image Segmentation With Noisy Labels

Robust segmenting with noisy labels is an important problem in medical imaging due to the difficulty of acquiring high-quality annotations. Despite the enormous success of recent developments, these developments still require multiple networks to construct their frameworks and focus on limited application scenarios, which leads to inflexibility in practical applications. They also do not explicitly consider the coarse boundary label problem, which results in sub-optimal results. To overcome these challenges, we propose a novel Simultaneous Edge Alignment and Memory-Assisted Learning (SEAMAL) framework for noisy-label robust segmentation. It achieves single-network robust learning, which is applicable for both 2D and 3D segmentation, in both Set-HQ-knowable and Set-HQ-agnostic scenarios. Specifically, to achieve single-model noise robustness, we design a Memory-assisted Selection and Correction module (MSC) that utilizes predictive history consistency from the Prediction Memory Bank to distinguish between reliable and non-reliable labels pixel-wise, and that updates the reliable ones at the superpixel level. To overcome the coarse boundary label problem, which is common in practice, and to better utilize shape-relevant information at the boundary, we propose an Edge Detection Branch (EDB) that explicitly learns the boundary via an edge detection layer with only slight additional computational cost, and we improve the sharpness and precision of the boundary with a thinning loss. Extensive experiments verify that SEAMAL outperforms previous works significantly.
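A sketch of the memory-assisted selection idea: a bank of the last K soft predictions per pixel, with low predictive-history variance taken as evidence that the pixel's label is reliable (bank size and threshold are assumptions, and the superpixel-level correction step is omitted):

import torch

class PredictionMemory:
    # bank of the last k soft predictions per pixel
    def __init__(self, n_pixels, n_classes, k=5):
        self.bank = torch.zeros(k, n_pixels, n_classes)
        self.ptr = 0

    def update(self, probs):                      # probs: (n_pixels, n_classes)
        self.bank[self.ptr % self.bank.shape[0]] = probs.detach()
        self.ptr += 1

    def reliable_mask(self, thresh=0.05):
        var = self.bank.var(dim=0).mean(dim=-1)   # predictive-history variance
        return var < thresh                       # True -> label treated as reliable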

AU Dai, Tianjie Zhang, Ruipeng Hong, Feng Yao, Jiangchao Zhang, Ya Wang, Yanfeng

UniChest: Conquer-and-Divide Pre-Training for Multi-Source Chest X-Ray Classification

Vision-Language Pre-training (VLP) that utilizes the multi-modal information to promote the training efficiency and effectiveness, has achieved great success in vision recognition of natural domains and shown promise in medical imaging diagnosis for the Chest X-Rays (CXRs). However, current works mainly pay attention to the exploration on single dataset of CXRs, which locks the potential of this powerful paradigm on larger hybrid of multi-source CXRs datasets. We identify that although blending samples from the diverse sources offers the advantages to improve the model generalization, it is still challenging to maintain the consistent superiority for the task of each source due to the existing heterogeneity among sources. To handle this dilemma, we design a Conquer-and-Divide pre-training framework, termed as UniChest, aiming to make full use of the collaboration benefit of multiple sources of CXRs while reducing the negative influence of the source heterogeneity. Specially, the "Conquer" stage in UniChest encourages the model to sufficiently capture multi-source common patterns, and the "Divide" stage helps squeeze personalized patterns into different small experts (query networks). We conduct thorough experiments on many benchmarks, e.g., ChestX-ray14, CheXpert, Vindr-CXR, Shenzhen, Open-I and SIIM-ACR Pneumothorax, verifying the effectiveness of UniChest over a range of baselines, and release our codes and pre-training models at https://github.com/Elfenreigen/UniChest.

AU Chen, Ming Bian, Yijun Chen, Nanguang Qiu, Anqi

Orthogonal Mixed-Effects Modeling for High-Dimensional Longitudinal Data: An Unsupervised Learning Approach.

The linear mixed-effects model is commonly utilized to interpret longitudinal data, characterizing both the global longitudinal trajectory across all observations and longitudinal trajectories within individuals. However, characterizing these trajectories in high-dimensional longitudinal data presents a challenge. To address this, our study proposes a novel approach, Unsupervised Orthogonal Mixed-Effects Trajectory Modeling (UOMETM), that leverages unsupervised learning to generate latent representations of both global and individual trajectories. We design an autoencoder with a latent space where an orthogonal constraint is imposed to separate the space of global trajectories from individual trajectories. We also devise a cross-reconstruction loss to ensure consistency of global trajectories and enhance the orthogonality between representation spaces. To evaluate UOMETM, we conducted simulation experiments on images to verify that every component functions as intended. Furthermore, we evaluated its performance and robustness using longitudinal brain cortical thickness from two Alzheimer's disease (AD) datasets. Comparative analyses with state-of-the-art methods revealed UOMETM's superiority in identifying global and individual longitudinal patterns, achieving a lower reconstruction error, superior orthogonality, and higher accuracy in AD classification and conversion forecasting. Remarkably, we found that the space of global trajectories did not significantly contribute to AD classification compared to the space of individual trajectories, emphasizing their clear separation. Moreover, our model exhibited satisfactory generalization and robustness across different datasets. The study shows the outstanding performance and potential clinical use of UOMETM in the context of longitudinal data analysis.
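A minimal version of the orthogonality constraint between the global- and individual-trajectory latent spaces, penalising their cross-covariance after centring (a sketch of the idea, not the authors' exact loss):

import torch

def orthogonality_loss(z_global, z_indiv):
    # z_global: (B, d_g); z_indiv: (B, d_i)
    zg = z_global - z_global.mean(0, keepdim=True)
    zi = z_indiv - z_indiv.mean(0, keepdim=True)
    cross = zg.t() @ zi / zg.shape[0]     # cross-covariance between subspaces
    return cross.pow(2).sum()             # squared Frobenius norm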

AU Liu, Jiaxuan Li, Haitao Zeng, Bolun Wang, Huixiang Kikinis, Ron Joskowicz, Leo Chen, Xiaojun

An end-to-end geometry-based pipeline for automatic preoperative surgical planning of pelvic fracture reduction and fixation.

Computer-assisted preoperative planning of pelvic fracture reduction surgery has the potential to increase the accuracy of the surgery and to reduce complications. However, the diversity of the pelvic fractures and the disturbance of small fracture fragments present a great challenge to perform reliable automatic preoperative planning. In this paper, we present a comprehensive and automatic preoperative planning pipeline for pelvic fracture surgery. It includes pelvic fracture labeling, reduction planning of the fracture, and customized screw implantation. First, automatic bone fracture labeling is performed based on the separation of the fracture sections. Then, fracture reduction planning is performed based on automatic extraction and pairing of the fracture surfaces. Finally, screw implantation is planned using the adjoint fracture surfaces. The proposed pipeline was tested on different types of pelvic fracture in 14 clinical cases. Our method achieved a translational and rotational accuracy of 2.56 mm and 3.31° in reduction planning. For fixation planning, a clinical acceptance rate of 86.7% was achieved. The results demonstrate the feasibility of the clinical application of our method. Our method has shown accuracy and reliability for complex multi-body bone fractures, which may provide effective clinical preoperative guidance and may improve the accuracy of pelvic fracture reduction surgery.

AU Karageorgos, Grigorios M Zhang, Jiayong Peters, Nils Xia, Wenjun Niu, Chuang Paganetti, Harald Wang, Ge De Man, Bruno

A denoising diffusion probabilistic model for metal artifact reduction in CT.

The presence of metal objects leads to corrupted CT projection measurements, resulting in metal artifacts in the reconstructed CT images. AI promises to offer improved solutions to estimate missing sinogram data for metal artifact reduction (MAR), as previously shown with convolutional neural networks (CNNs) and generative adversarial networks (GANs). Recently, denoising diffusion probabilistic models (DDPM) have shown great promise in image generation tasks, potentially outperforming GANs. In this study, a DDPM-based approach is proposed for inpainting of missing sinogram data for improved MAR. The proposed model is unconditionally trained, free from information on metal objects, which can potentially enhance its generalization capabilities across different types of metal implants compared to conditionally trained approaches. The performance of the proposed technique was evaluated and compared to the state-of-the-art normalized MAR (NMAR) approach as well as to CNN-based and GAN-based MAR approaches. The DDPM-based approach provided significantly higher SSIM and PSNR, as compared to NMAR (SSIM: p < 10^-26; PSNR: p < 10^-21), the CNN (SSIM: p < 10^-25; PSNR: p < 10^-9) and the GAN (SSIM: p < 10^-6; PSNR: p < 0.05) methods. The DDPM-MAR technique was further evaluated based on clinically relevant image quality metrics on clinical CT images with virtually introduced metal objects and metal artifacts, demonstrating superior quality relative to the other three models. In general, the AI-based techniques showed improved MAR performance compared to the non-AI-based NMAR approach. The proposed methodology shows promise in enhancing the effectiveness of MAR, and therefore improving the diagnostic accuracy of CT.
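A minimal sketch of how an unconditionally trained DDPM can inpaint a masked sinogram by re-imposing the measured data at every reverse step (a RePaint-style scheme); `reverse_step` is an assumed interface for one ancestral sampling step of a trained model, and the paper's exact sampling procedure may differ.

```python
import torch

@torch.no_grad()
def inpaint_sinogram(x_known, mask, reverse_step, alphas_cumprod):
    # x_known: sinogram with the metal trace zeroed; mask: 1 = valid data.
    x = torch.randn_like(x_known)
    for t in reversed(range(len(alphas_cumprod))):
        a_bar = alphas_cumprod[t]
        # Noise the measured region up to the current diffusion level...
        known_t = a_bar.sqrt() * x_known + (1 - a_bar).sqrt() * torch.randn_like(x_known)
        # ...keep it where data is valid, let the model fill the metal trace.
        x = mask * known_t + (1 - mask) * x
        x = reverse_step(x, t)          # assumed: one ancestral DDPM step
    return mask * x_known + (1 - mask) * x
```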

AU Zhou, Jie Jie, Biao Wang, Zhengdong Zhang, Zhixiang Du, Tongchun Bian, Weixin Yang, Yang Jia, Jun

LCGNet: Local Sequential Feature Coupling Global Representation Learning for Functional Connectivity Network Analysis with fMRI.

Analysis of functional connectivity networks (FCNs) derived from resting-state functional magnetic resonance imaging (rs-fMRI) has greatly advanced our understanding of brain diseases, including Alzheimer's disease (AD) and attention deficit hyperactivity disorder (ADHD). Advanced machine learning techniques, such as convolutional neural networks (CNNs), have been used to learn high-level feature representations of FCNs for automated brain disease classification. Even though convolution operations in CNNs are good at extracting local properties of FCNs, they generally cannot well capture global temporal representations of FCNs. Recently, the transformer technique has demonstrated remarkable performance in various tasks, which is attributed to its effective self-attention mechanism in capturing the global temporal feature representations. However, it cannot effectively model the local network characteristics of FCNs. To this end, in this paper, we propose a novel network structure for Local sequential feature Coupling Global representation learning (LCGNet) to take advantage of convolutional operations and self-attention mechanisms for enhanced FCN representation learning. Specifically, we first build a dynamic FCN for each subject using an overlapped sliding window approach. We then construct three sequential components (i.e., edge-to-vertex layer, vertex-to-network layer, and network-to-temporality layer) with a dual backbone branch of CNN and transformer to extract and couple from local to global topological information of brain networks. Experimental results on two real datasets (i.e., ADNI and ADHD-200) with rs-fMRI data show the superiority of our LCGNet.

AU Chabouh, Georges Denis, Louise Bodard, Sylvain Lager, Franck Renault, Gilles Chavignon, Arthur Couture, Olivier

Whole organ volumetric sensing Ultrasound Localization Microscopy for characterization of kidney structure.

Glomeruli are the filtration units of the kidney and their function relies heavily on their microcirculation. Despite its obvious diagnostic importance, an accurate estimation of blood flow in the capillary bundle within glomeruli defies the resolution of conventional imaging modalities. Ultrasound Localization Microscopy (ULM) has demonstrated its ability to image in-vivo deep organs in the body. Recently, the concept of sensing ULM, or sULM, was introduced to classify individual microbubble behavior based on the expected physiological conditions at the micrometric scale. In the kidneys of both rats and humans, it revealed glomerular structures in 2D but was severely limited by planar projection. In this work, we aim to extend sULM to 3D to image the whole organ and perform an accurate characterization of the entire kidney structure. The extension of sULM into the 3D domain allows better localization and more robust tracking. The 3D metrics of velocity and pathway angular shift made a glomerular mask possible. This approach facilitated the quantification of glomerular physiological parameters, such as an interior traveled distance of approximately 7.5 ± 0.6 microns within the glomerulus. This study introduces a technique that characterizes kidney physiology and can serve as a method to facilitate pathology assessment. Furthermore, its potential clinical relevance could serve as a bridge between research and practical application, leading to innovative diagnostics and improved patient care.

AU Yang, Bao Gong, Kuang Liu, Huafeng Li, Quanzheng Zhu, Wentao

Anatomically Guided PET Image Reconstruction Using Conditional Weakly-Supervised Multi-Task Learning Integrating Self-Attention

To address the lack of high-quality training labels in positron emission tomography (PET) imaging, weakly-supervised reconstruction methods that generate network-based mappings between prior images and noisy targets have been developed. However, the learned model has an intrinsic variance proportional to the average variance of the target image. To suppress noise and improve the accuracy and generalizability of the learned model, we propose a conditional weakly-supervised multi-task learning (MTL) strategy, in which an auxiliary task is introduced serving as an anatomical regularizer for the PET reconstruction main task. In the proposed MTL approach, we devise a novel multi-channel self-attention (MCSA) module that helps learn an optimal combination of shared and task-specific features by capturing both local and global channel-spatial dependencies. The proposed reconstruction method was evaluated on NEMA phantom PET datasets acquired at different positions in a PET/CT scanner and 26 clinical whole-body PET datasets. The phantom results demonstrate that our method outperforms state-of-the-art learning-free and weakly-supervised approaches obtaining the best noise/contrast tradeoff with a significant noise reduction of approximately 50.0% relative to the maximum likelihood (ML) reconstruction. The patient study results demonstrate that our method achieves the largest noise reductions of 67.3% and 35.5% in the liver and lung, respectively, as well as consistently small biases in 8 tumors with various volumes and intensities. In addition, network visualization reveals that adding the auxiliary task introduces more anatomical information into PET reconstruction than adding only the anatomical loss, and the developed MCSA can abstract features and retain PET image details.

AU Payen, Thomas Crouzet, Sebastien Guillen, Nicolas Chen, Yao Chapelon, Jean-Yves Lafon, Cyril Catheline, Stefan

Passive Elastography for Clinical HIFU Lesion Detection

High-intensity Focused Ultrasound (HIFU) is a promising treatment modality for a wide range of pathologies including prostate cancer. However, the lack of a reliable ultrasound-based monitoring technique limits its clinical use. Ultrasound currently provides real-time HIFU planning, but its use for monitoring is usually limited to detecting the backscatter increase resulting from chaotic bubble appearance. HIFU has been shown to generate stiffening in various tissues, so elastography is an interesting lead for ablation monitoring. However, the standard techniques usually require the generation of a controlled push, which can be problematic in deeper organs. Passive elastography offers a potential alternative, as it uses the physiological wave field, rather than an external perturbation, to estimate tissue elasticity. This technique was adapted to process B-mode images acquired with a clinical system. It was first shown to faithfully assess elasticity in calibrated phantoms. The technique was then implemented on the Focal One® clinical system to evaluate its capacity to detect HIFU lesions in vitro (CNR = 9.2 dB), demonstrating its independence from the bubbles generated by HIFU, and in vivo, where the physiological wave field was successfully used to detect and delineate lesions of different sizes in porcine liver. Finally, the technique was performed for the very first time in four prostate cancer patients, showing strong variation in elasticity before and after HIFU treatment (average variation of 33.0 ± 16.0%). Passive elastography has shown evidence of its potential to monitor HIFU treatment and thus help spread its use.

AU Moazami, Saeed Ray, Deep Pelletier, Daniel Oberai, Assad A.

Probabilistic Brain Extraction in MR Images via Conditional Generative Adversarial Networks

Brain extraction, or the task of segmenting the brain in MR images, forms an essential step for many neuroimaging applications. These include quantifying brain tissue volumes, monitoring neurological diseases, and estimating brain atrophy. Several algorithms have been proposed for brain extraction, including image-to-image deep learning methods that have demonstrated significant gains in accuracy. However, none of them account for the inherent uncertainty in brain extraction. Motivated by this, we propose a novel, probabilistic deep learning algorithm for brain extraction that recasts this task as a Bayesian inference problem and utilizes a conditional generative adversarial network (cGAN) to solve it. The input to the cGAN's generator is an MR image of the head, and the output is a collection of likely brain images drawn from a probability density conditioned on the input. These images are used to generate a pixel-wise mean image, serving as the estimate for the extracted brain, and a standard deviation image, which quantifies the uncertainty in the prediction. We test our algorithm on head MR images from five datasets: NFBS, CC359, LPBA, IBSR, and their combination. Our datasets are heterogeneous regarding multiple factors, including subjects (with and without symptoms), magnetic field strengths, and manufacturers. Our experiments demonstrate that the proposed approach is more accurate and robust than a widely used brain extraction tool and at least as accurate as the other deep learning methods. They also highlight the utility of quantifying uncertainty in downstream applications.
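The pixel-wise mean and standard deviation images described above amount to a simple Monte-Carlo summary of the cGAN's conditional distribution; in the sketch below, `generator(image, z)` is an assumed interface and the sample count is arbitrary.

```python
import numpy as np

def sample_brain_extraction(generator, head_mri, latent_dim, n_samples=50):
    # Draw several plausible brain images conditioned on the same head MRI.
    draws = np.stack([generator(head_mri, np.random.randn(latent_dim))
                      for _ in range(n_samples)])
    mean_brain = draws.mean(axis=0)    # estimate of the extracted brain
    uncertainty = draws.std(axis=0)    # pixel-wise predictive uncertainty
    return mean_brain, uncertainty
```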

AU Wang, Yijun Lang, Rui Li, Rui Zhang, Junsong

NRTR: Neuron Reconstruction With Transformer From 3D Optical Microscopy Images

The neuron reconstruction from raw Optical Microscopy (OM) image stacks is the basis of neuroscience. Manual annotation and semi-automatic neuron tracing algorithms are time-consuming and inefficient. Existing deep learning neuron reconstruction methods, although demonstrating exemplary performance, heavily depend on complex rule-based components. Therefore, a crucial challenge is designing an end-to-end neuron reconstruction method that makes the overall framework simpler and model training easier. We propose a Neuron Reconstruction Transformer (NRTR) that, discarding the complex rule-based components, views neuron reconstruction as a direct set-prediction problem. To the best of our knowledge, NRTR is the first image-to-set deep learning model for end-to-end neuron reconstruction. The overall pipeline consists of the CNN backbone, Transformer encoder-decoder, and connectivity construction module. NRTR generates a point set representing neuron morphological characteristics for raw neuron images. The relationships among the points are established through connectivity construction. The point set is saved as a standard SWC file. In experiments using the BigNeuron and VISoR-40 datasets, NRTR achieves excellent neuron reconstruction results for comprehensive benchmarks and outperforms competitive baselines. Results of extensive experiments indicate that NRTR effectively demonstrates that neuron reconstruction can be treated as a set-prediction problem, making end-to-end model training feasible.
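The SWC format mentioned above is a plain-text list of nodes, one per line, as `id type x y z radius parent` (parent -1 marks a root). A minimal writer for a reconstructed point set might look as follows; the fixed type code and radius are simplifying assumptions.

```python
def save_swc(points, parents, path, node_type=3, radius=1.0):
    # points: iterable of (x, y, z); parents: parent id per node (-1 = root).
    with open(path, "w") as f:
        for i, ((x, y, z), parent) in enumerate(zip(points, parents), start=1):
            f.write(f"{i} {node_type} {x:.3f} {y:.3f} {z:.3f} {radius} {parent}\n")
```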

AU de Vente, Coen Vermeer, Koenraad A. Jaccard, Nicolas Wang, He Sun, Hongyi Khader, Firas Truhn, Daniel Aimyshev, Temirgali Zhanibekuly, Yerkebulan Le, Tien-Dung Galdran, Adrian Ballester, Miguel Angel Gonzalez Carneiro, Gustavo Devika, R. G. Sethumadhavan, Hrishikesh Panikkasseril Puthussery, Densen Liu, Hong Yang, Zekang Kondo, Satoshi Kasai, Satoshi Wang, Edward Durvasula, Ashritha Heras, Jonathan Zapata, Miguel Angel Araujo, Teresa Aresta, Guilherme Bogunovic, Hrvoje Arikan, Mustafa Lee, Yeong Chan Cho, Hyun Bin Choi, Yoon Ho Qayyum, Abdul Razzak, Imran van Ginneken, Bram Lemij, Hans G. Sanchez, Clara I.

AIROGS: Artificial Intelligence for Robust Glaucoma Screening Challenge

The early detection of glaucoma is essential in preventing visual impairment. Artificial intelligence (AI) can be used to analyze color fundus photographs (CFPs) in a cost-effective manner, making glaucoma screening more accessible. While AI models for glaucoma screening from CFPs have shown promising results in laboratory settings, their performance decreases significantly in real-world scenarios due to the presence of out-of-distribution and low-quality images. To address this issue, we propose the Artificial Intelligence for Robust Glaucoma Screening (AIROGS) challenge. This challenge includes a large dataset of around 113,000 images from about 60,000 patients and 500 different screening centers, and encourages the development of algorithms that are robust to ungradable and unexpected input data. We evaluated solutions from 14 teams in this paper and found that the best teams performed similarly to a set of 20 expert ophthalmologists and optometrists. The highest-scoring team achieved an area under the receiver operating characteristic curve of 0.99 (95% CI: 0.98-0.99) for detecting ungradable images on-the-fly. Additionally, many of the algorithms showed robust performance when tested on three other publicly available datasets. These results demonstrate the feasibility of robust AI-enabled glaucoma screening.

AU Gungor, Alper Askin, Baris Soydan, Damla Alptekin Top, Can Baris Saritas, Emine Ulku Cukur, Tolga

DEQ-MPI: A Deep Equilibrium Reconstruction With Learned Consistency for Magnetic Particle Imaging

Magnetic particle imaging (MPI) offers unparalleled contrast and resolution for tracing magnetic nanoparticles. A common imaging procedure calibrates a system matrix (SM) that is used to reconstruct data from subsequent scans. The ill-posed reconstruction problem can be solved by simultaneously enforcing data consistency based on the SM and regularizing the solution based on an image prior. Traditional hand-crafted priors cannot capture the complex attributes of MPI images, whereas recent MPI methods based on learned priors can suffer from extensive inference times or limited generalization performance. Here, we introduce a novel physics-driven method for MPI reconstruction based on a deep equilibrium model with learned data consistency (DEQ-MPI). DEQ-MPI reconstructs images by augmenting neural networks into an iterative optimization, as inspired by unrolling methods in deep learning. Yet, conventional unrolling methods are computationally restricted to few iterations resulting in non-convergent solutions, and they use hand-crafted consistency measures that can yield suboptimal capture of the data distribution. DEQ-MPI instead trains an implicit mapping to maximize the quality of a convergent solution, and it incorporates a learned consistency measure to better account for the data distribution. Demonstrations on simulated and experimental data indicate that DEQ-MPI achieves superior image quality and competitive inference time to state-of-the-art MPI reconstruction methods.
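At the core of a deep equilibrium layer is a fixed-point solve z* = f(z*, x), so the network behaves like an infinitely deep weight-tied model evaluated at its equilibrium. The sketch below uses plain fixed-point iteration for the forward pass; DEQ-MPI's actual solver, learned consistency module, and implicit-gradient backward pass are not reproduced here.

```python
import torch

def deq_forward(f, x, z0, max_iter=50, tol=1e-4):
    # Iterate z <- f(z, x) until the relative update falls below tol.
    z = z0
    for _ in range(max_iter):
        z_next = f(z, x)
        if (z_next - z).norm() / (z.norm() + 1e-8) < tol:
            return z_next
        z = z_next
    return z
```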

AU Hahne, Christopher Chabouh, Georges Chavignon, Arthur Couture, Olivier Sznitman, Raphael

RF-ULM: Ultrasound Localization Microscopy Learned From Radio-Frequency Wavefronts

In Ultrasound Localization Microscopy (ULM), achieving high-resolution images relies on the precise localization of contrast agent particles across a series of beamformed frames. However, our study uncovers an enormous potential: the process of delay-and-sum beamforming leads to an irreversible reduction of Radio-Frequency (RF) channel data, while its implications for localization remain largely unexplored. The rich contextual information embedded within RF wavefronts, including their hyperbolic shape and phase, offers great promise for guiding Deep Neural Networks (DNNs) in challenging localization scenarios. To fully exploit this data, we propose to directly localize scatterers in RF channel data. Our approach involves a custom super-resolution DNN using learned feature channel shuffling, non-maximum suppression, and a semi-global convolutional block for reliable and accurate wavefront localization. Additionally, we introduce a geometric point transformation that facilitates seamless mapping to the B-mode coordinate space. To understand the impact of beamforming on ULM, we validate the effectiveness of our method by conducting an extensive comparison with State-Of-The-Art (SOTA) techniques. We present the inaugural in vivo results from a wavefront-localizing DNN, highlighting its real-world practicality. Our findings show that RF-ULM bridges the domain shift between synthetic and real datasets, offering a considerable advantage in terms of precision and complexity, and enabling the broader research community to benefit from our findings.

AU Zhu, Tao Yin, Lin He, Jie Wei, Zechen Yang, Xin Tian, Jie Hui, Hui

Accurate Concentration Recovery for Quantitative Magnetic Particle Imaging Reconstruction via Nonconvex Regularization

Magnetic particle imaging (MPI) uses nonlinear response signals to noninvasively detect magnetic nanoparticles in space, and its quantitative properties hold promise for future precise quantitative treatments. In reconstruction, the system matrix based method necessitates suitable regularization terms, such as Tikhonov or non-negative fused lasso (NFL) regularization, to stabilize the solution. While NFL regularization offers clearer edge information than Tikhonov regularization, it inherits the biased estimation of the l1 penalty, leading to an underestimation of the reconstructed concentration and adversely affecting the quantitative properties. In this paper, a new nonconvex regularization method including min-max concave (MC) and total variation (TV) regularization is proposed. This method utilizes the MC penalty to provide nearly unbiased sparse constraints and adds the TV penalty to provide a uniform intensity distribution of images. By combining the alternating direction method of multipliers (ADMM) and the two-step parameter selection method, a more accurate quantitative MPI reconstruction was realized. The performance of the proposed method was verified on the simulation data, the Open-MPI dataset, and measured data from a homemade MPI scanner. The results indicate that the proposed method achieves better image quality while maintaining the quantitative properties, thus overcoming the intensity underestimation of the NFL method while providing edge information. In particular, for the measured data, the proposed method reduced the relative error in the intensity of the reconstruction results from 28% to 8%.
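For context, the proximal operator of the minimax concave (MC) penalty is the classic firm-thresholding rule: small coefficients are soft-thresholded, while large ones pass through unshrunk, which is why the penalty yields nearly unbiased sparse estimates. A sketch under a unit step size (gamma > 1 required); this is the textbook form, not necessarily the paper's exact ADMM subproblem.

```python
import numpy as np

def mcp_prox(v, lam, gamma):
    # Firm thresholding: 0 for |v| <= lam, rescaled soft threshold for
    # lam < |v| <= gamma*lam, identity (no shrinkage) for |v| > gamma*lam.
    shrunk = np.sign(v) * np.maximum(np.abs(v) - lam, 0.0) / (1.0 - 1.0 / gamma)
    return np.where(np.abs(v) <= gamma * lam, shrunk, v)
```

Inside an ADMM loop, this prox would replace the soft-thresholding step associated with an l1 penalty.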

AU Chen, Kecheng Qin, Tiexin Lee, Victor Ho-Fun Yan, Hong Li, Haoliang

Learning Robust Shape Regularization for Generalizable Medical Image Segmentation

Generalizable medical image segmentation enables models to generalize to unseen target domains under domain shift issues. Recent progress demonstrates that the shape of the segmentation objective, with its high consistency and robustness across domains, can serve as a reliable regularization to aid the model for better cross-domain performance, where existing methods typically seek a shared framework to render segmentation maps and shape prior concurrently. However, due to the inherent texture and style preference of modern deep neural networks, the edge or silhouette of the extracted shape will inevitably be undermined by those domain-specific texture and style interferences of medical images under domain shifts. To address this limitation, we devise a novel framework with a separation between the shape regularization and the segmentation map. Specifically, we first customize a novel whitening transform-based probabilistic shape regularization extractor namely WT-PSE to suppress undesirable domain-specific texture and style interferences, leading to more robust and high-quality shape representations. Second, we deliver a Wasserstein distance-guided knowledge distillation scheme to help the WT-PSE to achieve more flexible shape extraction during the inference phase. Finally, by incorporating domain knowledge of medical images, we propose a novel instance-domain whitening transform method to facilitate a more stable training process with improved performance. Experiments demonstrate the performance of our proposed method on both multi-domain and single-domain generalization.

AU Huang, Zixun Zhao, Rui Leung, Frank H. F. Banerjee, Sunetra Lam, Kin-Man Zheng, Yong-Ping Ling, Sai Ho

Landmark Localization From Medical Images With Generative Distribution Prior

In medical image analysis, anatomical landmarks usually contain strong prior knowledge of their structural information. In this paper, we propose to promote medical landmark localization by modeling the underlying landmark distribution via normalizing flows. Specifically, we introduce the flow-based landmark distribution prior as a learnable objective function into a regression-based landmark localization framework. Moreover, we employ an integral operation to make the mapping from heatmaps to coordinates differentiable to further enhance heatmap-based localization with the learned distribution prior. Our proposed Normalizing Flow-based Distribution Prior (NFDP) employs a straightforward backbone and non-problem-tailored architecture (i.e., ResNet18), which delivers high-fidelity outputs across three X-ray-based landmark localization datasets. Remarkably, the proposed NFDP can do the job with minimal additional computational burden as the normalizing flows module is detached from the framework on inferencing. As compared to existing techniques, our proposed NFDP provides a superior balance between prediction accuracy and inference speed, making it a highly efficient and effective approach. The source code of this paper is available at https://github.com/jacksonhzx95/NFDP.
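The integral operation that makes the heatmap-to-coordinate mapping differentiable is commonly realized as a soft-argmax: the heatmap is normalized into a distribution and the expected coordinate is taken under it. A minimal 2D version (the temperature beta is an illustrative choice):

```python
import numpy as np

def soft_argmax_2d(heatmap, beta=10.0):
    h, w = heatmap.shape
    p = np.exp(beta * (heatmap - heatmap.max()))   # numerically stable softmax
    p /= p.sum()
    ys, xs = np.mgrid[0:h, 0:w]
    return (p * ys).sum(), (p * xs).sum()          # expected (row, col)
```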

AU Tajbakhsh, Kiarash Stanowska, Olga Neels, Antonia Perren, Aurel Zboray, Robert

3D Virtual Histopathology by Phase-Contrast X-Ray Micro-CT for Follicular Thyroid Neoplasms

Histological analysis is the core of follicular thyroid carcinoma (FTC) classification. The histopathological criteria of capsular and vascular invasion define the malignancy and aggressiveness of FTC. Analysis of multiple sections is cumbersome, and as only a minute tissue fraction is analyzed during histopathology, under-sampling remains a problem. Application of an efficient tool for complete tissue imaging in 3D would speed up diagnosis and increase accuracy. We show that X-ray propagation-based imaging (XPBI) of paraffin-embedded tissue blocks is a valuable complementary method for follicular thyroid carcinoma diagnosis and assessment. It enables a fast, non-destructive and accurate 3D virtual histology of the FTC resection specimen. We demonstrate that XPBI virtual slices can reliably evaluate capsular invasions. Then we discuss the morphological information accessible from XPBI and its significance for vascular invasion diagnosis. We show 3D morphological information that allows vascular invasions to be discerned. The results are validated by comparing XPBI images with clinically accepted histology slides revised by and under the supervision of two experienced endocrine pathologists.

AU Cui, Jiaqi Zeng, Pinxian Zeng, Xinyi Xu, Yuanyuan Wang, Peng Zhou, Jiliu Wang, Yan Shen, Dinggang

Prior Knowledge-guided Triple-Domain Transformer-GAN for Direct PET Reconstruction from Low-Count Sinograms.

To obtain high-quality positron emission tomography (PET) images while minimizing radiation exposure, numerous methods have been dedicated to acquiring standard-count PET (SPET) from low-count PET (LPET). However, current methods have failed to take full advantage of the different emphasized information from multiple domains, i.e., the sinogram, image, and frequency domains, resulting in the loss of crucial details. Meanwhile, they overlook the unique inner structure of the sinograms, thereby failing to fully capture their structural characteristics and relationships. To alleviate these problems, in this paper, we propose a prior knowledge-guided transformer-GAN that unites the triple domains of sinogram, image, and frequency to directly reconstruct SPET images from LPET sinograms, namely PK-TriDo. Our PK-TriDo consists of a Sinogram Inner-Structure-based Denoising Transformer (SISD-Former) to denoise the input LPET sinogram, a Frequency-adapted Image Reconstruction Transformer (FaIR-Former) to reconstruct high-quality SPET images from the denoised sinograms guided by the image domain prior knowledge, and an Adversarial Network (AdvNet) to further enhance the reconstruction quality via adversarial training. Tailored to the PET imaging mechanism, a sinogram embedding module is injected that partitions the sinograms by rows and columns into 1D sequences of angles and distances, faithfully preserving the inner structure of the sinograms. Moreover, to mitigate high-frequency distortions and enhance reconstruction details, we integrated global-local frequency parsers (GLFPs) into FaIR-Former to calibrate the distributions and proportions of different frequency bands, thus compelling the network to preserve high-frequency details. Evaluations on three datasets with different dose levels and imaging scenarios demonstrated that our PK-TriDo outperforms the state-of-the-art methods.

AU Huang, Bangyan Li, Tiantian Arino-Estrada, Gerard Dulski, Kamil Shopa, Roman Y. Moskal, Pawel Stepien, Ewa Qi, Jinyi

SPLIT: Statistical Positronium Lifetime Image Reconstruction via Time-Thresholding

Positron emission tomography (PET) is a widely utilized medical imaging modality that uses positron-emitting radiotracers to visualize biochemical processes in a living body. The spatiotemporal distribution of a radiotracer is estimated by detecting the coincidence photon pairs generated through positron annihilations. In human tissue, about 40% of the positrons form positroniums prior to the annihilation. The lifetime of these positroniums is influenced by the microenvironment in the tissue and could provide valuable information for better understanding of disease progression and treatment response. Currently, there are few methods available for reconstructing high-resolution lifetime images in practical applications. This paper presents an efficient statistical image reconstruction method for positronium lifetime imaging (PLI). We also analyze the random triple-coincidence events in PLI and propose a correction method for random events, which is essential for real applications. Both simulation and experimental studies demonstrate that the proposed method can produce lifetime images with high numerical accuracy, low variance, and resolution comparable to that of the activity images generated by a PET scanner with currently available time-of-flight resolution.

AU Wu, Qian Chen, Yufei Liu, Wei Yue, Xiaodong Zhuang, Xiahai

Deep Closing: Enhancing Topological Connectivity in Medical Tubular Segmentation.

Accurately segmenting tubular structures, such as blood vessels or nerves, holds significant clinical implications across various medical applications. However, existing methods often exhibit limitations in achieving satisfactory topological performance, particularly in terms of preserving connectivity. To address this challenge, we propose a novel deep-learning approach, termed Deep Closing, inspired by the well-established classic closing operation. Deep Closing first leverages an AutoEncoder trained in the Masked Image Modeling (MIM) paradigm, enhanced with digital topology knowledge, to effectively learn the inherent shape prior of tubular structures and indicate potential disconnected regions. Subsequently, a Simple Components Erosion module is employed to generate topology-focused outcomes, which refines the preceding segmentation results, ensuring all the generated regions are topologically significant. To evaluate the efficacy of Deep Closing, we conduct comprehensive experiments on 4 datasets: DRIVE, CHASE DB1, DCA1, and CREMI. The results demonstrate that our approach yields considerable improvements in topological performance compared with existing methods. Furthermore, Deep Closing exhibits the ability to generalize and transfer knowledge from external datasets, showcasing its robustness and adaptability. The code for this paper has been available at: https://github.com/5k5000/DeepClosing.
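For reference, the classic closing operation that inspired the method is a dilation followed by an erosion; it bridges small gaps in a tubular mask, but with a fixed structuring element rather than a learned shape prior. A toy example:

```python
import numpy as np
from scipy import ndimage

mask = np.zeros((64, 64), dtype=bool)
mask[32, 10:30] = True
mask[32, 33:55] = True                             # a vessel with a 3-pixel break
closed = ndimage.binary_closing(mask, structure=np.ones((3, 5)))
print(closed[32, 30:33])                           # [ True  True  True ]: gap bridged
```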

AU Cao, Chentao Cui, Zhuo-Xu Wang, Yue Liu, Shaonan Chen, Taijin Zheng, Hairong Liang, Dong Zhu, Yanjie

High-Frequency Space Diffusion Model for Accelerated MRI

Diffusion models with continuous stochastic differential equations (SDEs) have shown superior performances in image generation. Such a model can serve as a deep generative prior for solving the inverse problem in magnetic resonance (MR) reconstruction. However, low-frequency regions of k-space data are typically fully sampled in fast MR imaging, while existing diffusion models are performed throughout the entire image or k-space, inevitably introducing uncertainty in the reconstruction of low-frequency regions. Additionally, existing diffusion models often demand substantial iterations to converge, resulting in time-consuming reconstructions. To address these challenges, we propose a novel SDE tailored specifically for MR reconstruction with the diffusion process in high-frequency space (referred to as HFS-SDE). This approach ensures determinism in the fully sampled low-frequency regions and accelerates the sampling procedure of reverse diffusion. Experiments conducted on the publicly available fastMRI dataset demonstrate that the proposed HFS-SDE method outperforms traditional parallel imaging methods, supervised deep learning, and existing diffusion models in terms of reconstruction accuracy and stability. The fast convergence properties are also confirmed through theoretical and experimental validation.
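The low/high-frequency split underlying HFS-SDE can be pictured as masking out the fully sampled central block of k-space and diffusing only over the remainder; the sketch below shows such a split, with the calibration-region size an arbitrary illustrative value.

```python
import numpy as np

def split_kspace(image, calib=24):
    k = np.fft.fftshift(np.fft.fft2(image))
    h, w = k.shape
    low = np.zeros_like(k, dtype=bool)
    low[h//2 - calib//2 : h//2 + calib//2,
        w//2 - calib//2 : w//2 + calib//2] = True   # central low frequencies
    return k * low, k * (~low)                      # (kept deterministic, diffused)
```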

AU Liu, Min Han, Yubin Wang, Jiazheng Wang, Can Wang, Yaonan Meijering, Erik

LSKANet: Long Strip Kernel Attention Network for Robotic Surgical Scene Segmentation

Surgical scene segmentation is a critical task in Robotic-assisted surgery. However, the complexity of the surgical scene, which mainly includes local feature similarity (e.g., between different anatomical tissues), intraoperative complex artifacts, and indistinguishable boundaries, poses significant challenges to accurate segmentation. To tackle these problems, we propose the Long Strip Kernel Attention network (LSKANet), including two well-designed modules named Dual-block Large Kernel Attention module (DLKA) and Multiscale Affinity Feature Fusion module (MAFF), which can implement precise segmentation of surgical images. Specifically, by introducing strip convolutions with different topologies (cascaded and parallel) in two blocks and a large kernel design, DLKA can make full use of region- and strip-like surgical features and extract both visual and structural information to reduce the false segmentation caused by local feature similarity. In MAFF, affinity matrices calculated from multiscale feature maps are applied as feature fusion weights, which helps to address the interference of artifacts by suppressing the activations of irrelevant regions. Besides, the hybrid loss with Boundary Guided Head (BGH) is proposed to help the network segment indistinguishable boundaries effectively. We evaluate the proposed LSKANet on three datasets with different surgical scenes. The experimental results show that our method achieves new state-of-the-art results on all three datasets with improvements of 2.6%, 1.4%, and 3.4% mIoU, respectively. Furthermore, our method is compatible with different backbones and can significantly increase their segmentation accuracy. Code is available at https://github.com/YubinHan73/LSKANet.
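The strip convolutions mentioned above approximate a large k x k receptive field with a 1 x k convolution followed by a k x 1 convolution, which suits elongated, strip-like instruments and tissue boundaries at a fraction of the parameters. A bare-bones PyTorch sketch of the cascaded variant (not the paper's full DLKA module):

```python
import torch.nn as nn

class CascadedStripConv(nn.Module):
    def __init__(self, channels, k=11):
        super().__init__()
        self.conv_h = nn.Conv2d(channels, channels, (1, k), padding=(0, k // 2))
        self.conv_v = nn.Conv2d(channels, channels, (k, 1), padding=(k // 2, 0))

    def forward(self, x):
        # Horizontal then vertical strip: roughly 2k weights per channel pair
        # instead of k*k for a dense kernel of the same spatial extent.
        return self.conv_v(self.conv_h(x))
```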

AU Wang, Ke Chen, Zicong Zhu, Mingjia Li, Zhetao Weng, Jian Gu, Tianlong

Score-based Counterfactual Generation for Interpretable Medical Image Classification and Lesion Localization.

Deep neural networks (DNNs) have immense potential for precise clinical decision-making in the field of biomedical imaging. However, accessing high-quality data is crucial for ensuring the high-performance of DNNs. Obtaining medical imaging data is often challenging in terms of both quantity and quality. To address these issues, we propose a score-based counterfactual generation (SCG) framework to create counterfactual images from latent space, to compensate for scarcity and imbalance of data. In addition, some uncertainties in external physical factors may introduce unnatural features and further affect the estimation of the true data distribution. Therefore, we integrated a learnable FuzzyBlock into the classifier of the proposed framework to manage these uncertainties. The proposed SCG framework can be applied to both classification and lesion localization tasks. The experimental results revealed a remarkable performance boost in classification tasks, achieving an average performance enhancement of 3-5% compared to previous state-of-the-art (SOTA) methods in interpretable lesion localization.

AU Jin, Liang Gu, Shixuan Wei, Donglai Adhinarta, Jason Ken Kuang, Kaiming Zhang, Yongjie Jessica Pfister, Hanspeter Ni, Bingbing Yang, Jiancheng Li, Ming

RibSeg v2: A Large-Scale Benchmark for Rib Labeling and Anatomical Centerline Extraction

Automatic rib labeling and anatomical centerline extraction are common prerequisites for various clinical applications. Prior studies either use in-house datasets that are inaccessible to communities, or focus on rib segmentation that neglects the clinical significance of rib labeling. To address these issues, we extend our prior dataset (RibSeg) on the binary rib segmentation task to a comprehensive benchmark, named RibSeg v2, with 660 CT scans (15,466 individual ribs in total) and annotations manually inspected by experts for rib labeling and anatomical centerline extraction. Based on the RibSeg v2, we develop a pipeline including deep learning-based methods for rib labeling, and a skeletonization-based method for centerline extraction. To improve computational efficiency, we propose a sparse point cloud representation of CT scans and compare it with standard dense voxel grids. Moreover, we design and analyze evaluation metrics to address the key challenges of each task. Our dataset, code, and model are available online to facilitate open research at https://github.com/M3DV/RibSeg.
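A sparse point-cloud representation of a CT scan can be as simple as keeping the coordinates of voxels above a bone-like intensity threshold and subsampling them; the threshold and point budget below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def ct_to_point_cloud(volume, hu_threshold=200, max_points=30_000):
    pts = np.argwhere(volume > hu_threshold).astype(np.float32)  # (N, 3) voxel coords
    if len(pts) > max_points:
        keep = np.random.choice(len(pts), max_points, replace=False)
        pts = pts[keep]
    return pts
```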

AU Lei, Wenhui Su, Qi Jiang, Tianyu Gu, Ran Wang, Na Liu, Xinglong Wang, Guotai Zhang, Xiaofan Zhang, Shaoting

One-Shot Weakly-Supervised Segmentation in 3D Medical Images

Deep neural networks typically require a large number of accurate annotations to achieve outstanding performance in medical image segmentation. One-shot and weakly-supervised learning are promising research directions that reduce labeling effort by learning a new class from only one annotated image and using coarse labels instead, respectively. In this work, we present an innovative framework for 3D medical image segmentation with one-shot and weakly-supervised settings. Firstly a propagation-reconstruction network is proposed to propagate scribbles from one annotated volume to unlabeled 3D images based on the assumption that anatomical patterns in different human bodies are similar. Then a multi-level similarity denoising module is designed to refine the scribbles based on embeddings from anatomical- to pixel-level. After expanding the scribbles to pseudo masks, we observe that misclassified voxels mainly occur at the border region and propose to extract self-support prototypes for targeted refinement. Based on these weakly-supervised segmentation results, we further train a segmentation model for the new class with the noisy label training strategy. Experiments on three CT and one MRI datasets show the proposed method obtains significant improvement over the state-of-the-art methods and performs robustly even under severe class imbalance and low contrast. Code is publicly available at https://github.com/LWHYC/OneShot_WeaklySeg.

AU van Garderen, Karin A. van der Voort, Sebastian R. Wijnenga, Maarten M. J. Incekara, Fatih Alafandi, Ahmad Kapsas, Georgios Gahrmann, Renske Schouten, Joost W. Dubbink, Hendrikus J. Vincent, Arnaud J. P. E. van den Bent, Martin French, Pim J. Smits, Marion Klein, Stefan

Evaluating the Predictive Value of Glioma Growth Models for Low-Grade Glioma After Tumor Resection

Tumor growth models have the potential to model and predict the spatiotemporal evolution of glioma in individual patients. Infiltration of glioma cells is known to be faster along the white matter tracts, and therefore structural magnetic resonance imaging (MRI) and diffusion tensor imaging (DTI) can be used to inform the model. However, applying and evaluating growth models in real patient data is challenging. In this work, we propose to formulate the problem of tumor growth as a ranking problem, as opposed to a segmentation problem, and use the average precision (AP) as a performance metric. This enables an evaluation of the spatial pattern that does not require a volume cut-off value. Using the AP metric, we evaluate diffusion-proliferation models informed by structural MRI and DTI, after tumor resection. We applied the models to a unique longitudinal dataset of 14 patients with low-grade glioma (LGG), who received no treatment after surgical resection, to predict the recurrent tumor shape after tumor resection. The diffusion models informed by structural MRI and DTI showed a small but significant increase in predictive performance with respect to homogeneous isotropic diffusion, and the DTI-informed model reached the best predictive performance. We conclude there is a significant improvement in the prediction of the recurrent tumor shape when using a DTI-informed anisotropic diffusion model with respect to isotropic diffusion, and that the AP is a suitable metric to evaluate these models. All code and data used in this publication are made publicly available.
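Framing growth prediction as ranking means every voxel receives a model score (e.g., simulated tumor cell density) and a binary label (inside the observed recurrence or not), after which the average precision summarizes the spatial pattern without any volume cut-off. A sketch using scikit-learn; the array names are assumed NumPy inputs:

```python
from sklearn.metrics import average_precision_score

def spatial_average_precision(predicted_density, recurrence_mask):
    # predicted_density: per-voxel model score; recurrence_mask: 0/1 array.
    return average_precision_score(recurrence_mask.ravel().astype(int),
                                   predicted_density.ravel())
```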

AU Lin, Yiyang Wang, Yifeng Fang, Zijie Li, Zexin Guan, Xianchao Jiang, Danling Zhang, Yongbing

A Multi-Perspective Self-Supervised Generative Adversarial Network for FS to FFPE Stain Transfer.

In clinical practice, frozen section (FS) images can be utilized to obtain patients' immediate pathological results during surgery, owing to their fast production speed. However, compared with formalin-fixed and paraffin-embedded (FFPE) images, FS images suffer from poor quality. Thus, it is of great significance to transfer FS images to FFPE ones, which enables pathologists to observe high-quality images intraoperatively. However, obtaining paired FS and FFPE images is quite hard, so it is difficult to obtain accurate results using supervised methods. Apart from this, FS to FFPE stain transfer faces many challenges. Firstly, the number and position of nuclei scattered throughout the image are hard to maintain during the transfer process. Secondly, transferring the blurry FS images to clear FFPE ones is quite challenging. Thirdly, compared with the center regions of each patch, the edge regions are harder to transfer. To overcome these problems, a multi-perspective self-supervised GAN, incorporating three auxiliary tasks, is proposed to improve the performance of FS to FFPE stain transfer. Concretely, a nucleus consistency constraint is designed to enable the high fidelity of nuclei, an FFPE-guided image deblurring is proposed for improving the clarity, and a multi-field-of-view consistency constraint is designed to better generate the edge regions. Objective indicators and pathologists' evaluation for experiments on the five datasets across different countries have demonstrated the effectiveness of our method. In addition, the validation in the downstream task of microsatellite instability prediction has also proved the performance improvement gained by transferring FS images to FFPE ones. Our code link is https://github.com/linyiyang98/Self-Supervised-FS2FFPE.git.

EI 1558-254X DA 2024-09-18 UT MEDLINE:39283778 PM 39283778 ER

AU Chen, Zhongyu Bian, Yun Shen, Erwei Fan, Ligang Zhu, Weifang Shi, Fei Shao, Chengwei Chen, Xinjian Xiang, Dehui

Moment-Consistent Contrastive CycleGAN for Cross-Domain Pancreatic Image Segmentation.

CT and MR are currently the most common imaging techniques for pancreatic cancer diagnosis. Accurate segmentation of the pancreas in CT and MR images can provide significant help in the diagnosis and treatment of pancreatic cancer. Traditional supervised segmentation methods require a large number of labeled CT and MR training data, which is usually time-consuming and laborious. Meanwhile, due to domain shift, traditional segmentation networks are difficult to be deployed on different imaging modality datasets. Cross-domain segmentation can utilize labeled source domain data to assist unlabeled target domains in solving the above problems. In this paper, a cross-domain pancreas segmentation algorithm is proposed based on Moment-Consistent Contrastive Cycle Generative Adversarial Networks (MC-CCycleGAN). MC-CCycleGAN is a style transfer network, in which the encoder of its generator is used to extract features from real images and style transfer images, constrain feature extraction through a contrastive loss, and fully extract structural features of input images during style transfer while eliminate redundant style features. The multi-order central moments of the pancreas are proposed to describe its anatomy in high dimensions and a contrastive loss is also proposed to constrain the moment consistency, so as to maintain consistency of the pancreatic structure and shape before and after style transfer. Multi-teacher knowledge distillation framework is proposed to transfer the knowledge from multiple teachers to a single student, so as to improve the robustness and performance of the student network. The experimental results have demonstrated the superiority of our framework over state-of-the-art domain adaptation methods.
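The multi-order central moments used in the moment-consistency constraint are translation-invariant shape descriptors of the pancreas mask; a 2D version up to a chosen order might be computed as follows (the dictionary layout and order limit are illustrative choices):

```python
import numpy as np

def central_moments(mask, max_order=3):
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()              # centroid of the mask
    return {(p, q): ((ys - cy) ** p * (xs - cx) ** q).mean()
            for p in range(max_order + 1)
            for q in range(max_order + 1 - p)}
```

A moment-consistency loss would then compare these descriptors before and after style transfer.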

AU Mou, Lei Yan, Qifeng Lin, Jinghui Zhao, Yifan Liu, Yonghuai Ma, Shaodong Zhang, Jiong Lv, Wenhao Zhou, Tao Frangi, Alejandro F Zhao, Yitian

COSTA: A Multi-center TOF-MRA Dataset and A Style Self-Consistency Network for Cerebrovascular Segmentation.

Time-of-flight magnetic resonance angiography (TOF-MRA) is the least invasive and ionizing radiation-free approach for cerebrovascular imaging, but variations in imaging artifacts across different clinical centers and imaging vendors result in inter-site and inter-vendor heterogeneity, making accurate and robust cerebrovascular segmentation challenging. Moreover, the limited availability and quality of annotated data pose further challenges for segmentation methods to generalize well to unseen datasets. In this paper, we construct the largest and most diverse TOF-MRA dataset (COSTA) from 8 individual imaging centers, with all the volumes manually annotated. Then we propose a novel network for cerebrovascular segmentation, namely CESAR, with the ability to tackle feature granularity and image style heterogeneity issues. Specifically, a coarse-to-fine architecture is implemented to refine cerebrovascular segmentation in an iterative manner. An automatic feature selection module is proposed to selectively fuse global long-range dependencies and local contextual information of cerebrovascular structures. A style self-consistency loss is then introduced to explicitly align diverse styles of TOF-MRA images to a standardized one. Extensive experimental results on the COSTA dataset demonstrate the effectiveness of our CESAR network against state-of-the-art methods. We have made 6 subsets of COSTA, together with the source code, available online in order to promote relevant research in the community.

AU Sharifzadeh, Mostafa Goudarzi, Sobhan Tang, An Benali, Habib Rivaz, Hassan

Mitigating Aberration-Induced Noise: A Deep Learning-Based Aberration-to-Aberration Approach.

One of the primary sources of suboptimal image quality in ultrasound imaging is phase aberration. It is caused by spatial changes in sound speed over a heterogeneous medium, which disturbs the transmitted waves and prevents coherent summation of echo signals. Obtaining non-aberrated ground truths in real-world scenarios can be extremely challenging, if not impossible. This challenge hinders the performance of deep learning-based techniques due to the domain shift between simulated and experimental data. Here, for the first time, we propose a deep learning-based method that does not require ground truth to correct the phase aberration problem and, as such, can be directly trained on real data. We train a network wherein both the input and target output are randomly aberrated radio frequency (RF) data. Moreover, we demonstrate that a conventional loss function such as mean square error is inadequate for training such a network to achieve optimal performance. Instead, we propose an adaptive mixed loss function that employs both B-mode and RF data, resulting in more efficient convergence and enhanced performance. Finally, we publicly release our dataset, comprising over 180,000 aberrated single plane-wave images (RF data), wherein phase aberrations are modeled as near-field phase screens. Although not utilized in the proposed method, each aberrated image is paired with its corresponding aberration profile and the non-aberrated version, aiming to mitigate the data scarcity problem in developing deep learning-based techniques for phase aberration correction. Source code and trained model are also available along with the dataset at http://code.sonography.ai/main-aaa.
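
The sketch below shows one plausible form of such a mixed RF/B-mode objective, under our own assumptions: the B-mode term is obtained by standard envelope detection (Hilbert transform) and log compression, and alpha is a hypothetical fixed weight, whereas the paper's weighting is adaptive.

import numpy as np
from scipy.signal import hilbert

def bmode(rf, dynamic_range_db=60.0):
    """rf: (n_samples, n_lines) beamformed RF data."""
    env = np.abs(hilbert(rf, axis=0))          # analytic-signal envelope
    env = env / (env.max() + 1e-12)
    img = 20.0 * np.log10(env + 1e-12)         # log compression
    return np.clip(img, -dynamic_range_db, 0.0)

def mixed_loss(rf_pred, rf_target, alpha=0.5):
    rf_term = np.mean((rf_pred - rf_target) ** 2)
    bm_term = np.mean((bmode(rf_pred) - bmode(rf_target)) ** 2)
    return alpha * rf_term + (1.0 - alpha) * bm_term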

AU Huang, Junzhang Zhu, Xiongfeng Chen, Ziyang Lin, Guoye Huang, Meiyan Feng, Qianjin

Pathological Priors Inspired Network for Vertebral Osteophytes Recognition

Automatic vertebral osteophyte recognition in digital radiography is of great importance for the early prediction of degenerative disease, but it remains challenging because of the tiny size of osteophytes and the high inter-class similarity between normal and osteophyte vertebrae. Meanwhile, the sampling strategies commonly applied in convolutional neural networks can discard detailed context; all of these factors can lead to incorrect localization. In this paper, based on important pathological priors, we define a set of potential lesions for each vertebra and propose a novel Pathological Priors Inspired Network (PPIN) to achieve accurate osteophyte recognition. PPIN comprises a backbone feature extractor integrated with a Wavelet Transform Sampling module for extracting high-frequency detailed context, a detection branch for locating all potential lesions, and a classification branch for producing the final osteophyte recognition. An Anatomical Map-guided Filter between the two branches helps the network focus on the specific anatomical regions via the heatmaps of potential lesions generated by the detection branch, addressing the incorrect localization problem. To reduce the inter-class similarity, a Bilateral Augmentation Module based on graph relationships is proposed to imitate the clinical diagnosis process and extract discriminative contextual information between adjacent vertebrae in the classification branch. Experiments on two osteophyte-specific datasets collected from the public VinDr-Spine database show that the proposed PPIN achieves the best recognition performance among multitask frameworks and shows strong generalization. Results on a private dataset demonstrate its potential in clinical application. The class activation maps also show the powerful localization capability of PPIN. The source code is available at https://github.com/Phalo/PPIN.
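
As a stand-in for the Wavelet Transform Sampling idea, the following minimal sketch performs one level of a 2D Haar transform: the resolution is halved while the high-frequency detail bands are kept explicit rather than discarded, as strided pooling would do. This illustrates the principle only, not the module's actual implementation.

import torch

def haar_downsample(x):
    """x: (B, C, H, W) with even H and W. Returns (low, high_bands)."""
    a = x[..., 0::2, 0::2]
    b = x[..., 0::2, 1::2]
    c = x[..., 1::2, 0::2]
    d = x[..., 1::2, 1::2]
    low = (a + b + c + d) / 2.0                # LL band (approximation)
    lh = (a - b + c - d) / 2.0                 # horizontal detail
    hl = (a + b - c - d) / 2.0                 # vertical detail
    hh = (a - b - c + d) / 2.0                 # diagonal detail
    return low, torch.cat((lh, hl, hh), dim=1)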

AU Agarwal, Saurabh Arya, K V Meena, Yogesh Kumar

CNN-O-ELMNet: Optimized Lightweight and Generalized Model for Lung Disease Classification and Severity Assessment.

The high burden of lung diseases on healthcare necessitates effective detection methods. Current computer-aided diagnosis (CAD) systems are limited by their focus on specific diseases and by computationally demanding deep learning models. To overcome these challenges, we introduce CNN-O-ELMNet, a lightweight classification model designed to detect various lung diseases efficiently, surpassing the limitations of disease-specific CAD systems and the complexity of deep learning models. The model combines a convolutional neural network for deep feature extraction with an optimized extreme learning machine, utilizing the imperialist competitive algorithm for enhanced prediction. We then evaluated the effectiveness of CNN-O-ELMNet on benchmark lung disease datasets: distinguishing pneumothorax vs. non-pneumothorax, tuberculosis vs. normal, and lung cancer vs. healthy cases. Our findings demonstrate that CNN-O-ELMNet significantly outperformed (p < 0.05) state-of-the-art methods in binary classification for tuberculosis and cancer, achieving accuracies of 97.85% and 97.70%, respectively, while maintaining low computational complexity with only 2481 trainable parameters. We also extended the model to categorize lung disease severity based on Brixia scores, achieving 96.20% accuracy in the multi-class assessment of mild, moderate, and severe cases, which makes it suitable for deployment in lightweight healthcare devices.
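
A minimal extreme learning machine head of the kind described here fits in a few lines: a fixed random hidden layer followed by a closed-form ridge solution for the output weights. The imperialist-competitive-algorithm hyperparameter optimization used by the paper is omitted, and all names and defaults are hypothetical.

import numpy as np

class ELMHead:
    def __init__(self, n_features, n_hidden=512, reg=1e-3, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(n_features, n_hidden))   # fixed, untrained
        self.b = rng.normal(size=n_hidden)
        self.reg = reg
        self.beta = None

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, Y):
        """X: (n, n_features) CNN features; Y: (n, n_classes) one-hot."""
        H = self._hidden(X)
        A = H.T @ H + self.reg * np.eye(H.shape[1])
        self.beta = np.linalg.solve(A, H.T @ Y)            # ridge solution
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta                 # class scores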

AU Teng, Yingzhi Wu, Kai Liu, Jing Li, Yifan Teng, Xiangyi

Constructing High-order Functional Connectivity Networks with Temporal Information from fMRI Data.

Conducting functional connectivity analysis on functional magnetic resonance imaging (fMRI) data presents a significant and intricate challenge. Contemporary studies typically analyze fMRI data by constructing high-order functional connectivity networks (FCNs) due to their strong interpretability. However, these approaches often overlook temporal information, resulting in suboptimal accuracy. Temporal information plays a vital role in reflecting changes in blood oxygenation level-dependent signals. To address this shortcoming, we have devised a framework for extracting temporal dependencies from fMRI data and inferring high-order functional connectivity among regions of interest (ROIs). Our approach postulates that the current state can be determined by the FCN and the state at the previous time, effectively capturing temporal dependencies. Furthermore, we enhance FCN by incorporating high-order features through hypergraph-based manifold regularization. Our algorithm involves causal modeling of the dynamic brain system, and the obtained directed FC reveals differences in the flow of information under different patterns. We have validated the significance of integrating temporal information into FCN using four real-world fMRI datasets. On average, our framework achieves 12% higher accuracy than non-temporal hypergraph-based and low-order FCNs, all while maintaining a short processing time. Notably, our framework successfully identifies the most discriminative ROIs, aligning with previous research, thereby facilitating cognitive and behavioral studies.
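
The stated temporal assumption, that the current state is determined by the connectivity and the state at the previous time, can be illustrated with a toy ridge regression mapping each time point onto the next; the fitted coefficient matrix then acts as a directed functional-connectivity estimate. This is our simplification, not the authors' hypergraph-regularized optimization.

import numpy as np

def directed_fc(ts, ridge=1e-2):
    """ts: (T, n_roi) BOLD time series with T time points."""
    X, Y = ts[:-1], ts[1:]                     # states at t and t + 1
    A = X.T @ X + ridge * np.eye(ts.shape[1])
    W = np.linalg.solve(A, X.T @ Y)            # x_{t+1} ~= x_t @ W
    return W                                   # W[i, j]: influence of ROI i on ROI j

# Example with 200 time points and 90 ROIs of synthetic data.
W = directed_fc(np.random.randn(200, 90))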

AU Li, Zihan Zheng, Yuan Shan, Dandan Yang, Shuzhou Li, Qingde Wang, Beizhan Zhang, Yuanting Hong, Qingqi Shen, Dinggang

ScribFormer: Transformer Makes CNN Work Better for Scribble-Based Medical Image Segmentation

Most recent scribble-supervised segmentation methods adopt a CNN framework with an encoder-decoder architecture. Despite its multiple benefits, this framework can generally capture only short-range feature dependencies, because convolutional layers have local receptive fields, which makes it difficult to learn global shape information from the limited supervision provided by scribble annotations. To address this issue, this paper proposes ScribFormer, a new CNN-Transformer hybrid solution for scribble-supervised medical image segmentation. The proposed ScribFormer model has a triple-branch structure, i.e., a hybrid of a CNN branch, a Transformer branch, and an attention-guided class activation map (ACAM) branch. Specifically, the CNN branch collaborates with the Transformer branch to fuse the local features learned by the CNN with the global representations obtained from the Transformer, which effectively overcomes the limitations of existing scribble-supervised segmentation methods. Furthermore, the ACAM branch helps unify the shallow and deep convolutional features to further improve the model's performance. Extensive experiments on two public datasets and one private dataset show that our ScribFormer has superior performance over state-of-the-art scribble-supervised segmentation methods and achieves even better results than fully-supervised segmentation methods. The code is released at https://github.com/HUANGLIZI/ScribFormer.

AU Dong, Xiuyu Yang, Kaifan Liu, Jinyu Tang, Fan Liao, Wenjun Zhang, Yu Liang, Shujun

Cross-Domain Mutual-Assistance Learning Framework for Fully Automated Diagnosis of Primary Tumor in Nasopharyngeal Carcinoma.

Accurate T-staging of nasopharyngeal carcinoma (NPC) is paramount for guiding treatment decisions and prognosticating outcomes for distinct risk groups. Regrettably, deep learning-based techniques for T-staging in NPC remain sparse, and existing methods often exhibit suboptimal performance because they neglect crucial domain-specific knowledge pertinent to primary tumor diagnosis. To address these issues, we propose a new cross-domain mutual-assistance learning framework for fully automated diagnosis of primary tumors using head-and-neck (H&N) MR images. Specifically, we tackle the primary tumor diagnosis task with a convolutional neural network consisting of a 3D cross-domain knowledge perception network (CKP net), which excavates cross-domain-invariant features emphasizing tumor intensity variations and internal tumor heterogeneity, and a multi-domain mutual-information sharing fusion network (M2SF net), comprising a dual-pathway domain-specific representation module and a mutual-information fusion module, which intelligently gauges and amalgamates multi-domain, multi-scale T-stage diagnosis-oriented features. The proposed 3D cross-domain mutual-assistance learning framework not only embraces task-specific multi-domain diagnostic knowledge but also automates the entire process of primary tumor diagnosis. We evaluate our model on an internal and an external MR image dataset in a three-fold cross-validation paradigm. Exhaustive experimental results demonstrate that our method outperforms state-of-the-art algorithms and obtains promising performance for tumor segmentation and T-staging. These findings underscore its potential for clinical application, offering valuable assistance to clinicians in treatment decision-making and prognostication for various risk groups.

EI 1558-254X DA 2024-05-16 UT MEDLINE:38739507 PM 38739507 ER

AU Li, Zimeng Xiao, Sa Wang, Cheng Li, Haidong Zhao, Xiuchao Duan, Caohui Zhou, Qian Rao, Qiuchen Fang, Yuan Xie, Junshuai Shi, Lei Guo, Fumin Ye, Chaohui Zhou, Xin

Encoding Enhanced Complex CNN for Accurate and Highly Accelerated MRI

Magnetic resonance imaging (MRI) using hyperpolarized noble gases provides a way to visualize the structure and function of the human lung, but the long imaging time limits its broad research and clinical applications. Deep learning has demonstrated great potential for accelerating MRI by reconstructing images from undersampled data. However, most existing deep convolutional neural networks (CNNs) directly apply square convolution to k-space data without considering the inherent properties of k-space sampling, limiting k-space learning efficiency and image reconstruction quality. In this work, we propose an encoding enhanced (EN2) complex CNN for highly undersampled pulmonary MRI reconstruction. The EN2 complex CNN employs convolution along either the frequency- or phase-encoding direction, resembling the mechanisms of k-space sampling, to maximize the utilization of the encoding correlation and integrity within a row or column of k-space. We also employ complex convolution to learn rich representations from the complex k-space data. In addition, we develop a feature-strengthened modularized unit to further boost the reconstruction performance. Experiments demonstrate that our approach can accurately reconstruct hyperpolarized Xe-129 and H-1 lung MRI from 6-fold undersampled k-space data and provide lung function measurements with minimal biases compared with fully sampled images. These results demonstrate the effectiveness of the proposed algorithmic components and indicate that the proposed approach could be used for accelerated pulmonary MRI in research and clinical lung disease patient care.
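
A minimal sketch of a complex-valued convolution restricted to a single encoding direction is given below: two real kernels of size 1 x k (or k x 1) are combined by the complex product rule. Module and argument names are hypothetical, and the actual EN2 architecture is more elaborate.

import torch.nn as nn

class ComplexEncodingConv(nn.Module):
    def __init__(self, ch_in, ch_out, k=5, direction="frequency"):
        super().__init__()
        # 1 x k kernel along readout, or k x 1 along phase encoding.
        size = (1, k) if direction == "frequency" else (k, 1)
        pad = (0, k // 2) if direction == "frequency" else (k // 2, 0)
        self.conv_r = nn.Conv2d(ch_in, ch_out, size, padding=pad)
        self.conv_i = nn.Conv2d(ch_in, ch_out, size, padding=pad)

    def forward(self, real, imag):
        # (a + ib)(w_r + i w_i) = (a w_r - b w_i) + i(a w_i + b w_r)
        out_r = self.conv_r(real) - self.conv_i(imag)
        out_i = self.conv_i(real) + self.conv_r(imag)
        return out_r, out_i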

AU Tay, Zhiwei Kim, Han-Joon Ho, John S. Olivo, Malini

A Magnetic Particle Imaging Approach for Minimally Invasive Imaging and Sensing With Implantable Bioelectronic Circuits

Minimally-invasive and biocompatible implantable bioelectronic circuits are used for long-term monitoring of physiological processes in the body. However, there is a lack of methods that can cheaply and conveniently image the device within the body while simultaneously extracting sensor information. Magnetic Particle Imaging (MPI) with zero background signal, high contrast, and high sensitivity with quantitative images is ideal for this challenge because the magnetic signal is not absorbed with increasing tissue depth and incurs no radiation dose. We show how to easily modify common implantable devices to be imaged by MPI by encapsulating and magnetically-coupling magnetic nanoparticles (SPIOs) to the device circuit. These modified implantable devices not only provide spatial information via MPI, but also couple to our handheld MPI reader to transmit sensor information by modulating harmonic signals from magnetic nanoparticles via switching or frequency-shifting with resistive or capacitive sensors. This paper provides proof-of-concept of an optimized MPI imaging technique for implantable devices to extract spatial information as well as other information transmitted by the implanted circuit (such as biosensing) via encoding in the magnetic particle spectrum. The 4D images present 3D position and a changing color tone in response to a variable biometric. Biophysical sensing via bioelectronic circuits that take advantage of the unique imaging properties of MPI may enable a wide range of minimally invasive applications in biomedicine and diagnosis.

AU Fu, Minghan Zhang, Na Huang, Zhenxing Zhou, Chao Zhang, Xu Yuan, Jianmin He, Qiang Yang, Yongfeng Zheng, Hairong Liang, Dong Wu, Fang-Xiang Fan, Wei Hu, Zhanli

OIF-Net: An Optical Flow Registration-Based PET/MR Cross-Modal Interactive Fusion Network for Low-Count Brain PET Image Denoising

The short frames of low-count positron emission tomography (PET) images generally cause high levels of statistical noise. Thus, improving the quality of low-count images by using image postprocessing algorithms to achieve better clinical diagnoses has attracted widespread attention in the medical imaging community. Most existing deep learning-based low-count PET image enhancement methods have achieved satisfying results; however, few of them focus on denoising low-count PET images with the magnetic resonance (MR) image modality as guidance. The prior context features contained in MR images can provide abundant and complementary information for single low-count PET image denoising, especially in ultralow-count (2.5%) cases. To this end, we propose a novel two-stream dual PET/MR cross-modal interactive fusion network with an optical flow pre-alignment module, namely, OIF-Net. Specifically, the learnable optical flow registration module enables the spatial manipulation of MR imaging inputs within the network without any extra training supervision. Registered MR images fundamentally solve the problem of feature misalignment in the multimodal fusion stage, which greatly benefits the subsequent denoising process. In addition, we design a spatial-channel feature enhancement module (SC-FEM) that considers the interactive impacts of multiple modalities and provides additional information flexibility in both the spatial and channel dimensions. Furthermore, instead of simply concatenating the two features extracted from these two modalities as an intermediate fusion method, the proposed cross-modal feature fusion module (CM-FFM) adopts cross-attention at multiple feature levels, which greatly improves the fusion of the two modalities' features. Extensive experimental assessments conducted on real clinical datasets, as well as an independent clinical testing dataset, demonstrate that the proposed OIF-Net outperforms the state-of-the-art methods.
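
The flow-based pre-alignment step can be sketched as a differentiable warp of the MR guidance image by a predicted displacement field, using bilinear grid sampling. This is a generic rendering of such a module under our own assumptions, not the released OIF-Net code.

import torch
import torch.nn.functional as F

def warp_with_flow(mr, flow):
    """mr: (B, C, H, W); flow: (B, 2, H, W) displacements in pixels (x, y)."""
    b, _, h, w = mr.shape
    ys, xs = torch.meshgrid(torch.arange(h, device=mr.device),
                            torch.arange(w, device=mr.device), indexing="ij")
    grid_x = xs.unsqueeze(0) + flow[:, 0]       # displaced x coordinates
    grid_y = ys.unsqueeze(0) + flow[:, 1]       # displaced y coordinates
    # Normalise to [-1, 1] as required by grid_sample.
    grid = torch.stack((2.0 * grid_x / (w - 1) - 1.0,
                        2.0 * grid_y / (h - 1) - 1.0), dim=-1)
    return F.grid_sample(mr, grid, align_corners=True)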

AU Yin, Ziying Li, Guo-Yang Zhang, Zhaoyi Zheng, Yang Cao, Yanping

SWENet: A Physics-Informed Deep Neural Network (PINN) for Shear Wave Elastography

Shear wave elastography (SWE) enables the measurement of elastic properties of soft materials in a non-invasive manner and finds broad applications in various disciplines. The state-of-the-art SWE methods rely on the measurement of local shear wave speeds to infer material parameters and suffer from wave diffraction when applied to soft materials with strong heterogeneity. In the present study, we overcome this challenge by proposing a physics-informed neural network (PINN)-based SWE (SWENet) method. The spatial variation of elastic properties of inhomogeneous materials has been introduced in the governing equations, which are encoded in SWENet as loss functions. Snapshots of wave motions have been used to train neural networks, and during this course, the elastic properties within a region of interest illuminated by shear waves are inferred simultaneously. We performed finite element simulations, tissue-mimicking phantom experiments, and ex vivo experiments to validate the method. Our results show that the shear moduli of soft composites consisting of matrix and inclusions of several millimeters in cross-section dimensions with either regular or irregular geometries can be identified with excellent accuracy. The advantages of the SWENet over conventional SWE methods consist of using more features of the wave motions and enabling seamless integration of multi-source data in the inverse analysis. Given the advantages of SWENet, it may find broad applications where full wave fields get involved to infer heterogeneous mechanical properties, such as identifying small solid tumors with ultrasound SWE, and differentiating gray and white matters of the brain with magnetic resonance elastography.
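
A minimal physics-informed residual for a heterogeneous shear-wave equation, rho * u_tt = d/dx(mu * u_x) + d/dy(mu * u_y), can be written with automatic differentiation as below. The displacement network u_net and modulus network mu_net are hypothetical stand-ins, the collocation tensors are assumed 1-D, and SWENet's full loss also includes data-fitting terms.

import torch

def grad(out, inp):
    return torch.autograd.grad(out, inp, torch.ones_like(out),
                               create_graph=True)[0]

def shear_wave_residual(u_net, mu_net, x, y, t, rho=1000.0):
    """x, y, t: 1-D collocation tensors; requires_grad is enabled here."""
    x = x.requires_grad_(True)
    y = y.requires_grad_(True)
    t = t.requires_grad_(True)
    u = u_net(torch.stack((x, y, t), dim=-1)).squeeze(-1)   # displacement
    mu = mu_net(torch.stack((x, y), dim=-1)).squeeze(-1)    # shear modulus
    u_tt = grad(grad(u, t), t)                              # inertial term
    flux_x = grad(mu * grad(u, x), x)                       # d/dx(mu * u_x)
    flux_y = grad(mu * grad(u, y), y)                       # d/dy(mu * u_y)
    return rho * u_tt - flux_x - flux_y     # -> 0 where the PDE is satisfied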

AU Ji, Wen Chung, Albert C. S.

Unsupervised Domain Adaptation for Medical Image Segmentation Using Transformer With Meta Attention

Image segmentation is essential to medical image analysis, as it provides the labeled regions of interest for subsequent diagnosis and treatment. However, fully-supervised segmentation methods require high-quality annotations produced by experts, which is laborious and expensive. In addition, when segmentation is performed on another unlabeled image modality, performance is adversely affected by the domain shift. Unsupervised domain adaptation (UDA) is an effective way to tackle these problems, but the performance of existing methods still leaves room for improvement. Also, despite the effectiveness of recent Transformer-based methods in medical image segmentation, the adaptability of Transformers is rarely investigated. In this paper, we present a novel UDA framework that uses a Transformer to build a cross-modality segmentation method, with the advantages of learning long-range dependencies and transferring attentive information. To fully utilize the attention learned by the Transformer in UDA, we propose Meta Attention (MA) and use it to perform a fully attention-based alignment scheme, which can learn the hierarchical consistencies of attention and transfer more discriminative information between the two modalities. We have conducted extensive experiments on cross-modality segmentation using three datasets, including a whole heart segmentation dataset (MMWHS), an abdominal organ segmentation dataset, and a brain tumor segmentation dataset. The promising results show that our method can significantly improve performance compared with state-of-the-art UDA methods.

AU Pak, Daniel H. Liu, Minliang Kim, Theodore Liang, Liang Caballero, Andres Onofrey, John Ahn, Shawn S. Xu, Yilin McKay, Raymond Sun, Wei Gleason, Rudolph Duncan, James S.

Patient-Specific Heart Geometry Modeling for Solid Biomechanics Using Deep Learning

Automated volumetric meshing of patient-specific heart geometry can help expedite various biomechanics studies, such as post-intervention stress estimation. Prior meshing techniques often neglect important modeling characteristics for successful downstream analyses, especially for thin structures like the valve leaflets. In this work, we present DeepCarve (Deep Cardiac Volumetric Mesh): a novel deformation-based deep learning method that automatically generates patient-specific volumetric meshes with high spatial accuracy and element quality. The main novelty in our method is the use of minimally sufficient surface mesh labels for precise spatial accuracy and the simultaneous optimization of isotropic and anisotropic deformation energies for volumetric mesh quality. Mesh generation takes only 0.13 seconds/scan during inference, and each mesh can be directly used for finite element analyses without any manual post-processing. Calcification meshes can also be subsequently incorporated for increased simulation accuracy. Numerous stent deployment simulations validate the viability of our approach for large-batch analyses.

AU Xie, Qingsong Li, Yuexiang He, Nanjun Ning, Munan Ma, Kai Wang, Guoxing Lian, Yong Zheng, Yefeng

Unsupervised Domain Adaptation for Medical Image Segmentation by Disentanglement Learning and Self-Training

Unsupervised domain adaptation (UDA), which aims to enhance the segmentation performance of deep models on unlabeled data, has recently drawn much attention. In this paper, we propose a novel UDA method (namely DLaST) for medical image segmentation via disentanglement learning and self-training. Disentanglement learning factorizes an image into domain-invariant anatomy and domain-specific modality components. To make the best of disentanglement learning, we propose a novel shape constraint to boost the adaptation performance. The self-training strategy further adaptively improves the segmentation performance of the model for the target domain through adversarial learning and pseudo labels, which implicitly facilitates feature alignment in the anatomy space. Experimental results demonstrate that the proposed method outperforms the state-of-the-art UDA methods for medical image segmentation on three public datasets, i.e., a cardiac dataset, an abdominal dataset and a brain dataset. The code will be released soon.

AU Zhang, Jiaojiao Zhang, Shuo Shen, Xiaoqian Lukasiewicz, Thomas Xu, Zhenghua

Multi-ConDoS: Multimodal Contrastive Domain Sharing Generative Adversarial Networks for Self-Supervised Medical Image Segmentation

Existing self-supervised medical image segmentation usually encounters the domain shift problem (i.e., the input distribution of pre-training differs from that of fine-tuning) and/or the multimodality problem (i.e., it is based on single-modal data only and cannot utilize the rich multimodal information of medical images). To solve these problems, in this work, we propose multimodal contrastive domain sharing (Multi-ConDoS) generative adversarial networks to achieve effective multimodal contrastive self-supervised medical image segmentation. Compared to existing self-supervised approaches, Multi-ConDoS has the following three advantages: (i) it utilizes multimodal medical images to learn more comprehensive object features via multimodal contrastive learning; (ii) domain translation is achieved by integrating the cyclic learning strategy of CycleGAN and the cross-domain translation loss of Pix2Pix; (iii) novel domain-sharing layers are introduced to learn not only domain-specific but also domain-sharing information from the multimodal medical images. Extensive experiments on two publicly available multimodal medical image segmentation datasets show that, with only 5% (resp., 10%) of labeled data, Multi-ConDoS not only greatly outperforms state-of-the-art self-supervised and semi-supervised medical image segmentation baselines at the same ratio of labeled data, but also achieves similar (sometimes even better) performance to fully supervised segmentation methods with 50% (resp., 100%) of labeled data, which proves that our work can achieve superior segmentation performance with a very low labeling workload. Furthermore, ablation studies prove that the above three improvements are all effective and essential for Multi-ConDoS to achieve this superior performance.
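
A generic multimodal contrastive term of the kind referred to here is the symmetric InfoNCE loss, sketched below under our own naming and temperature assumptions: row-aligned embeddings of the two modalities of the same subject are pulled together, while all other pairs are pushed apart.

import torch
import torch.nn.functional as F

def multimodal_nce(z1, z2, tau=0.1):
    """z1, z2: (B, D) embeddings of modality 1 / modality 2, row-aligned."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                  # (B, B) cosine similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    # Symmetric cross-entropy: each sample's positive is its own pair.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))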

AU Zhang, Qiyang Hu, Yingying Zhao, Yumo Cheng, Jing Fan, Wei Hu, Debin Shi, Fuxiao Cao, Shuangliang Zhou, Yun Yang, Yongfeng Liu, Xin Zheng, Hairong Liang, Dong Hu, Zhanli

Deep Generalized Learning Model for PET Image Reconstruction

Low-count positron emission tomography (PET) imaging is challenging because of the ill-posedness of this inverse problem. Previous studies have demonstrated that deep learning (DL) holds promise for achieving improved low-count PET image quality. However, almost all data-driven DL methods suffer from fine structure degradation and blurring effects after denoising. Incorporating DL into the traditional iterative optimization model can effectively improve its image quality and recover fine structures, but little research has considered the full relaxation of the model, resulting in the performance of this hybrid model not being sufficiently exploited. In this paper, we propose a learning framework that deeply integrates DL and an alternating direction of multipliers method (ADMM)-based iterative optimization model. The innovative feature of this method is that we break the inherent forms of the fidelity operators and use neural networks to process them. The regularization term is deeply generalized. The proposed method is evaluated on simulated data and real data. Both the qualitative and quantitative results show that our proposed neural network method can outperform partial operator expansion-based neural network methods, neural network denoising methods and traditional methods.
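
For orientation, a plug-and-play flavored ADMM loop is sketched below, in which the regularization proximal step is replaced by a learned denoiser. This is a generic template under our own assumptions (callable forward/adjoint projectors, a fixed gradient step size) and does not reproduce the paper's relaxation of the fidelity operators.

import numpy as np

def admm_recon(y, A, At, denoiser, rho=1.0, step=0.1, n_iter=20):
    """y: measured data; A/At: forward and adjoint projectors (callables);
    denoiser: learned prior acting as the z-update proximal operator."""
    x = At(y)
    z = x.copy()
    u = np.zeros_like(x)
    for _ in range(n_iter):
        # x-update: gradient step on the data-fidelity augmented Lagrangian.
        x = x - step * (At(A(x) - y) + rho * (x - z + u))
        z = denoiser(x + u)                    # learned proximal step
        u = u + x - z                          # dual ascent
    return x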

AU Zhang, Shengjie Shen, Xin Chen, Xiang Yu, Ziqi Ren, Bohan Yang, Haibo Zhang, Xiao-Yong Zhou, Yuan

CQformer: Learning Dynamics Across Slices in Medical Image Segmentation.

Prevalent studies on deep learning-based 3D medical image segmentation capture the continuous variation across 2D slices mainly via convolution, Transformer, inter-slice interaction, and time series models. In this work, via modeling this variation by an ordinary differential equation (ODE), we propose a cross instance query-guided Transformer architecture (CQformer) that leverages features from preceding slices to improve the segmentation performance of subsequent slices. Its key components include a cross-attention mechanism in an ODE formulation, which bridges the features of contiguous 2D slices of the 3D volumetric data. In addition, a regression head is employed to shorten the gap between the bottleneck and the prediction layer. Extensive experiments on 7 datasets with various modalities (CT, MRI) and tasks (organ, tissue, and lesion) demonstrate that CQformer outperforms previous state-of-the-art segmentation algorithms on 6 datasets by 0.44%-2.45%, and achieves the second highest performance of 88.30% on the BTCV dataset. The code will be publicly available after acceptance.

EI 1558-254X DA 2024-10-12 UT MEDLINE:39388328 PM 39388328 ER

AU Chen, Zhi Liu, Yongguo Zhang, Yun Zhu, Jiajing Li, Qiaoqin Wu, Xindong

Enhanced Multimodal Low-rank Embedding based Feature Selection Model for Multimodal Alzheimer's Disease Diagnosis.

Identification of Alzheimer's disease (AD) with multimodal neuroimaging data has been receiving increasing attention. However, the presence of numerous redundant features and corrupted neuroimages within multimodal datasets poses significant challenges for existing methods. In this paper, we propose a feature selection method named Enhanced Multimodal Low-rank Embedding (EMLE) for multimodal AD diagnosis. Unlike previous methods utilizing convex relaxations of the ℓ2,0-norm, EMLE exploits an ℓ2,γ-norm regularized projection matrix to obtain an embedding representation and select informative features jointly for each modality. The ℓ2,γ-norm, employing an upper-bounded nonconvex Minimax Concave Penalty (MCP) function to characterize sparsity, offers a superior approximation of the ℓ2,0-norm compared to other convex relaxations. Next, a similarity graph is learned based on the self-expressiveness property to increase robustness to corrupted data. As the approximation coefficient vectors of samples from the same class should be highly correlated, a norm derived from the MCP function, i.e., the matrix γ-norm, is applied to constrain the rank of the graph. Furthermore, recognizing that diverse modalities should share an underlying structure related to AD, we establish a consensus graph across all modalities to unveil the intrinsic structures shared by the multiple modalities. Finally, we fuse the embedding representations of all modalities into the label space to incorporate supervisory information. The results of extensive experiments on the Alzheimer's Disease Neuroimaging Initiative datasets verify the discriminability of the features selected by EMLE.
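
For reference, the MCP-based row-sparsity penalty can be sketched as follows: the scalar MCP function is applied to the ℓ2 norm of each row of the projection matrix W, which tightens the usual convex ℓ2,1 relaxation of the ℓ2,0-norm while remaining bounded. Parameter values are illustrative only.

import numpy as np

def mcp(t, lam=1.0, gamma=3.0):
    """Scalar MCP penalty evaluated elementwise on t >= 0."""
    return np.where(t <= gamma * lam,
                    lam * t - t ** 2 / (2.0 * gamma),
                    0.5 * gamma * lam ** 2)    # constant once saturated

def l2_gamma_penalty(W, lam=1.0, gamma=3.0):
    row_norms = np.linalg.norm(W, axis=1)      # l2 norm of each feature row
    return mcp(row_norms, lam, gamma).sum()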

AU Li, Wen Cao, Fuzhi An, Nan Wang, Wenli Wang, Chunhui Xu, Weinan Gao, Yang Ning, Xiaolin

Source Extent Estimation in OPM-MEG: A Two-Stage Champagne Approach.

The accurate estimation of source extent using magnetoencephalography (MEG) is important for the study of preoperative functional localization in epilepsy. Conventional source imaging techniques tend to produce diffuse or focused source estimates that fail to capture the source extent accurately. To address this issue, we propose a novel method called the two-stage Champagne approach (TS-Champagne). TS-Champagne divides source extent estimation into two stages. In the first stage, the Champagne algorithm with noise learning (Champagne-NL) is employed to obtain an initial source estimate. In the second stage, spatial basis functions are constructed from the initial source estimate. These spatial basis functions consist of potential activation source centers and their neighbors, and serve as spatial priors, which are incorporated into Champagne-NL to obtain a final source estimate. We evaluated the performance of TS-Champagne through numerical simulations. TS-Champagne achieved more robust performance under various conditions (i.e., varying source extent, number of sources, signal-to-noise level, and correlation coefficients between sources) than Champagne-NL and several benchmark methods. Furthermore, auditory and median nerve stimulation experiments were conducted using a 31-channel optically pumped magnetometer (OPM)-MEG system. The validation results indicated that the reconstructed source activity was spatially and temporally consistent with the neurophysiological results of previous OPM-MEG studies, further demonstrating the feasibility of TS-Champagne for practical applications.

AU Zhou, Huajun Zhou, Fengtao Chen, Hao

Cohort-Individual Cooperative Learning for Multimodal Cancer Survival Analysis.

Recently, we have witnessed impressive achievements in cancer survival analysis by integrating multimodal data, e.g., pathology images and genomic profiles. However, the heterogeneity and high dimensionality of these modalities pose significant challenges for extracting discriminative representations while maintaining good generalization. In this paper, we propose a Cohort-individual Cooperative Learning (CCL) framework to advance cancer survival analysis by combining knowledge decomposition and cohort guidance. Specifically, first, we propose a Multimodal Knowledge Decomposition (MKD) module to explicitly decompose multimodal knowledge into four distinct components: the redundancy, the synergy, and the uniqueness of each of the two modalities. Such a comprehensive decomposition can prompt the model to perceive easily overlooked yet important information, facilitating effective multimodal fusion. Second, we propose Cohort Guidance Modeling (CGM) to mitigate the risk of overfitting task-irrelevant information. It promotes a more comprehensive and robust understanding of the underlying multimodal data, while avoiding the pitfalls of overfitting and enhancing the generalization ability of the model. By coupling the knowledge decomposition and cohort guidance methods, we develop a robust multimodal survival analysis model with enhanced discrimination and generalization abilities. Extensive experimental results on five cancer datasets demonstrate the effectiveness of our model in integrating multimodal data for survival analysis. The code will be publicly available soon.

AU Lerendegui, Marcelo Riemer, Kai Papageorgiou, Georgios Wang, Bingxue Arthur, Lachlan Chavignon, Arthur Zhang, Tao Couture, Olivier Huang, Pingtong Ashikuzzaman, Md Dencks, Stefanie Dunsby, Chris Helfield, Brandon Jensen, Jorgen Arendt Lisson, Thomas Lowerison, Matthew R. Rivaz, Hassan Samir, Anthony E. Schmitz, Georg Schoen, Scott van Sloun, Ruud Song, Pengfei Stevens, Tristan Yan, Jipeng Sboros, Vassilis Tang, Meng-Xing

ULTRA-SR Challenge: Assessment of Ultrasound Localization and TRacking Algorithms for Super-Resolution Imaging

With the widespread interest and uptake of super-resolution ultrasound (SRUS) through localization and tracking of microbubbles, also known as ultrasound localization microscopy (ULM), many localization and tracking algorithms have been developed. ULM can image many centimeters into tissue in-vivo and track microvascular flow non-invasively with sub-diffraction resolution. In a significant community effort, we organized a challenge, Ultrasound Localization and TRacking Algorithms for Super-Resolution (ULTRA-SR). The aims of this paper are threefold: to describe the challenge organization, data generation, and winning algorithms; to present the metrics and methods for evaluating challenge entrants; and to report results and findings of the evaluation. Realistic ultrasound datasets containing microvascular flow for different clinical ultrasound frequencies were simulated, using vascular flow physics, acoustic field simulation and nonlinear bubble dynamics simulation. Based on these datasets, 38 submissions from 24 research groups were evaluated against ground truth using an evaluation framework with six metrics, three for localization and three for tracking. In-vivo mouse brain and human lymph node data were also provided, and performance assessed by an expert panel. Winning algorithms are described and discussed. The publicly available data with ground truth and the defined metrics for both localization and tracking present a valuable resource for researchers to benchmark algorithms and software, identify optimized methods/software for their data, and provide insight into the current limits of the field. In conclusion, Ultra-SR challenge has provided benchmarking data and tools as well as direct comparison and insights for a number of the state-of-the art localization and tracking algorithms.
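
A localization metric of the kind used in such evaluations can be sketched as a greedy one-to-one pairing of detected and ground-truth positions within a tolerance radius, from which precision and recall follow. The official ULTRA-SR evaluation framework may differ in its details.

import numpy as np
from scipy.spatial.distance import cdist

def pair_localisations(det, gt, tol=0.5):
    """det, gt: (N, 2) positions; tol: match radius in the same units."""
    d = cdist(det, gt)
    tp = 0
    while d.size and d.min() <= tol:
        i, j = np.unravel_index(np.argmin(d), d.shape)
        tp += 1                                # pair up the closest match
        d = np.delete(np.delete(d, i, axis=0), j, axis=1)
    return tp / max(len(det), 1), tp / max(len(gt), 1)   # precision, recall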

AU Ye, Yiwen Zhang, Jianpeng Chen, Ziyang Xia, Yong

CADS: A Self-supervised Learner via Cross-modal Alignment and Deep Self-distillation for CT Volume Segmentation.

Self-supervised learning (SSL) has long had great success in advancing the field of annotation-efficient learning. However, when applied to CT volume segmentation, most SSL methods suffer from two limitations: they rarely use the information acquired by different imaging modalities, and they provide supervision only to the bottleneck encoder layer. To address both limitations, we design a pretext task to align the information in each 3D CT volume and the corresponding 2D generated X-ray image, and extend self-distillation to deep self-distillation. Thus, we propose a self-supervised learner based on Cross-modal Alignment and Deep Self-distillation (CADS) to improve the encoder's ability to characterize CT volumes. The cross-modal alignment is a more challenging pretext task that forces the encoder to learn better image representation ability. Deep self-distillation provides supervision not only to the bottleneck layer but also to shallow layers, thus boosting the abilities of both. Comparative experiments show that, during pre-training, our CADS has lower computational complexity and GPU memory cost than competing SSL methods. Based on the pre-trained encoder, we construct PVT-UNet for 3D CT volume segmentation. Our results on seven downstream tasks indicate that PVT-UNet outperforms state-of-the-art SSL methods like MOCOv3 and DiRA, as well as prevalent medical image segmentation methods like nnUNet and CoTr. Code and pre-trained weights will be available at https://github.com/yeerwen/CADS.

AU Smith, Nathaniel J. Newton, David T. Gunderman, David Hutchins, Gary D.

A Comparison of Arterial Input Function Interpolation Methods for Patlak Plot Analysis of 68Ga-PSMA-11 PET Prostate Cancer Studies

Positron emission tomography (PET) imaging enables quantitative assessment of tissue physiology. Dynamic pharmacokinetic analysis of PET images requires accurate estimation of the radiotracer plasma input function to derive meaningful parameter estimates, and small discrepancies in parameter estimation can mimic subtle physiologic tissue variation. This study evaluates the impact of input function interpolation method on the accuracy of Patlak kinetic parameter estimation through simulations modeling the pharmacokinetic properties of [Ga-68]-PSMA-11. This study evaluated both trained and untrained methods. Although the mean kinetic parameter accuracy was similar across all interpolation models, the trained node weighting interpolation model estimated accurate kinetic parameters with reduced overall variability relative to standard linear interpolation. Trained node weighting interpolation reduced kinetic parameter estimation variance by a magnitude approximating the underlying physiologic differences between normal and diseased prostatic tissue. Overall, this analysis suggests that trained node weighting improves the reliability of Patlak kinetic parameter estimation for [Ga-68]-PSMA-11 PET.
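
For context, the sketch below performs Patlak graphical analysis with an interpolated arterial input function: the tissue curve normalized by plasma activity is regressed against the normalized integral of the input function over its linear phase, and the slope estimates the influx constant Ki. Cubic-spline interpolation stands in for the trained node-weighting scheme, which is not public; t_star and the grid density are illustrative, and the AIF samples are assumed to cover the PET frame times.

import numpy as np
from scipy.interpolate import CubicSpline

def patlak_ki(t_aif, c_aif, t_pet, c_tissue, t_star=10.0):
    """Returns (Ki, intercept) from the linear portion of the Patlak plot."""
    cs = CubicSpline(t_aif, c_aif)              # interpolated plasma input
    grid = np.linspace(t_aif[0], t_pet[-1], 2000)
    cp_grid = cs(grid)
    cum = np.concatenate(([0.0], np.cumsum(     # running trapezoid integral
        0.5 * (cp_grid[1:] + cp_grid[:-1]) * np.diff(grid))))
    cp = cs(t_pet)                              # plasma activity at frames
    x = np.interp(t_pet, grid, cum) / cp        # normalised integral
    y = c_tissue / cp                           # normalised tissue activity
    keep = t_pet >= t_star                      # linear (equilibrium) phase
    ki, intercept = np.polyfit(x[keep], y[keep], 1)
    return ki, intercept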

AU Ni, Guangming Wu, Renxiong Zheng, Fei Li, Meixuan Huang, Shaoyan Ge, Xin Liu, Linbo Liu, Yong

Toward Ground-Truth Optical Coherence Tomography via Three-Dimensional Unsupervised Deep Learning Processing and Data

Optical coherence tomography (OCT) can perform non-invasive high-resolution three-dimensional (3D) imaging and has been widely used in biomedical fields, but it is inevitably affected by coherent speckle noise, which degrades OCT imaging performance and restricts its applications. Here we present a novel speckle-free OCT imaging strategy, named toward-ground-truth OCT (tGT-OCT), that utilizes unsupervised 3D deep-learning processing and leverages OCT's 3D imaging features to achieve speckle-free OCT imaging. Specifically, our proposed tGT-OCT utilizes an unsupervised 3D-convolution deep-learning network trained on random 3D volumetric data to distinguish and separate speckle from real structures in 3D imaging volumetric space; moreover, tGT-OCT effectively further reduces speckle noise and reveals structures that would otherwise be obscured by speckle noise while preserving spatial resolution. Results derived from different samples demonstrated the high-quality speckle-free 3D imaging performance of tGT-OCT and its advancement beyond the previous state of the art. The code is available online: https://github.com/Voluntino/tGT-OCT.

AU Haeusele, Jakob Schmid, Clemens Viermetz, Manuel Gustschin, Nikolai Lasser, Tobias Koehler, Thomas Pfeiffer, Franz

Robust Sample Information Retrieval in Dark-Field Computed Tomography with a Vibrating Talbot-Lau Interferometer.

X-ray computed tomography (CT) is a crucial tool for non-invasive medical diagnosis that uses differences in materials' attenuation coefficients to generate contrast and provide 3D information. Grating-based dark-field-contrast X-ray imaging is an innovative technique that utilizes small-angle scattering to generate additional co-registered images with additional microstructural information. While it is already possible to perform human chest dark-field radiography, it is assumed that its diagnostic value increases when performed in a tomographic setup. However, the susceptibility of Talbot-Lau interferometers to mechanical vibrations coupled with a need to minimize data acquisition times has hindered its application in clinical routines and the combination of X-ray dark-field imaging and large field-of-view (FOV) tomography in the past. In this work, we propose a processing pipeline to address this issue in a human-sized clinical dark-field CT prototype. We present the corrective measures that are applied in the employed processing and reconstruction algorithms to mitigate the effects of vibrations and deformations of the interferometer gratings. This is achieved by identifying spatially and temporally variable vibrations in air reference scans. By translating the found correlations to the sample scan, we can identify and mitigate relevant fluctuation modes for scans with arbitrary sample sizes. This approach effectively eliminates the requirement for sample-free detector area, while still distinctly separating fluctuation and sample information. As a result, samples of arbitrary dimensions can be reconstructed without being affected by vibration artifacts. To demonstrate the viability of the technique for human-scale objects, we present reconstructions of an anthropomorphic thorax phantom.
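
One plausible reading of the mode-identification step, sketched here purely as an assumption, is a principal component analysis of the air reference phase maps: the dominant fluctuation modes are learned from the reference frames, and their fitted contribution is projected out of each frame's phase.

import numpy as np

def fluctuation_modes(air_phases, n_modes=5):
    """air_phases: (n_frames, n_pixels) reference phase maps."""
    mean = air_phases.mean(axis=0)
    _, _, vt = np.linalg.svd(air_phases - mean, full_matrices=False)
    return mean, vt[:n_modes]                  # mean map and top modes

def remove_fluctuations(phase, mean, modes):
    coeff = modes @ (phase - mean)             # fit the mode amplitudes
    return phase - mean - modes.T @ coeff      # residual sample-induced phase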

AU Wang, Yan Zhen, Liangli Tan, Tien-En Fu, Huazhu Feng, Yangqin Wang, Zizhou Xu, Xinxing Goh, Rick Siow Mong Ng, Yipin Calhoun, Claire Tan, Gavin Siew Wei Sun, Jennifer K. Liu, Yong Ting, Daniel Shu Wei

Geometric Correspondence-Based Multimodal Learning for Ophthalmic Image Analysis

Color fundus photography (CFP) and optical coherence tomography (OCT) images are two of the most widely used modalities in the clinical diagnosis and management of retinal diseases. Despite the widespread use of multimodal imaging in clinical practice, few methods for automated diagnosis of eye diseases utilize correlated and complementary information from multiple modalities effectively. This paper explores how to leverage the information from CFP and OCT images to improve the automated diagnosis of retinal diseases. We propose a novel multimodal learning method, named geometric correspondence-based multimodal learning network (GeCoM-Net), to achieve the fusion of CFP and OCT images. Specifically, inspired by clinical observations, we consider the geometric correspondence between the OCT slice and the CFP region to learn the correlated features of the two modalities for robust fusion. Furthermore, we design a new feature selection strategy to extract discriminative OCT representations by automatically selecting the important feature maps from OCT slices. Unlike the existing multimodal learning methods, GeCoM-Net is the first method that formulates the geometric relationships between the OCT slice and the corresponding region of the CFP image explicitly for CFP and OCT fusion. Experiments have been conducted on a large-scale private dataset and a publicly available dataset to evaluate the effectiveness of GeCoM-Net for diagnosing diabetic macular edema (DME), impaired visual acuity (VA) and glaucoma. The empirical results show that our method outperforms the current state-of-the-art multimodal learning methods by improving the AUROC score by 0.4%, 1.9% and 2.9% for DME, VA and glaucoma detection, respectively.

AU Huang, Kun Ma, Xiao Zhang, Zetian Zhang, Yuhan Yuan, Songtao Fu, Huazhu Chen, Qiang

Diverse Data Generation for Retinal Layer Segmentation with Potential Structure Modelling.

Accurate retinal layer segmentation on optical coherence tomography (OCT) images is hampered by the challenges of collecting OCT images with diverse pathological characterization and balanced distribution. Current generative models can produce high-realistic images and corresponding labels without quantitative limitations by fitting distributions of real collected data. Nevertheless, the diversity of their generated data is still limited due to the inherent imbalance of training data. To address these issues, we propose an image-label pair generation framework that generates diverse and balanced potential data from imbalanced real samples. Specifically, the framework first generates diverse layer masks, and then generates plausible OCT images corresponding to these layer masks using two customized diffusion probabilistic models respectively. To learn from imbalanced data and facilitate balanced generation, we introduce pathological-related conditions to guide the generation processes. To enhance the diversity of the generated image-label pairs, we propose a potential structure modeling technique that transfers the knowledge of diverse sub-structures from lowly- or non-pathological samples to highly pathological samples. We conducted extensive experiments on two public datasets for retinal layer segmentation. Firstly, our method generates OCT images with higher image quality and diversity compared to other generative methods. Furthermore, based on the extensive training with the generated OCT images, downstream retinal layer segmentation tasks demonstrate improved results. The code is publicly available at: https://github.com/nicetomeetu21/GenPSM.

AU Chen, Jiachen Li, Mengyang Han, Hu Zhao, Zhiming Chen, Xilin

SurgNet: Self-Supervised Pretraining With Semantic Consistency for Vessel and Instrument Segmentation in Surgical Images

Blood vessel and surgical instrument segmentation is a fundamental technique for robot-assisted surgical navigation. Despite the significant progress in natural image segmentation, surgical image-based vessel and instrument segmentation are rarely studied. In this work, we propose a novel self-supervised pretraining method (SurgNet) that can effectively learn representative vessel and instrument features from unlabeled surgical images. As a result, it allows for precise and efficient segmentation of vessels and instruments with only a small amount of labeled data. Specifically, we first construct a region adjacency graph (RAG) based on local semantic consistency in unlabeled surgical images and use it as a self-supervision signal for pseudo-mask segmentation. We then use the pseudo-mask to perform guided masked image modeling (GMIM) to learn representations that integrate structural information of intraoperative objectives more effectively. Our pretrained model, paired with various segmentation methods, can be applied to perform vessel and instrument segmentation accurately using limited labeled data for fine-tuning. We build an Intraoperative Vessel and Instrument Segmentation (IVIS) dataset, comprising approximately 3 million unlabeled images and over 4,000 labeled images with manual vessel and instrument annotations, to evaluate the effectiveness of our self-supervised pretraining method. We also evaluated the generalizability of our method to similar tasks using two public datasets. The results demonstrate that our approach outperforms the current state-of-the-art (SOTA) self-supervised representation learning methods in various surgical image segmentation tasks.
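Region adjacency graphs of the kind used here as a self-supervision signal can be prototyped with off-the-shelf tools. The SLIC-superpixel-plus-RAG combination below is an illustrative assumption, not SurgNet's exact construction, and it requires a recent scikit-image (older releases expose these helpers under skimage.future.graph).

    import numpy as np
    from skimage import data, segmentation, graph

    img = data.astronaut()                   # stand-in for an unlabeled surgical frame
    labels = segmentation.slic(img, n_segments=200, compactness=10, start_label=1)
    rag = graph.rag_mean_color(img, labels)  # nodes: regions, edges: color affinity

    # Merge visually similar neighbours -- a crude local-semantic-consistency cue.
    merged = graph.cut_threshold(labels, rag, thresh=29)
    print(len(np.unique(labels)), "->", len(np.unique(merged)), "regions")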

AU Lin, Chen Zhu, Zhenfeng Zhao, Yawei Zhang, Ying He, Kunlun Zhao, Yao

SGT++: Improved Scene Graph-Guided Transformer for Surgical Report Generation

Automatically recording surgical procedures and generating surgical reports are crucial for alleviating surgeons' workload and enabling them to concentrate more on the operations. Despite some achievements, several issues remain in previous works: 1) failure to model the interactive relationship between surgical instruments and tissue; and 2) neglect of fine-grained differences within different surgical images in the same surgery. To address these two issues, we propose an improved scene graph-guided Transformer, also named SGT++, to generate more accurate surgical reports, in which the complex interactions between surgical instruments and tissue are learnt from both explicit and implicit perspectives. Specifically, to facilitate the understanding of the surgical scene graph under a graph learning framework, a simple yet effective approach is proposed for homogenizing the input heterogeneous scene graph. For the homogeneous scene graph that contains explicit structured and fine-grained semantic relationships, we design an attention-induced graph transformer for node aggregation via an explicit relation-aware encoder. In addition, to characterize the implicit relationships about the instrument, tissue, and the interaction between them, implicit relational attention is proposed to take full advantage of the prior knowledge from the interactional prototype memory. The learnt explicit and implicit relation-aware representations are then coalesced to obtain fused relation-aware representations that contribute to report generation. Comprehensive experiments on two surgical datasets show that the proposed SGT++ model achieves state-of-the-art results.

AU Grohl, Janek Else, Thomas R. Hacker, Lina Bunce, Ellie V. Sweeney, Paul W. Bohndiek, Sarah E.

Moving Beyond Simulation: Data-Driven Quantitative Photoacoustic Imaging Using Tissue-Mimicking Phantoms

Accurate measurement of optical absorption coefficients from photoacoustic imaging (PAI) data would enable direct mapping of molecular concentrations, providing vital clinical insight. The ill-posed nature of the problem of absorption coefficient recovery has prohibited PAI from achieving this goal in living systems due to the domain gap between simulation and experiment. To bridge this gap, we introduce a collection of experimentally well-characterised imaging phantoms and their digital twins. This first-of-a-kind phantom data set enables supervised training of a U-Net on experimental data for pixel-wise estimation of absorption coefficients. We show that training on simulated data results in artefacts and biases in the estimates, reinforcing the existence of a domain gap between simulation and experiment. Training on experimentally acquired data, however, yielded more accurate and robust estimates of optical absorption coefficients. We compare the results to fluence correction with a Monte Carlo model from reference optical properties of the materials, which yields a quantification error of approximately 20%. Application of the trained U-Nets to a blood flow phantom demonstrated spectral biases when training on simulated data, while application to a mouse model highlighted the ability of both learning-based approaches to recover the depth-dependent loss of signal intensity. We demonstrate that training on experimental phantoms can restore the correlation of signal amplitudes measured in depth. While the absolute quantification error remains high and further improvements are needed, our results highlight the promise of deep learning to advance quantitative PAI.

AU Sewani, Alykhan Roa, Carlos-Felipe Zhou, James J. Alawneh, Yara Quadri, Amaar Gilliland-Rocque, Rene Cherin, Emmanuel Dueck, Andrew Demore, Christine Wright, Graham Tavallaei, M. Ali

The CathEye: A Forward-Looking Ultrasound Catheter for Image-Guided Cardiovascular Procedures

Catheter-based procedures are typically guided by X-ray, which suffers from low soft tissue contrast and only provides 2D projection images of a 3D volume. Intravascular ultrasound (IVUS) can serve as a complementary imaging technique. Forward-viewing catheters are useful for visualizing obstructions along the path of the catheter. The CathEye system mechanically steers a single-element transducer to generate a forward-looking surface reconstruction from an irregularly spaced 2-D scan pattern. The steerable catheter leverages an expandable frame with cables to manipulate the distal end independently of vessel tortuosity. The tip position is estimated by measuring the cable displacements and used to create surface reconstructions of the imaging workspace with the single-element transducer. CathEye's imaging capabilities were tested with an agar phantom and an ex vivo chronic total occlusion (CTO) sample while the catheter was confined to various tortuous paths. The CathEye maintained similar scan patterns regardless of path tortuosity and was able to recreate major features of the imaging targets, such as holes and extrusions. The feasibility of forward-looking IVUS with the CathEye is demonstrated in this study. The CathEye mechanism can be applied to other imaging modalities with field-of-view (FOV) limitations and represents the basis for an interventional device fully integrated with image guidance.
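Reconstructing a surface from an irregularly spaced 2-D scan pattern is, at its core, scattered-data interpolation. The generic SciPy sketch below illustrates that step with invented tip positions and ranges; it is not the CathEye processing chain.

    import numpy as np
    from scipy.interpolate import griddata

    rng = np.random.default_rng(1)
    pts = rng.uniform(-1, 1, (500, 2))       # irregular scan positions (toy)
    depth = 5 + 0.5 * np.sin(3 * pts[:, 0]) * np.cos(3 * pts[:, 1])  # toy ranges

    gx, gy = np.mgrid[-1:1:64j, -1:1:64j]    # regular grid for the surface map
    surface = griddata(pts, depth, (gx, gy), method='cubic')
    print(surface.shape, np.nanmin(surface), np.nanmax(surface))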

AU Fu, Junhu Chen, Ke Dou, Qi Gao, Yun He, Yiping Zhou, Pinghong Lin, Shengli Wang, Yuanyuan Guo, Yi

IPNet: An Interpretable Network with Progressive Loss for Whole-stage Colorectal Disease Diagnosis.

Colorectal cancer plays a dominant role in cancer-related deaths, primarily due to the absence of obvious early-stage symptoms. Whole-stage colorectal disease diagnosis is crucial for assessing lesion evolution and determining treatment plans. However, locality differences and disease progression lead to intra-class disparities and inter-class similarities in colorectal lesion representation. In addition, interpretable algorithms explaining lesion progression are still lacking, making the prediction process a "black box". In this paper, we propose IPNet, a dual-branch interpretable network with progressive loss for whole-stage colorectal disease diagnosis. The dual-branch architecture captures unbiased features representing diverse localities to suppress intra-class variation. The progressive loss function considers inter-class relationships, using prior knowledge of disease evolution to guide classification. Furthermore, a novel Grain-CAM is designed to interpret IPNet by visualizing pixel-wise attention maps from shallow to deep layers, providing regions semantically related to IPNet's progressive classification. We conducted whole-stage diagnosis on two image modalities, i.e., colorectal lesion classification on 129,893 endoscopic optical images and rectal tumor T-staging on 11,072 endoscopic ultrasound images. IPNet is shown to surpass other state-of-the-art algorithms, achieving accuracies of 93.15% and 89.62%, respectively. In particular, it establishes effective decision boundaries for challenges like polyp vs. adenoma and T2 vs. T3. The results demonstrate an explainable attempt at colorectal lesion classification at a whole-stage level, and rectal tumor T-staging by endoscopic ultrasound is also unprecedentedly explored. IPNet is expected to be further applied, assisting physicians in whole-stage disease diagnosis and enhancing diagnostic interpretability.

EI 1558-254X DA 2024-09-21 UT MEDLINE:39298304 PM 39298304 ER

AU Zhang, Yirui Zou, Yanni Liu, Peter X

Point Cloud Registration in Laparoscopic Liver Surgery Using Keypoint Correspondence Registration Network.

Laparoscopic liver surgery is a newly developed minimally invasive technique and represents an inevitable trend in the future development of surgical methods. By using augmented reality (AR) technology to overlay preoperative CT models with intraoperative laparoscopic videos, surgeons can accurately locate blood vessels and tumors, significantly enhancing the safety and precision of surgeries. Point cloud registration technology is key to achieving this effect. However, there are two major challenges in registering the CT model with the point cloud surface reconstructed from intraoperative laparoscopy. First, the surface features of the organ are not prominent. Second, due to the limited field of view of the laparoscope, the reconstructed surface typically represents only a very small portion of the entire organ. To address these issues, this paper proposes the keypoint correspondence registration network (KCR-Net). This network first uses the neighborhood feature fusion module (NFFM) to aggregate and interact features from different regions and structures within a pair of point clouds to obtain comprehensive feature representations. Then, through correspondence generation, it directly generates keypoints and their corresponding weights, with keypoints located in the common structures of the point clouds to be registered, and corresponding weights learned automatically by the network. This approach enables accurate point cloud registration even under conditions of extremely low overlap. Experiments conducted on the ModelNet40, 3Dircadb and DePoLL datasets demonstrate that our method achieves excellent registration accuracy and is capable of meeting the requirements of real-world scenarios.
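Once keypoint correspondences and per-point weights are available, the rigid transform itself follows from the classical weighted Kabsch/SVD solution. That standard closing step (not KCR-Net itself) is sketched below with synthetic data.

    import numpy as np

    def weighted_kabsch(src, dst, w):
        """Rigid (R, t) minimizing the weighted distance between correspondences."""
        w = w / w.sum()
        mu_s, mu_d = w @ src, w @ dst                  # weighted centroids
        H = (src - mu_s).T @ np.diag(w) @ (dst - mu_d)
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))         # guard against reflection
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
        return R, mu_d - R @ mu_s

    rng = np.random.default_rng(2)
    src = rng.normal(size=(100, 3))
    th = 0.7                                           # known test rotation about z
    R_true = np.array([[np.cos(th), -np.sin(th), 0.0],
                       [np.sin(th),  np.cos(th), 0.0],
                       [0.0,         0.0,        1.0]])
    dst = src @ R_true.T + np.array([1.0, -2.0, 0.5])
    R, t = weighted_kabsch(src, dst, np.ones(100))
    print(np.allclose(R @ src.T + t[:, None], dst.T, atol=1e-6))   # True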

AU Kaji, Shizuo Tanabe, Naoya Maetani, Tomoki Shiraishi, Yusuke Sakamoto, Ryo Oguma, Tsuyoshi Suzuki, Katsuhiro Terada, Kunihiko Fukui, Motonari Muro, Shigeo Sato, Susumu Hirai, Toyohiro

Quantification of Airway Structures by Persistent Homology

We propose two types of novel morphological metrics for quantifying the geometry of tubular structures on computed tomography (CT) images. We apply our metrics to identify irregularities in the airway of patients with chronic obstructive pulmonary disease (COPD) and demonstrate that they provide complementary information to the conventional metrics used to assess COPD, such as the tissue density distribution in lung parenchyma and the wall area ratio of the segmented airway. The three-dimensional shape of the airway and its abstraction as a rooted tree with the root at the trachea carina are automatically extracted from a lung CT volume, and the two metrics are computed based on a mathematical tool called persistent homology; treeH0 quantifies the distribution of branch lengths to assess the complexity of the tree-like structure and radialH0 quantifies the irregularities in the luminal radius along the airway. We show our metrics are associated with clinical outcomes.
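The H0 (connected-component) part of persistent homology can be computed with a small union-find over a filtration. The generic sketch below treats all vertices as born at filtration value 0 and records one finite bar per component merge; the airway-specific filtration (e.g., geodesic distance from the carina) is an assumption left abstract here, and this is not the authors' implementation.

    def h0_bars(n_vertices, edges):
        """0-dim persistence bars of a graph filtration (edges: (u, v, value))."""
        parent = list(range(n_vertices))
        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i
        bars = []
        for u, v, f in sorted(edges, key=lambda e: e[2]):
            ru, rv = find(u), find(v)
            if ru != rv:                     # merging two components
                parent[ru] = rv
                bars.append((0.0, f))        # one component dies at value f
        return bars                          # survivors would be infinite bars

    # Toy "airway tree": path edges weighted by distance from the root.
    edges = [(0, 1, 1.0), (1, 2, 2.0), (1, 3, 2.5), (3, 4, 4.0)]
    print(h0_bars(5, edges))   # [(0.0, 1.0), (0.0, 2.0), (0.0, 2.5), (0.0, 4.0)]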

AU Chen, Haomin Dreizin, David Gomez, Catalina Zapaishchykova, Anna Unberath, Mathias

Interpretable Severity Scoring of Pelvic Trauma Through Automated Fracture Detection and Bayesian Inference.

Pelvic ring disruptions result from blunt injury mechanisms and are potentially lethal mainly due to associated injuries and massive pelvic hemorrhage. The severity of pelvic fractures in trauma victims is frequently assessed by grading the fracture according to the Tile AO/OTA classification in whole-body Computed Tomography (CT) scans. Due to the high volume of whole-body CT scans generated in trauma centers, the overall information content of a single whole-body CT scan and low manual CT reading speed, an automatic approach to Tile classification would provide substantial value, e.g., to prioritize the reading sequence of the trauma radiologists or enable them to focus on other major injuries in multi-trauma patients. In such a high-stakes scenario, an automated method for Tile grading should ideally be transparent such that the symbolic information provided by the method follows the same logic a radiologist or orthopedic surgeon would use to determine the fracture grade. This paper introduces an automated yet interpretable pelvic trauma decision support system to assist radiologists in fracture detection and Tile grading. To achieve interpretability despite processing high-dimensional whole-body CT images, we design a neurosymbolic algorithm that operates similarly to human interpretation of CT scans. The algorithm first detects relevant pelvic fractures on CTs with high specificity using Faster-RCNN. To generate robust fracture detections and associated detection (un)certainties, we perform test-time augmentation of the CT scans to apply fracture detection several times in a self-ensembling approach. The fracture detections are interpreted using a structural causal model based on clinical best practices to infer an initial Tile grade. We apply a Bayesian causal model to recover likely co-occurring fractures that may have been rejected initially due to the highly specific operating point of the detector, resulting in an updated list of detected fractures and a corresponding final Tile grade. Our method is transparent in that it provides fracture location and types, as well as information on important counterfactuals that would invalidate the system's recommendation. Our approach achieves an AUC of 0.89/0.74 for translational and rotational instability, which is comparable to radiologist performance. Despite being designed for human-machine teaming, our approach does not compromise on performance compared to previous black-box methods.
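The co-occurrence recovery step rests on ordinary Bayesian updating: confident evidence for one fracture raises the posterior of a fracture that frequently accompanies it, so an initially rejected detection can be revisited. The numbers and the two-fracture factorization below are invented purely for illustration and are not the paper's causal model.

    def posterior(prior, p_evidence_given_pos, p_evidence_given_neg):
        """P(B | evidence) by Bayes' rule for a binary variable B."""
        num = p_evidence_given_pos * prior
        return num / (num + p_evidence_given_neg * (1 - prior))

    p_B = 0.10   # assumed prior of fracture B in this cohort
    # Detector fired confidently on fracture A; assume A co-occurs with B in 70%
    # of B-positive cases but appears in only 15% of B-negative cases.
    print(round(posterior(p_B, 0.70, 0.15), 3))   # 0.341 -> B is re-examined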

AU Zhou, Lifang Jiang, Yu Li, Weisheng Hu, Jun Zheng, Shenhai

Shape-Scale Co-Awareness Network for 3D Brain Tumor Segmentation

The accurate segmentation of brain tumor is significant in clinical practice. Convolutional Neural Network (CNN)-based methods have made great progress in brain tumor segmentation due to powerful local modeling ability. However, brain tumors are frequently pattern-agnostic, i.e. variable in shape, size and location, which can not be effectively matched by traditional CNN-based methods with local and regular receptive fields. To address the above issues, we propose a shape-scale co-awareness network (S2CA-Net) for brain tumor segmentation, which can efficiently learn shape-aware and scale-aware features simultaneously to enhance pattern-agnostic representations. Primarily, three key components are proposed to accomplish the co-awareness of shape and scale. The Local-Global Scale Mixer (LGSM) decouples the extraction of local and global context by adopting the CNN-Former parallel structure, which contributes to obtaining finer hierarchical features. The Multi-level Context Aggregator (MCA) enriches the scale diversity of input patches by modeling global features across multiple receptive fields. The Multi-Scale Attentive Deformable Convolution (MS-ADC) learns the target deformation based on the multiscale inputs, which motivates the network to enforce feature constraints both in terms of scale and shape for optimal feature matching. Overall, LGSM and MCA focus on enhancing the scale-awareness of the network to cope with the size and location variations, while MS-ADC focuses on capturing deformation information for optimal shape matching. Finally, their effective integration prompts the network to perceive variations in shape and scale simultaneously, which can robustly tackle the variations in patterns of brain tumors. The experimental results on BraTS 2019, BraTS 2020, MSD BTS Task and BraTS2023-MEN show that S2CA-Net has superior overall performance in accuracy and efficiency compared to other state-of-the-art methods. Code: https://github.com/jiangyu945/S2CA-Net.

C1 Chongqing Univ Posts & Telecommun, Key Lab Image Cognit, Chongqing 400065, Peoples R China C1 Chongqing Univ Posts & Telecommun, Coll Software, Chongqing 400065, Peoples R China C1 Guizhou Univ, Key Lab Adv Mfg Technol, Minist Educ, Guiyang 550025, Guizhou, Peoples R China C1 Third Mil Med Univ, Southwest Hosp, Dept Neurol, Chongqing 400065, Peoples R China SN 0278-0062 EI 1558-254X DA 2024-07-22 UT WOS:001263692100005 PM 38386578 ER

AU Wang, Puyang Guo, Dazhou Zheng, Dandan Zhang, Minghui Yu, Haogang Sun, Xin Ge, Jia Gu, Yun Lu, Le Ye, Xianghua Jin, Dakai

Accurate Airway Tree Segmentation in CT Scans via Anatomy-aware Multi-class Segmentation and Topology-guided Iterative Learning.

Intrathoracic airway segmentation in computed tomography is a prerequisite for various respiratory disease analyses such as chronic obstructive pulmonary disease, asthma and lung cancer. Due to the low imaging contrast and the noise exacerbated at peripheral branches, as well as the topological complexity and the intra-class imbalance of the airway tree, it remains challenging for deep learning-based methods to segment the complete airway tree (i.e., to extract the deeper branches). Unlike other organs with simpler shapes or topology, the airway's complex tree structure imposes an unbearable burden to generate the "ground truth" label (up to 7 hours of manual or 3 hours of semi-automatic annotation per case). Most of the existing airway datasets are incompletely labeled/annotated, thus limiting the completeness of computer-segmented airways. In this paper, we propose a new anatomy-aware multi-class airway segmentation method enhanced by topology-guided iterative self-learning. Based on the natural airway anatomy, we formulate a simple yet highly effective anatomy-aware multi-class segmentation task to intuitively handle the severe intra-class imbalance of the airway. To solve the incomplete labeling issue, we propose a tailored iterative self-learning scheme to segment toward the complete airway tree. To generate pseudo-labels with higher sensitivity (while retaining similar specificity), we introduce a novel breakage attention map and design a topology-guided pseudo-label refinement method that iteratively connects the broken branches commonly present in initial pseudo-labels. Extensive experiments have been conducted on four datasets, including two public challenges. The proposed method achieves the top performance in both the EXACT'09 challenge (average score) and the ATM'22 challenge (weighted average score). On a public BAS dataset and a private lung cancer dataset, our method significantly improves on previous leading approaches by extracting at least (absolute) 6.1% more detected tree length and 5.2% more tree branches, while maintaining comparable precision.
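Disconnected pseudo-label fragments ("breakages") of the kind the refinement targets are easy to expose with a connected-component pass. The SciPy sketch below is a generic illustration on a toy mask, not the paper's breakage-attention mechanism.

    import numpy as np
    from scipy import ndimage

    mask = np.zeros((1, 20, 20), dtype=bool)        # toy 3D airway pseudo-label
    mask[0, 2:8, 5] = True                          # main "trunk"
    mask[0, 11:15, 5] = True                        # fragment separated by a gap

    labels, n = ndimage.label(mask)                      # default 6-connectivity in 3D
    sizes = ndimage.sum(mask, labels, range(1, n + 1))   # voxels per component
    main = 1 + int(np.argmax(sizes))
    fragments = (labels > 0) & (labels != main)          # candidate breakages
    print(n, "components;", int(fragments.sum()), "voxels flagged as breakage")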

AU Liu, Jiaxuan Zhang, Hui Tian, Jiang-Huai Su, Yingjian Chen, Yurong Wang, Yaonan

R2D2-GAN: Robust Dual Discriminator Generative Adversarial Network for Microscopy Hyperspectral Image Super-Resolution.

High-resolution microscopy hyperspectral (HS) images can provide highly detailed spatial and spectral information, enabling the identification and analysis of biological tissues at a microscale level. Recently, significant efforts have been devoted to enhancing the resolution of HS images by leveraging high spatial resolution multispectral (MS) images. However, the inherent hardware constraints lead to a significant distribution gap between HS and MS images, posing challenges for image super-resolution within biomedical domains. This discrepancy may arise from various factors, including variations in camera imaging principles (e.g., snapshot and push-broom imaging), shooting positions, and the presence of noise interference. To address these challenges, we introduced a unique unsupervised super-resolution framework named R2D2-GAN. This framework utilizes a generative adversarial network (GAN) to efficiently merge the two data modalities and improve the resolution of microscopy HS images. Traditionally, supervised approaches have relied on intuitive and sensitive loss functions, such as mean squared error (MSE). Our method, trained in a real-world unsupervised setting, benefits from exploiting consistent information across the two modalities. It employs a game-theoretic strategy and dynamic adversarial loss, rather than relying solely on fixed training strategies for reconstruction loss. Furthermore, we have augmented our proposed model with a central consistency regularization (CCR) module, aiming to further enhance the robustness of the R2D2-GAN. Our experimental results show that the proposed method is accurate and robust for super-resolution images. We specifically tested our proposed method on both a real and a synthetic dataset, obtaining promising results in comparison to other state-of-the-art methods. Our code and datasets are accessible through Multimedia Content.

AU Zhang, Zhenxuan Yu, Chengjin Zhang, Heye Gao, Zhifan

Embedding Tasks Into the Latent Space: Cross-Space Consistency for Multi-Dimensional Analysis in Echocardiography

Multi-dimensional analysis in echocardiography has attracted attention due to its potential for clinical indices quantification and computer-aided diagnosis. It can utilize various information to provide the estimation of multiple cardiac indices. However, it still faces the challenge of inter-task conflict, owing to regional confusion, global abnormalities, and time-accumulated errors. Task mapping methods have the potential to address inter-task conflict. However, they may overlook the inherent differences between tasks, especially for multi-level tasks (e.g., pixel-level, image-level, and sequence-level tasks). This may lead to inappropriate local and spurious task constraints. We propose cross-space consistency (CSC) to overcome the challenge. The CSC embeds multi-level tasks at the same level to reduce inherent task differences. This allows multi-level task features to be consistent in a unified latent space. The latent space extracts task-common features and constrains the distance in these features. This constrains the task weight region that satisfies multiple task conditions. Extensive experiments compare the CSC with fifteen state-of-the-art echocardiographic analysis methods on five datasets (10,908 patients). The results show that the CSC can provide left ventricular (LV) segmentation (DSC = 0.932), keypoint detection (MAE = 3.06 mm), and keyframe identification (accuracy = 0.943). These results demonstrate that our method can provide a multi-dimensional analysis of cardiac function and is robust on large-scale datasets.

AU Li, Xing Jing, Kaili Yang, Yan Wang, Yongbo Ma, Jianhua Zheng, Hairong Xu, Zongben

Noise-Generating and Imaging Mechanism Inspired Implicit Regularization Learning Network for Low Dose CT Reconstruction

Low-dose computed tomography (LDCT) helps to reduce radiation risks in CT scanning while maintaining image quality, which involves a consistent pursuit of lower incident rays and higher reconstruction performance. Although deep learning approaches have achieved encouraging success in LDCT reconstruction, most of them treat the task as a general inverse problem in either the image domain or the dual (sinogram and image) domains. Such frameworks have not considered the original noise generation of the projection data and suffer from limited performance improvement for the LDCT task. In this paper, we propose a novel reconstruction model based on the noise-generating and imaging mechanism in the full domain, which fully considers the statistical properties of the intrinsic noises in LDCT and the prior information in the sinogram and image domains. To solve the model, we propose an optimization algorithm based on the proximal gradient technique. Specifically, we derive the approximate solutions of the integer programming problem on the projection data theoretically. Instead of hand-crafting the sinogram and image regularizers, we propose to unroll the optimization algorithm into a deep network. The network implicitly learns the proximal operators of the sinogram and image regularizers with two deep neural networks, providing a more interpretable and effective reconstruction procedure. Numerical results demonstrate that our proposed method achieves improvements of > 2.9 dB in peak signal-to-noise ratio and > 1.4% in the structural similarity metric, and a reduction of > 9 HU in root mean square error, over current state-of-the-art LDCT methods.
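The unrolled network builds on the classical proximal gradient iteration, which is easy to state on a toy problem: for min_x 0.5||Ax - y||^2 + lambda*||x||_1, alternate a gradient step with a soft-threshold proximal step (ISTA). In the paper's network the hand-crafted prox is replaced by learned sinogram- and image-domain operators; the NumPy sketch below, with invented sizes, shows only the classical skeleton.

    import numpy as np

    rng = np.random.default_rng(3)
    A = rng.normal(size=(80, 40))                 # toy forward operator
    x_true = np.zeros(40); x_true[[5, 17, 33]] = [1.0, -2.0, 1.5]
    y = A @ x_true

    lam = 0.1
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1 / Lipschitz constant of the gradient
    x = np.zeros(40)
    for _ in range(300):
        g = x - step * A.T @ (A @ x - y)                          # gradient step
        x = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)  # soft-threshold prox
    print(np.round(x[[5, 17, 33]], 2))            # approximately [1.0, -2.0, 1.5]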

AU Onishi, Yuya Hashimoto, Fumio Ote, Kibo Ota, Ryosuke

Whole Reconstruction-Free System Design for Direct Positron Emission Imaging From Image Generation to Attenuation Correction

Direct positron emission imaging (dPEI), which does not require a mathematical reconstruction step, is a next-generation molecular imaging modality. To maximize the practical applicability of the dPEI system to clinical practice, we introduce a novel reconstruction-free image-formation method called direct μ(Compton) imaging, which directly localizes the interaction position of Compton scattering from the annihilation photons in a three-dimensional space by utilizing the same compact geometry as that for dPEI, involving ultrafast time-of-flight radiation detectors. This unique imaging method not only provides the anatomical information about an object but can also be applied to attenuation correction of dPEI images. Evaluations through Monte Carlo simulation showed that functional and anatomical hybrid images can be acquired using this multimodal imaging system. By fusing the images, it is possible to simultaneously access various object data, which ensures the synergistic effect of the two imaging methodologies. In addition, attenuation correction improves the quantification of dPEI images. The realization of the whole reconstruction-free imaging system from image generation to quantitative correction provides a new perspective in molecular imaging.

AU Ta, Kevinminh Ahn, Shawn S. Thorn, Stephanie L. Stendahl, John C. Zhang, Xiaoran Langdon, Jonathan Staib, Lawrence H. Sinusas, Albert J. Duncan, James S.

Multi-Task Learning for Motion Analysis and Segmentation in 3D Echocardiography

Characterizing left ventricular deformation and strain using 3D+time echocardiography provides useful insights into cardiac function and can be used to detect and localize myocardial injury. To achieve this, it is imperative to obtain accurate motion estimates of the left ventricle. In many strain analysis pipelines, this step is often accompanied by a separate segmentation step; however, recent works have shown both tasks to be highly related and can be complementary when optimized jointly. In this work, we present a multi-task learning network that can simultaneously segment the left ventricle and track its motion between multiple time frames. Two task-specific networks are trained using a composite loss function. Cross-stitch units combine the activations of these networks by learning shared representations between the tasks at different levels. We also propose a novel shape-consistency unit that encourages motion propagated segmentations to match directly predicted segmentations. Using a combined synthetic and in-vivo 3D echocardiography dataset, we demonstrate that our proposed model can achieve excellent estimates of left ventricular motion displacement and myocardial segmentation. Additionally, we observe strong correlation of our image-based strain measurements with crystal-based strain measurements as well as good correspondence with SPECT perfusion mappings. Finally, we demonstrate the clinical utility of the segmentation masks in estimating ejection fraction and sphericity indices that correspond well with benchmark measurements.
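Cross-stitch units (Misra et al., CVPR 2016) are a published construct with a very small footprint: a learnable mixing matrix linearly recombines same-depth activations of the two task networks. A minimal PyTorch rendering, with an assumed near-identity initialization:

    import torch
    import torch.nn as nn

    class CrossStitch(nn.Module):
        """Learnably mixes activations from two task-specific networks."""
        def __init__(self):
            super().__init__()
            # 2x2 mixing matrix, initialized near identity so tasks start separate.
            self.alpha = nn.Parameter(torch.tensor([[0.9, 0.1], [0.1, 0.9]]))

        def forward(self, a, b):          # a, b: same-shape task activations
            mixed_a = self.alpha[0, 0] * a + self.alpha[0, 1] * b
            mixed_b = self.alpha[1, 0] * a + self.alpha[1, 1] * b
            return mixed_a, mixed_b

    a = torch.rand(2, 8, 16, 16)          # e.g., segmentation-branch features
    b = torch.rand(2, 8, 16, 16)          # e.g., motion-branch features
    xa, xb = CrossStitch()(a, b)
    print(xa.shape, xb.shape)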

AU Liang, Yinhao Tang, Wenjie Wang, Ting Ng, Wing W. Y. Chen, Siyi Jiang, Kuiming Wei, Xinhua Jiang, Xinqing Guo, Yuan

HRadNet: A Hierarchical Radiomics-Based Network for Multicenter Breast Cancer Molecular Subtypes Prediction

Breast cancer is a heterogeneous disease, where molecular subtypes of breast cancer are closely related to the treatment and prognosis. Therefore, the goal of this work is to differentiate between luminal and non-luminal subtypes of breast cancer. The hierarchical radiomics network (HRadNet) is proposed for breast cancer molecular subtypes prediction based on dynamic contrast-enhanced magnetic resonance imaging. HRadNet fuses multilayer features with the metadata of images to take advantage of conventional radiomics methods and general convolutional neural networks. A two-stage training mechanism is adopted to improve the generalization capability of the network for multicenter breast cancer data. The ablation study shows the effectiveness of each component of HRadNet. Furthermore, the influence of features from different layers and metadata fusion are also analyzed. It reveals that selecting certain layers of features for a specified domain can make further performance improvements. Experimental results on three data sets from different devices demonstrate the effectiveness of the proposed network. HRadNet also has good performance when transferring to other domains without fine-tuning.

AU Ahmadi, N. Tsang, M. Y. Gu, A. N. Tsang, T. S. M. Abolmaesumi, P.

Transformer-Based Spatio-Temporal Analysis for Classification of Aortic Stenosis Severity From Echocardiography Cine Series

Aortic stenosis (AS) is characterized by restricted motion and calcification of the aortic valve and is the deadliest valvular cardiac disease. Assessment of AS severity is typically done by expert cardiologists using Doppler measurements of valvular flow from echocardiography. However, this limits the assessment of AS to hospitals staffed with experts to provide comprehensive echocardiography service. As accurate Doppler acquisition requires significant clinical training, in this paper, we present a deep learning framework to determine the feasibility of AS detection and severity classification based only on two-dimensional echocardiographic data. We demonstrate that our proposed spatio-temporal architecture effectively and efficiently combines both anatomical features and motion of the aortic valve for AS severity classification. Our model can process cardiac echo cine series of varying length and can identify, without explicit supervision, the frames that are most informative towards the AS diagnosis. We present an empirical study on how the model learns phases of the heart cycle without any supervision and frame-level annotations. Our architecture outperforms state-of-the-art results on a private and a public dataset, achieving 95.2% and 91.5% in AS detection, and 78.1% and 83.8% in AS severity classification on the private and public datasets, respectively. Notably, due to the lack of a large public video dataset for AS, we made slight adjustments to our architecture for the public dataset. Furthermore, our method addresses common problems in training deep networks with clinical ultrasound data, such as a low signal-to-noise ratio and frequently uninformative frames. Our source code is available at: https://github.com/neda77aa/FTC.git

AU Li, Yiyue Qian, Guangwu Jiang, Xiaoshuang Jiang, Zekun Wen, Wen Zhang, Shaoting Li, Kang Lao, Qicheng

Hierarchical-Instance Contrastive Learning for Minority Detection on Imbalanced Medical Datasets

Deep learning methods are often hampered by issues such as data imbalance and data hunger. In medical imaging, malignant or rare diseases frequently belong to minority classes in the dataset, characterized by diversified distributions. Besides that, insufficient labels and unseen cases also present conundrums for training on the minority classes. To confront the stated problems, we propose a novel Hierarchical-instance Contrastive Learning (HCLe) method for minority detection that involves only data from the majority class in the training stage. To tackle the inconsistent intra-class distribution in majority classes, our method introduces two branches, where the first branch employs an auto-encoder network augmented with three constraint functions to effectively extract image-level features, and the second branch designs a novel contrastive learning network that takes into account the consistency of features among hierarchical samples from majority classes. The proposed method is further refined with a diverse mini-batch strategy, enabling the identification of minority classes under multiple conditions. Extensive experiments have been conducted to evaluate the proposed method on three datasets of different diseases and modalities. The experimental results show that the proposed method outperforms the state-of-the-art methods.
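The contrastive ingredient can be pictured with the standard InfoNCE/NT-Xent objective over paired embeddings; the generic PyTorch sketch below is a stand-in only and does not encode the paper's hierarchical-instance formulation or its constraint functions.

    import torch
    import torch.nn.functional as F

    def info_nce(z1, z2, tau=0.1):
        """Generic InfoNCE: z1[i] and z2[i] are positives, all others negatives."""
        z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
        logits = z1 @ z2.T / tau                   # cosine similarities
        targets = torch.arange(z1.size(0))         # positives on the diagonal
        return F.cross_entropy(logits, targets)

    z1, z2 = torch.randn(16, 128), torch.randn(16, 128)
    print(float(info_nce(z1, z2)))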

AU Mei, Lanzhuju Fang, Yu Zhao, Yue Zhou, Xiang Sean Zhu, Min Cui, Zhiming Shen, Dinggang

DTR-Net: Dual-Space 3D Tooth Model Reconstruction From Panoramic X-Ray Images

In digital dentistry, cone-beam computed tomography (CBCT) can provide complete 3D tooth models, yet it suffers from long-standing concerns over excessive radiation dose and higher expense. Therefore, 3D tooth model reconstruction from a 2D panoramic X-ray image is more cost-effective, and has attracted great interest in clinical applications. In this paper, we propose a novel dual-space framework, namely DTR-Net, to reconstruct 3D tooth models from 2D panoramic X-ray images in both the image and geometric spaces. Specifically, in the image space, we apply a 2D-to-3D generative model to recover the intensities of the CBCT image, guided by a task-oriented tooth segmentation network in a collaborative training manner. Meanwhile, in the geometric space, we benefit from an implicit function network in the continuous space, learning with points to capture complicated tooth shapes with geometric properties. Experimental results demonstrate that our proposed DTR-Net achieves state-of-the-art performance both quantitatively and qualitatively in 3D tooth model reconstruction, indicating its potential application in dental practice.

AU Noichl, Wolfgang De Marco, Fabio Willer, Konstantin Urban, Theresa Frank, Manuela Schick, Rafael Gleich, Bernhard Hehn, Lorenz Gustschin, Alex Meyer, Pascal Koehler, Thomas Maack, Ingo Engel, Klaus-Jurgen Lundt, Bernd Renger, Bernhard Fingerle, Alexander Pfeiffer, Daniela Rummeny, Ernst Herzen, Julia Pfeiffer, Franz

Correction for Mechanical Inaccuracies in a Scanning Talbot-Lau Interferometer

Grating-based X-ray phase-contrast and in particular dark-field radiography are promising new imaging modalities for medical applications. Currently, the potential advantage of dark-field imaging in early-stage diagnosis of pulmonary diseases in humans is being investigated. These studies make use of a comparatively large scanning interferometer at short acquisition times, which comes at the expense of a significantly reduced mechanical stability as compared to tabletop laboratory setups. Vibrations create random fluctuations of the grating alignment, causing artifacts in the resulting images. Here, we describe a novel maximum likelihood method for estimating this motion, thereby preventing these artifacts. It is tailored to scanning setups and does not require any sample-free areas. Unlike any previously described method, it accounts for motion in between as well as during exposures.

AU Wang, Hong Xie, Qi Zeng, Dong Ma, Jianhua Meng, Deyu Zheng, Yefeng

OSCNet: Orientation-Shared Convolutional Network for CT Metal Artifact Learning

X-ray computed tomography (CT) has been broadly adopted in clinical applications for disease diagnosis and image-guided interventions. However, metals within patients always cause unfavorable artifacts in the recovered CT images. Albeit attaining promising reconstruction results for this metal artifact reduction (MAR) task, most of the existing deep-learning-based approaches have some limitations. The critical issue is that most of these methods have not fully exploited the important prior knowledge underlying this specific MAR task. Therefore, in this paper, we carefully investigate the inherent characteristics of metal artifacts which present rotationally symmetrical streaking patterns. Then we specifically propose an orientation-shared convolution representation mechanism to adapt such physical prior structures and utilize Fourier-series-expansion-based filter parametrization for modelling artifacts, which can finely separate metal artifacts from body tissues. By adopting the classical proximal gradient algorithm to solve the model and then utilizing the deep unfolding technique, we easily build the corresponding orientation-shared convolutional network, termed as OSCNet. Furthermore, considering that different sizes and types of metals would lead to different artifact patterns (e.g., intensity of the artifacts), to better improve the flexibility of artifact learning and fully exploit the reconstructed results at iterative stages for information propagation, we design a simple-yet-effective sub-network for the dynamic convolution representation of artifacts. By easily integrating the sub-network into the proposed OSCNet framework, we further construct a more flexible network structure, called OSCNet+, which improves the generalization performance. Through extensive experiments conducted on synthetic and clinical datasets, we comprehensively substantiate the effectiveness of our proposed methods. Code will be released at https://github.com/hongwang01/OSCNet.

AU Zhou, Lianyu Yu, Lequan Wang, Liansheng

RECIST-Induced Reliable Learning: Geometry-Driven Label Propagation for Universal Lesion Segmentation

Automatic universal lesion segmentation (ULS) from Computed Tomography (CT) images can ease the burden of radiologists and provide a more accurate assessment than the current Response Evaluation Criteria In Solid Tumors (RECIST) guideline measurement. However, this task is underdeveloped due to the absence of large-scale pixel-wise labeled data. This paper presents a weakly-supervised learning framework to utilize the large-scale existing lesion databases in hospital Picture Archiving and Communication Systems (PACS) for ULS. Unlike previous methods that construct pseudo surrogate masks for fully supervised training through shallow interactive segmentation techniques, we propose to unearth the implicit information from RECIST annotations and thus design a unified RECIST-induced reliable learning (RiRL) framework. In particular, we introduce a novel label generation procedure and an on-the-fly soft label propagation strategy to avoid noisy training and poor generalization problems. The former, named RECIST-induced geometric labeling, uses clinical characteristics of RECIST to preliminarily and reliably propagate the label. With the labeling process, a trimap divides the lesion slices into three regions, comprising certain foreground, certain background, and unclear regions, which consequently enables a strong and reliable supervision signal over a wide region. A topological knowledge-driven graph is built to conduct on-the-fly label propagation toward the optimal segmentation boundary. Experimental results on a public benchmark dataset demonstrate that the proposed method surpasses the SOTA RECIST-based ULS methods by a large margin. Our approach surpasses SOTA approaches by over 2.0%, 1.5%, 1.4%, and 1.6% Dice with ResNet101, ResNet50, HRNet, and ResNest50 backbones.
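The trimap idea can be illustrated directly: erode an initial lesion estimate to obtain certain foreground, dilate and invert it for certain background, and treat the ring in between as the unclear region. The ellipse stand-in and three-pixel margins below are assumptions, not the paper's RECIST-derived geometry.

    import numpy as np
    from scipy import ndimage

    mask = np.zeros((64, 64), dtype=bool)
    yy, xx = np.ogrid[:64, :64]
    mask[((yy - 32) / 12.0) ** 2 + ((xx - 32) / 8.0) ** 2 <= 1] = True  # toy lesion

    fg = ndimage.binary_erosion(mask, iterations=3)     # certain foreground
    bg = ~ndimage.binary_dilation(mask, iterations=3)   # certain background
    unclear = ~(fg | bg)                                # weakly supervised ring
    trimap = np.where(fg, 2, np.where(unclear, 1, 0))
    print(np.bincount(trimap.ravel()))                  # pixels per region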

AU He, Linchao Du, Wenchao Liao, Peixi Fan, Fenglei Chen, Hu Yang, Hongyu Zhang, Yi

Solving Zero-Shot Sparse-View CT Reconstruction With Variational Score Solver.

Computed tomography (CT) stands as a ubiquitous medical diagnostic tool. Nonetheless, the radiation-related concerns associated with CT scans have raised public apprehensions. Mitigating radiation dosage in CT imaging poses an inherent challenge as it inevitably compromises the fidelity of CT reconstructions, impacting diagnostic accuracy. While previous deep learning techniques have exhibited promise in enhancing CT reconstruction quality, they remain hindered by the reliance on paired data, which is arduous to procure. In this study, we present a novel approach named Variational Score Solver (VSS) for solving sparse-view reconstruction without paired data. Our approach entails the acquisition of a probability distribution from densely sampled CT reconstructions, employing a latent diffusion model. High-quality reconstruction outcomes are achieved through an iterative process, wherein the diffusion model serves as the prior term, subsequently integrated with the data consistency term. Notably, rather than directly employing the prior diffusion model, we distill prior knowledge by finding the fixed point of the diffusion model. This framework empowers us to exercise precise control over the process. Moreover, we depart from modeling the reconstruction outcomes as deterministic values, opting instead for a distribution-based approach. This enables us to achieve more accurate reconstructions utilizing a trainable model. Our approach introduces a fresh perspective to the realm of zero-shot CT reconstruction, circumventing the constraints of supervised learning. Our extensive qualitative and quantitative experiments unequivocally demonstrate that VSS surpasses other contemporary unsupervised methods and achieves results comparable to the most advanced supervised methods in sparse-view reconstruction tasks. Codes are available at https://github.com/fpsandnoob/vss.
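The alternating prior/data-consistency iteration can be sketched as below, with two stand-ins of our own: a masked 2-D Fourier operator in place of the sparse-view CT forward model, and a Gaussian blur in place of the learned diffusion/score prior.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def data_consistency(x, y, mask, step=1.0):
    """Gradient step on ||M F x - y||^2, with a masked FFT standing in for the
    (sparse-view) forward model used in the paper."""
    resid = (np.fft.fft2(x) * mask - y) * mask
    return x - step * np.real(np.fft.ifft2(resid))

def prior_step(x):
    """Placeholder for the learned diffusion/score prior (here: mild smoothing)."""
    return gaussian_filter(x, sigma=0.8)

x_true = np.pad(np.ones((24, 24)), 20)       # 64x64 toy object
mask = np.random.rand(64, 64) < 0.3          # keep 30% of the measurement domain
y = np.fft.fft2(x_true) * mask               # undersampled measurements
x = np.zeros((64, 64))
for _ in range(50):                          # alternate prior and data-consistency terms
    x = data_consistency(prior_step(x), y, mask)
```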

AU Chen, Yixin Gao, Yajuan Zhu, Lei Shao, Wenrui Lu, Yanye Han, Hongbin Xie, Zhaoheng

PCNet: Prior Category Network for CT Universal Segmentation Model

Accurate segmentation of anatomical structures in Computed Tomography (CT) images is crucial for clinical diagnosis, treatment planning, and disease monitoring. The present deep learning segmentation methods are hindered by factors such as data scale and model size. Inspired by how doctors identify tissues, we propose a novel approach, the Prior Category Network (PCNet), that boosts segmentation performance by leveraging prior knowledge between different categories of anatomical structures. Our PCNet comprises three key components: prior category prompt (PCP), hierarchy category system (HCS), and hierarchy category loss (HCL). PCP utilizes Contrastive Language-Image Pretraining (CLIP), along with attention modules, to systematically define the relationships between anatomical categories as identified by clinicians. HCS guides the segmentation model in distinguishing between specific organs, anatomical structures, and functional systems through hierarchical relationships. HCL serves as a consistency constraint, fortifying the directional guidance provided by HCS to enhance the segmentation model's accuracy and robustness. We conducted extensive experiments to validate the effectiveness of our approach, and the results indicate that PCNet can generate a high-performance, universal model for CT segmentation. The PCNet framework also demonstrates a significant transferability on multiple downstream tasks. The ablation experiments show that the methodology employed in constructing the HCS is of critical importance.

AU Lou, Wei Wan, Xiang Li, Guanbin Lou, Xiaoying Li, Chenghang Gao, Feng Li, Haofeng

Structure Embedded Nucleus Classification for Histopathology Images

Nuclei classification provides valuable information for histopathology image analysis. However, the large variations in the appearance of different nuclei types cause difficulties in identifying nuclei. Most neural network based methods are affected by the local receptive field of convolutions, and pay less attention to the spatial distribution of nuclei or the irregular contour shape of a nucleus. In this paper, we first propose a novel polygon-structure feature learning mechanism that transforms a nucleus contour into a sequence of points sampled in order, and employ a recurrent neural network that aggregates the sequential change in distance between key points to obtain learnable shape features. Next, we convert a histopathology image into a graph structure with nuclei as nodes, and build a graph neural network to embed the spatial distribution of nuclei into their representations. To capture the correlations between the categories of nuclei and their surrounding tissue patterns, we further introduce edge features that are defined as the background textures between adjacent nuclei. Lastly, we integrate both polygon and graph structure learning mechanisms into a whole framework that can extract intra and inter-nucleus structural characteristics for nuclei classification. Experimental results show that the proposed framework achieves significant improvements compared to the previous methods.
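The polygon-structure feature idea, resampling a closed nucleus contour into ordered key points and reading off the sequence of inter-point distances (the kind of sequence a recurrent network could consume), can be sketched as follows; the sampling count and the toy elliptical contour are arbitrary choices.

```python
import numpy as np

def contour_to_sequence(contour, n_points=32):
    """Resample a closed contour (K x 2 array) into n_points ordered points by
    arc length, then return the sequential distances between consecutive points."""
    closed = np.vstack([contour, contour[:1]])                  # close the polygon
    seg = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(seg)])               # cumulative arc length
    t = np.linspace(0.0, arc[-1], n_points, endpoint=False)
    pts = np.stack([np.interp(t, arc, closed[:, d]) for d in range(2)], axis=1)
    dists = np.linalg.norm(np.diff(np.vstack([pts, pts[:1]]), axis=0), axis=1)
    return pts, dists            # (n_points, 2) key points and their distance sequence

theta = np.linspace(0, 2 * np.pi, 100, endpoint=False)
contour = np.stack([10 + 5 * np.cos(theta), 10 + 3 * np.sin(theta)], axis=1)
pts, dists = contour_to_sequence(contour)
```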

AU Zheng, Yi Conrad, Regan D. Green, Emily J. Burks, Eric J. Betke, Margrit Beane, Jennifer E. Kolachalama, Vijaya B.

Graph Attention-Based Fusion of Pathology Images and Gene Expression for Prediction of Cancer Survival

Multimodal machine learning models are being developed to analyze pathology images and other modalities, such as gene expression, to gain clinical and biological insights. However, most frameworks for multimodal data fusion do not fully account for the interactions between different modalities. Here, we present an attention-based fusion architecture that integrates a graph representation of pathology images with gene expression data and concomitantly learns from the fused information to predict patient-specific survival. In our approach, pathology images are represented as undirected graphs, and their embeddings are combined with embeddings of gene expression signatures using an attention mechanism to stratify tumors by patient survival. We show that our framework improves the survival prediction of human non-small cell lung cancers, outperforming existing state-of-the-art approaches that leverage multimodal data. Our framework can facilitate spatial molecular profiling to identify tumor heterogeneity using pathology images and gene expression data, complementing results obtained from more expensive spatial transcriptomic and proteomic technologies.
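A minimal sketch of attention-based fusion of a pathology-graph embedding with a gene-expression embedding; the shared mean query stands in for a learned query vector, and the embedding dimension is arbitrary.

```python
import numpy as np

def attention_fuse(h_img, h_gene):
    """Toy attention fusion: score each modality embedding with a shared query,
    softmax the scores, and return the weighted sum plus the weights."""
    H = np.stack([h_img, h_gene])               # (2, d) modality embeddings
    q = H.mean(axis=0)                          # stand-in for a learned query
    scores = H @ q / np.sqrt(H.shape[1])        # scaled dot-product scores
    w = np.exp(scores - scores.max()); w /= w.sum()
    return w @ H, w

rng = np.random.default_rng(0)
fused, weights = attention_fuse(rng.normal(size=128), rng.normal(size=128))
```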

AU Meng, Xiangxi Sun, Kaicong Xu, Jun He, Xuming Shen, Dinggang

Multi-Modal Modality-Masked Diffusion Network for Brain MRI Synthesis With Random Modality Missing

Synthesis of unavailable imaging modalities from available ones can generate modality-specific complementary information and enable multi-modality-based medical image diagnosis or treatment. Existing generative methods for medical image synthesis are usually based on cross-modal translation between acquired and missing modalities. These methods are usually dedicated to a specific missing modality and perform synthesis in one shot, which cannot deal with varying numbers of missing modalities flexibly or construct the mapping across modalities effectively. To address the above issues, in this paper, we propose a unified Multi-modal Modality-masked Diffusion Network (M2DN), tackling multi-modal synthesis from the perspective of "progressive whole-modality inpainting", instead of "cross-modal translation". Specifically, our M2DN considers the missing modalities as random noise and takes all the modalities as a unity in each reverse diffusion step. The proposed joint synthesis scheme performs synthesis for the missing modalities and self-reconstruction for the available ones, which not only enables synthesis for arbitrary missing scenarios, but also facilitates the construction of a common latent space and enhances the model representation ability. Besides, we introduce a modality-mask scheme to encode the availability status of each incoming modality explicitly in a binary mask, which is adopted as a condition for the diffusion model to further enhance the synthesis performance of our M2DN for arbitrary missing scenarios. We carry out experiments on two public brain MRI datasets for synthesis and downstream segmentation tasks. Experimental results demonstrate that our M2DN outperforms the state-of-the-art models significantly and shows great generalizability for arbitrary missing modalities.
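A sketch of the modality-mask scheme: missing modalities are replaced by random noise (to be progressively inpainted) and a binary availability mask is kept as an explicit condition. Modality names, shapes, and the noise model here are illustrative.

```python
import numpy as np

def build_m2dn_input(modalities, available, rng):
    """Stack all modalities, replace the missing ones with random noise, and keep
    a binary availability mask as a conditioning input."""
    x = np.stack(modalities)                               # (M, H, W)
    mask = np.asarray(available, dtype=np.float32)         # (M,) 1 = acquired
    noise = rng.standard_normal(x.shape)
    x = mask[:, None, None] * x + (1 - mask[:, None, None]) * noise
    return x, mask

rng = np.random.default_rng(0)
t1, t2, flair = (rng.random((64, 64)) for _ in range(3))
x, mask = build_m2dn_input([t1, t2, flair], available=[1, 0, 1], rng=rng)
```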

AU Cai, De Chen, Jie Zhao, Junhan Xue, Yuan Yang, Sen Yuan, Wei Feng, Min Weng, Haiyan Liu, Shuguang Peng, Yulong Zhu, Junyou Wang, Kanran Jackson, Christopher Tang, Hongping Huang, Junzhou Wang, Xiyue

HiCervix: An Extensive Hierarchical Dataset and Benchmark for Cervical Cytology Classification.

Cervical cytology is a critical screening strategy for early detection of pre-cancerous and cancerous cervical lesions. The challenge lies in accurately classifying various cervical cytology cell types. Existing automated cervical cytology methods are primarily trained on databases covering a narrow range of coarse-grained cell types, which fail to provide a comprehensive and detailed performance analysis that accurately represents real-world cytopathology conditions. To overcome these limitations, we introduce HiCervix, the most extensive, multi-center cervical cytology dataset currently available to the public. HiCervix includes 40,229 cervical cells from 4,496 whole slide images, categorized into 29 annotated classes. These classes are organized within a three-level hierarchical tree to capture fine-grained subtype information. To exploit the semantic correlation inherent in this hierarchical tree, we propose HierSwin, a hierarchical vision transformer-based classification network. HierSwin serves as a benchmark for detailed feature learning in both coarse-level and fine-level cervical cancer classification tasks. In our comprehensive experiments, HierSwin demonstrated remarkable performance, achieving 92.08% accuracy for coarse-level classification and 82.93% accuracy averaged across all three levels. When compared to board-certified cytopathologists, HierSwin achieved high classification performance (0.8293 versus 0.7359 averaged accuracy), highlighting its potential for clinical applications. This newly released HiCervix dataset, along with our benchmark HierSwin method, is poised to make a substantial impact on the advancement of deep learning algorithms for rapid cervical cancer screening and greatly improve cancer prevention and patient outcomes in real-world clinical settings.

AU Xu, Yanwu Sun, Li Peng, Wei Jia, Shuyue Morrison, Katelyn Perer, Adam Zandifar, Afrooz Visweswaran, Shyam Eslami, Motahhare Batmanghelich, Kayhan

MedSyn: Text-guided Anatomy-aware Synthesis of High-Fidelity 3D CT Images.

This paper introduces an innovative methodology for producing high-quality 3D lung CT images guided by textual information. While diffusion-based generative models are increasingly used in medical imaging, current state-of-the-art approaches are limited to low-resolution outputs and underutilize radiology reports' abundant information. The radiology reports can enhance the generation process by providing additional guidance and offering fine-grained control over the synthesis of images. Nevertheless, expanding text-guided generation to high-resolution 3D images poses significant memory and anatomical detail-preserving challenges. Addressing the memory issue, we introduce a hierarchical scheme that uses a modified UNet architecture. We start by synthesizing low-resolution images conditioned on the text, serving as a foundation for subsequent generators for complete volumetric data. To ensure the anatomical plausibility of the generated samples, we provide further guidance by generating vascular, airway, and lobular segmentation masks in conjunction with the CT images. The model demonstrates the capability to use textual input and segmentation tasks to generate synthesized images. Algorithmic comparative assessments and blind evaluations conducted by 10 board-certified radiologists indicate that our approach exhibits superior performance compared to the most advanced models based on GAN and diffusion techniques, especially in accurately retaining crucial anatomical features such as fissure lines and airways. This innovation introduces novel possibilities. This study focuses on two main objectives: (1) the development of a method for creating images based on textual prompts and anatomical components, and (2) the capability to generate new images conditioning on anatomical elements. The advancements in image generation can be applied to enhance numerous downstream tasks.

AU Xu, Ziang Rittscher, Jens Ali, Sharib

SSL-CPCD: Self-supervised learning with composite pretext-class discrimination for improved generalisability in endoscopic image analysis.

Data-driven methods have shown tremendous progress in medical image analysis. In this context, deep learning-based supervised methods are widely popular. However, they require a large amount of training data and face issues in generalisability to unseen datasets that hinder clinical translation. Endoscopic imaging data is characterised by large inter- and intra-patient variability that makes these models more challenging to learn representative features for downstream tasks. Thus, despite the publicly available datasets and datasets that can be generated within hospitals, most supervised models still underperform. While self-supervised learning has addressed this problem to some extent in natural scene data, there is a considerable performance gap in the medical image domain. In this paper, we propose to explore patch-level instance-group discrimination and penalisation of inter-class variation using additive angular margin within the cosine similarity metrics. Our novel approach enables models to learn to cluster similar representations, thereby improving their ability to provide better separation between different classes. Our results demonstrate significant improvement on all metrics over the state-of-the-art (SOTA) methods on the test set from the same and diverse datasets. We evaluated our approach for classification, detection, and segmentation. SSL-CPCD attains notable Top 1 accuracy of 79.77% in ulcerative colitis classification, an 88.62% mean average precision (mAP) for detection, and an 82.32% dice similarity coefficient for segmentation tasks. These represent improvements of over 4%, 2%, and 3%, respectively, compared to the baseline architectures. We demonstrate that our method generalises better than all SOTA methods to unseen datasets, reporting over 7% improvement.
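The additive angular margin on cosine similarity (ArcFace-style) behind the penalisation of inter-class variation can be sketched as below; the pretext-class setup and instance-group discrimination of SSL-CPCD are not reproduced, only the generic margin mechanics, with the scale s and margin m as illustrative values.

```python
import numpy as np

def additive_angular_margin_logits(emb, weight, labels, margin=0.5, scale=30.0):
    """Cosine similarity between L2-normalised embeddings and class prototypes,
    with an additive angular margin on the true class: s * cos(theta_y + m)."""
    e = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    w = weight / np.linalg.norm(weight, axis=1, keepdims=True)
    cos = np.clip(e @ w.T, -1.0, 1.0)                # (N, C) cosine similarities
    theta = np.arccos(cos)
    rows = np.arange(len(labels))
    cos[rows, labels] = np.cos(theta[rows, labels] + margin)
    return scale * cos                               # feed to softmax cross-entropy

rng = np.random.default_rng(0)
logits = additive_angular_margin_logits(
    rng.normal(size=(4, 16)), rng.normal(size=(10, 16)),
    labels=np.array([0, 3, 7, 7]))
```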

AU Chen, Yuanyuan Guo, Xiaoqing Xia, Yong Yuan, Yixuan

Disentangle Then Calibrate With Gradient Guidance: A Unified Framework for Common and Rare Disease Diagnosis

The computer-aided diagnosis (CAD) for rare diseases using medical imaging poses a significant challenge due to the requirement of large volumes of labeled training data, which is particularly difficult to collect for rare diseases. Although Few-shot learning (FSL) methods have been developed for this task, these methods focus solely on rare disease diagnosis, failing to preserve the performance in common disease diagnosis. To address this issue, we propose the Disentangle then Calibrate with Gradient Guidance (DCGG) framework under the setting of generalized few-shot learning, i.e., using one model to diagnose both common and rare diseases. The DCGG framework consists of a network backbone, a gradient-guided network disentanglement (GND) module, and a gradient-induced feature calibration (GFC) module. The GND module disentangles the network into a disease-shared component and a disease-specific component based on gradient guidance, and devises independent optimization strategies for both components, respectively, when learning from rare diseases. The GFC module transfers only the disease-shared channels of common-disease features to rare diseases, and incorporates the optimal transport theory to identify the best transport scheme based on the semantic relationship among different diseases. Based on the best transport scheme, the GFC module calibrates the distribution of rare-disease features at the disease-shared channels, deriving more informative rare-disease features for better diagnosis. The proposed DCGG framework has been evaluated on three public medical image classification datasets. Our results suggest that the DCGG framework achieves state-of-the-art performance in diagnosing both common and rare diseases.
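As a sketch of the optimal-transport component, a standard entropy-regularised Sinkhorn solver between uniform marginals is shown below; the random cost matrix stands in for the semantic distances among diseases that the paper actually uses.

```python
import numpy as np

def sinkhorn(C, reg=0.1, iters=200):
    """Entropy-regularised optimal transport: returns the transport plan between
    two uniform marginals for cost matrix C (Sinkhorn iterations)."""
    a = np.full(C.shape[0], 1.0 / C.shape[0])    # uniform source marginal
    b = np.full(C.shape[1], 1.0 / C.shape[1])    # uniform target marginal
    K = np.exp(-C / reg)
    v = np.ones_like(b)
    for _ in range(iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]           # plan whose marginals match a, b

rng = np.random.default_rng(0)
C = rng.random((5, 3))     # stand-in cost: 5 common diseases vs 3 rare diseases
P = sinkhorn(C)
```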

AU Ma, Jiabo Chen, Hao

Efficient Supervised Pretraining of Swin-Transformer for Virtual Staining of Microscopy Images

Fluorescence staining is an important technique in life science for labeling cellular constituents. However, it also suffers from being time-consuming, having difficulty in simultaneous labeling, etc. Thus, virtual staining, which does not rely on chemical labeling, has been introduced. Recently, deep learning models such as transformers have been applied to virtual staining tasks. However, their performance relies on large-scale pretraining, hindering their development in the field. To reduce the reliance on large amounts of computation and data, we construct a Swin-transformer model and propose an efficient supervised pretraining method based on the masked autoencoder (MAE). Specifically, we adopt downsampling and grid sampling to mask 75% of pixels and reduce the number of tokens. The pretraining time of our method is only 1/16 that of the original MAE. We also design a supervised proxy task to predict stained images with multiple styles instead of masked pixels. Additionally, most virtual staining approaches are based on private datasets and evaluated by different metrics, making a fair comparison difficult. Therefore, we develop a standard benchmark based on three public datasets and build a baseline for the convenience of future researchers. We conduct extensive experiments on three benchmark datasets, and the experimental results show the proposed method achieves the best performance both quantitatively and qualitatively. In addition, ablation studies are conducted, and the experimental results illustrate the effectiveness of the proposed pretraining method. The benchmark and code are available at https://github.com/birkhoffkiki/CAS-Transformer.
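The 75% masking via downsampling and grid sampling has a direct illustration: keeping one pixel from each 2x2 block (we assume phase (0, 0) of the four possible grid phases) leaves a quarter of the pixels visible, so the encoder sees 4x fewer tokens.

```python
import numpy as np

def grid_mask_75(img):
    """Keep one pixel per 2x2 block; 75% of pixels are masked and the token
    count drops by a factor of four before the encoder."""
    visible = img[0::2, 0::2]                  # 25% of pixels survive
    mask = np.ones(img.shape, dtype=bool)      # True = masked / to be predicted
    mask[0::2, 0::2] = False
    return visible, mask

img = np.arange(64 * 64, dtype=float).reshape(64, 64)
visible, mask = grid_mask_75(img)
assert mask.mean() == 0.75 and visible.shape == (32, 32)
```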

AU Meng, Qingjie Bai, Wenjia O'Regan, Declan P. Rueckert, Daniel

DeepMesh: Mesh-Based Cardiac Motion Tracking Using Deep Learning

3D motion estimation from cine cardiac magnetic resonance (CMR) images is important for the assessment of cardiac function and the diagnosis of cardiovascular diseases. Current state-of-the art methods focus on estimating dense pixel-/voxel-wise motion fields in image space, which ignores the fact that motion estimation is only relevant and useful within the anatomical objects of interest, e.g., the heart. In this work, we model the heart as a 3D mesh consisting of epi- and endocardial surfaces. We propose a novel learning framework, DeepMesh, which propagates a template heart mesh to a subject space and estimates the 3D motion of the heart mesh from CMR images for individual subjects. In DeepMesh, the heart mesh of the end-diastolic frame of an individual subject is first reconstructed from the template mesh. Mesh-based 3D motion fields with respect to the end-diastolic frame are then estimated from 2D short- and long-axis CMR images. By developing a differentiable mesh-to-image rasterizer, DeepMesh is able to leverage 2D shape information from multiple anatomical views for 3D mesh reconstruction and mesh motion estimation. The proposed method estimates vertex-wise displacement and thus maintains vertex correspondences between time frames, which is important for the quantitative assessment of cardiac function across different subjects and populations. We evaluate DeepMesh on CMR images acquired from the UK Biobank. We focus on 3D motion estimation of the left ventricle in this work. Experimental results show that the proposed method quantitatively and qualitatively outperforms other image-based and mesh-based cardiac motion tracking methods.

AU Challoob, Mohsin Gao, Yongsheng Busch, Andrew

Distinctive Phase Interdependency Model for Retinal Vasculature Delineation in OCT-Angiography Images

Automatic detection of retinal vasculature in optical coherence tomography angiography (OCTA) images faces several challenges such as the closely located capillaries, vessel discontinuity and high noise level. This paper introduces a new distinctive phase interdependency model to address these problems for delineating centerline patterns of the vascular network. We capture the inherent property of vascular centerlines by obtaining the inter-scale dependency information that exists between neighboring symmetrical wavelets in complex Poisson domain. In particular, the proposed phase interdependency model identifies vascular centerlines as the distinctive features that have high magnitudes over adjacent symmetrical coefficients whereas the coefficients caused by background noises are decayed rapidly along adjacent wavelet scales. The potential relationships between the neighboring Poisson coefficients are established based on the coherency of distinctive symmetrical wavelets. The proposed phase model is assessed on the OCTA-500 database (300 OCTA images + 200 OCT images), ROSE-1-SVC dataset (9 OCTA images), ROSE-1 (SVC+ DVC) dataset (9 OCTA images), and ROSE-2 dataset (22 OCTA images). The experiments on the clinically relevant OCTA images validate the effectiveness of the proposed method in achieving high-quality results. Our method produces average F-score of 0.822, 0.782, and 0.779 on ROSE-1-SVC, ROSE-1 (SVC+ DVC), and ROSE-2 datasets, respectively, and the F-score of 0.910 and 0.862 on OCTA_6mm and OCT_3mm datasets (OCTA-500 database), respectively, demonstrating its superior performance over the state-of-the-art benchmark methods.

AU Kurtz, Samuel Wattrisse, Bertrand Van Houten, Elijah E. W.

Minimizing Measurement-Induced Errors in Viscoelastic MR Elastography

The inverse problem that underlies Magnetic Resonance Elastography (MRE) is sensitive to the measurement data and the quality of the results of this tissue elasticity imaging process can be influenced both directly and indirectly by measurement noise. In this work, we apply a coupled adjoint field formulation of the viscoelastic constitutive parameter identification problem, where the indirect influence of noise through applied boundary conditions is avoided. A well-posed formulation of the coupled field problem is obtained through conditions applied to the adjoint field, relieving the computed displacement field from kinematic errors on the boundary. The theoretical framework for this formulation via a nearly incompressible, parallel subdomain-decomposition approach is presented, along with verification and a detailed exploration of the performance of the methods via a numerical simulation study. In addition, the advantages of this novel approach are demonstrated in-vivo in the human brain, showing the ability of the method to obtain viable tissue property maps in difficult configurations, enhancing the accuracy of the method.

AU Dan, Tingting Kim, Minjeong Kim, Won Hwa Wu, Guorong

Developing Explainable Deep Model for Discovering Novel Control Mechanism of Neuro-Dynamics

The human brain is a complex system composed of many components that interact with each other. A well-designed computational model, usually in the format of partial differential equations (PDEs), is vital to understand the working mechanisms that can explain dynamic and self-organized behaviors. However, the model formulation and parameters are often tuned empirically based on predefined domain-specific knowledge, which lags behind the emerging paradigm of discovering novel mechanisms from the unprecedented amount of spatiotemporal data. To address this limitation, we sought to link the power of deep neural networks with the physics principles of complex systems, which allows us to design explainable deep models for uncovering how the human brain (the most sophisticated complex system) maintains controllable functions while interacting with external stimulations. In the spirit of optimal control, we present a unified framework to design an explainable deep model that describes the dynamic behaviors of underlying neurobiological processes, allowing us to understand the latent control mechanism at a system level. We have uncovered the pathophysiological mechanism of Alzheimer's disease to the extent of the controllability of disease progression, where the dissected system-level understanding enables higher prediction accuracy for disease progression and better explainability of disease etiology than conventional (black-box) deep models.

AU Li, Jingxiong Zheng, Sunyi Shui, Zhongyi Zhang, Shichuan Yang, Linyi Sun, Yuxuan Zhang, Yunlong Li, Honglin Ye, Yuanxin van Ooijen, Peter M. A. Li, Kang Yang, Lin

Masked Conditional Variational Autoencoders for Chromosome Straightening

Karyotyping is of importance for detecting chromosomal aberrations in human disease. However, chromosomes easily appear curved in microscopic images, which prevents cytogeneticists from analyzing chromosome types. To address this issue, we propose a framework for chromosome straightening, which comprises a preliminary processing algorithm and a generative model called masked conditional variational autoencoders (MC-VAE). The processing method utilizes patch rearrangement to address the difficulty in erasing low degrees of curvature, providing reasonable preliminary results for the MC-VAE. The MC-VAE further straightens the results by leveraging chromosome patches conditioned on their curvatures to learn the mapping between banding patterns and conditions. During model training, we apply a masking strategy with a high masking ratio to train the MC-VAE with eliminated redundancy. This yields a non-trivial reconstruction task, allowing the model to effectively preserve chromosome banding patterns and structure details in the reconstructed results. Extensive experiments on three public datasets with two stain styles show that our framework surpasses the performance of state-of-the-art methods in retaining banding patterns and structure details. Compared to using real-world bent chromosomes, the use of high-quality straightened chromosomes generated by our proposed method can improve the performance of various deep learning models for chromosome classification by a large margin. Such a straightening approach has the potential to be combined with other karyotyping systems to assist cytogeneticists in chromosome analysis.

AU Kou, Zhengchang Lowerison, Matthew R. You, Qi Wang, Yike Song, Pengfei Oelze, Michael L.

High-Resolution Power Doppler Using Null Subtraction Imaging

To improve the spatial resolution of power Doppler (PD) imaging, we explored null subtraction imaging (NSI) as an alternative beamforming technique to delay-and-sum (DAS). NSI is a nonlinear beamforming approach that uses three different apodizations on receive and incoherently sums the beamformed envelopes. NSI uses a null in the beam pattern to improve the lateral resolution, which we apply here for improving PD spatial resolution both with and without contrast microbubbles. In this study, we used NSI with three types of singular value decomposition (SVD)-based clutter filters and noise equalization to generate high-resolution PD images. An element sensitivity correction scheme was also proposed as a crucial component of NSI-based PD imaging. First, a microbubble trace experiment was performed to evaluate the resolution improvement of NSI-based PD over traditional DAS-based PD. Then, both contrast-enhanced and contrast-free ultrasound PD images were generated from the scan of a rat brain. The cross-sectional profiles of the microbubble traces and microvessels were plotted. FWHM was also estimated to provide a quantitative metric. Furthermore, iso-frequency curves were calculated to provide a resolution evaluation metric over the global field of view. Up to six-fold resolution improvement was demonstrated by the FWHM estimate and four-fold resolution improvement was demonstrated by the iso-frequency curve from the NSI-based PD microvessel images compared to microvessel images generated by traditional DAS-based beamforming. A resolvability of 39 µm was measured from the NSI-based PD microvessel image. The computational cost of NSI-based PD was only increased by 40 percent over the DAS-based PD.
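A toy 1-D far-field sketch of the null-subtraction idea, under our own simplifying assumptions (a 64-element aperture, an alternating ±1 apodization whose pattern has an exact broadside null, small ±DC offsets, and one plausible incoherent combination; the paper's exact apodizations and normalization differ). The combined pattern retains energy only in a sliver around the null, several-fold narrower than the uniform (DAS) mainlobe.

```python
import numpy as np

N = 64
pos = np.arange(N) - (N - 1) / 2        # element positions about the array centre
osc = (-1.0) ** np.arange(N)            # alternating apodization: exact null at broadside
dc = np.ones(N)
eps = 0.01                              # small DC offset; controls the NSI mainlobe width
k = np.linspace(-np.pi, np.pi, 4001)    # normalised steering frequency
E = np.exp(1j * np.outer(k, pos))       # far-field steering matrix

def envelope(w):
    """Magnitude of the far-field beam pattern of apodization w."""
    return np.abs(E @ w)

nsi = envelope(osc + eps * dc) + envelope(osc - eps * dc) - 2 * envelope(osc)
das = envelope(dc)                      # uniform apodization (DAS) reference

# samples above half-maximum: the NSI mainlobe is several-fold narrower than DAS
print((nsi > nsi.max() / 2).sum(), (das > das.max() / 2).sum())
```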

AU Guo, Shouchang Fessler, Jeffrey A. Noll, Douglas C.

Manifold Regularizer for High-Resolution fMRI Joint Reconstruction and Dynamic Quantification

Oscillating Steady-State Imaging (OSSI) is a recently developed fMRI acquisition method that can provide 2 to 3 times higher SNR than standard fMRI approaches. However, because the OSSI signal exhibits a nonlinear oscillation pattern, one must acquire and combine $n_c$ (e.g., 10) OSSI images to get an image that is free of oscillation for fMRI, and fully sampled acquisitions would compromise temporal resolution. To improve temporal resolution and accurately model the nonlinearity of OSSI signals, instead of using subspace models that are not well suited for the data, we build the MR physics for OSSI signal generation as a regularizer for the undersampled reconstruction. Our proposed physics-based manifold model turns the disadvantages of OSSI acquisition into advantages and enables joint reconstruction and quantification. The OSSI manifold model (OSSIMM) outperforms subspace models and reconstructs high-resolution fMRI images with a factor of 12 acceleration and without spatial or temporal smoothing. Furthermore, OSSIMM can dynamically quantify important physics parameters, including $R_2^*$ maps, with a temporal resolution of 150 ms.
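The oscillation-removing combination of the $n_c$ images can be illustrated with a 2-norm combination, a common choice (our assumption here, with a toy cosine oscillation model): sampling the oscillation at $n_c$ evenly shifted phases makes the root-mean-square independent of the unknown phase.

```python
import numpy as np

rng = np.random.default_rng(0)
nc, H, W = 10, 32, 32
phase = rng.uniform(0, 2 * np.pi, (H, W))   # unknown voxel-wise oscillation phase
mag = rng.random((H, W))                    # underlying signal magnitude
# n_c acquisitions sample the oscillation at evenly shifted phases (toy model)
frames = np.stack([mag * np.abs(np.cos(phase + 2 * np.pi * i / nc))
                   for i in range(nc)])

combined = np.sqrt((frames ** 2).mean(axis=0))  # 2-norm combination over n_c images
assert np.allclose(combined, mag / np.sqrt(2))  # the unknown phase has dropped out
```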

AU Huang, Xingru Huang, Jian Zhao, Kai Zhang, Tianyun Li, Zhi Yue, Changpeng Chen, Wenhao Wang, Ruihao Chen, Xuanbin Zhang, Qianni Fu, Ying Wang, Yangyundou Guo, Yihao

SASAN: Spectrum-Axial Spatial Approach Networks for Medical Image Segmentation

Ophthalmic diseases such as central serous chorioretinopathy (CSC) significantly impair the vision of millions of people globally. Precise segmentation of choroid and macular edema is critical for diagnosing and treating these conditions. However, existing 3D medical image segmentation methods often fall short due to the heterogeneous nature and blurry features of these conditions, compounded by medical image clarity issues and noise interference arising from equipment and environmental limitations. To address these challenges, we propose the Spectrum Analysis Synergy Axial-Spatial Network (SASAN), an approach that innovatively integrates spectrum features using the Fast Fourier Transform (FFT). SASAN incorporates two key modules: the Frequency Integrated Neural Enhancer (FINE), which mitigates noise interference, and the Axial-Spatial Elementum Multiplier (ASEM), which enhances feature extraction. Additionally, we introduce the Self-Adaptive Multi-Aspect Loss ($\mathcal{L}_{SM}$), which balances image regions, distribution, and boundaries, adaptively updating weights during training. We compiled and meticulously annotated the Choroid and Macular Edema OCT Mega Dataset (CMED-18k), currently the world's largest dataset of its kind. Comparative analysis against 13 baselines shows our method surpasses these benchmarks, achieving the highest Dice scores and lowest HD95 in the CMED and OIMHS datasets.
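In the spirit of the FFT-based FINE module, a fixed circular low-pass in the Fourier domain is sketched below; the actual module learns how to reweight the spectrum rather than applying the hard mask assumed here.

```python
import numpy as np

def fft_lowpass_feature(x, keep=0.25):
    """Go to the Fourier domain, zero out the highest spatial frequencies (where
    speckle-like noise concentrates), and transform back."""
    F = np.fft.fftshift(np.fft.fft2(x))
    H, W = x.shape
    yy, xx = np.mgrid[0:H, 0:W]
    r = np.hypot(yy - H / 2, xx - W / 2)
    mask = r <= keep * min(H, W) / 2          # circular low-pass mask
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

x = np.random.default_rng(0).random((64, 64))
x_filtered = fft_lowpass_feature(x)
```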

AU Zhou, Houliang He, Lifang Chen, Brian Y Shen, Li Zhang, Yu

Multi-Modal Diagnosis of Alzheimer's Disease using Interpretable Graph Convolutional Networks.

The interconnection between brain regions in neurological disease encodes vital information for the advancement of biomarkers and diagnostics. Although graph convolutional networks are widely applied for discovering brain connection patterns that point to disease conditions, the potential of connection patterns that arise from multiple imaging modalities has yet to be fully realized. In this paper, we propose a multi-modal sparse interpretable GCN framework (SGCN) for the detection of Alzheimer's disease (AD) and its prodromal stage, known as mild cognitive impairment (MCI). In our experimentation, SGCN learned the sparse regional importance probability to find signature regions of interest (ROIs), and the connective importance probability to reveal disease-specific brain network connections. We evaluated SGCN on the Alzheimer's Disease Neuroimaging Initiative database with multi-modal brain images and demonstrated that the ROI features learned by SGCN were effective for enhancing AD status identification. The identified abnormalities were significantly correlated with AD-related clinical symptoms. We further interpreted the identified brain dysfunctions at the level of large-scale neural systems and sex-related connectivity abnormalities in AD/MCI. The salient ROIs and the prominent brain connectivity abnormalities interpreted by SGCN are considerably important for developing novel biomarkers. These findings contribute to a better understanding of the network-based disorder via multi-modal diagnosis and offer the potential for precision diagnostics. The source code is available at https://github.com/Houliang-Zhou/SGCN.

AU He, Yufang Liu, Zeyu Qi, Mingxin Ding, Shengwei Zhang, Peng Song, Fan Ma, Chenbin Wu, Huijie Cai, Ruxin Feng, Youdan Zhang, Haonan Zhang, Tianyi Zhang, Guanglei

PST-Diff: Achieving High-consistency Stain Transfer by Diffusion Models with Pathological and Structural Constraints.

Histopathological examinations heavily rely on hematoxylin and eosin (HE) and immunohistochemistry (IHC) staining. IHC staining can offer more accurate diagnostic details but it brings significant financial and time costs. Furthermore, either re-staining HE-stained slides or using adjacent slides for IHC may compromise the accuracy of pathological diagnosis due to information loss. To address these challenges, we develop PST-Diff, a method for generating virtual IHC images from HE images based on diffusion models, which allows pathologists to simultaneously view multiple staining results from the same tissue slide. To maintain the pathological consistency of the stain transfer, we propose the asymmetric attention mechanism (AAM) and latent transfer (LT) module in PST-Diff. Specifically, the AAM can retain more local pathological information of the source domain images through the design of asymmetric attention mechanisms, while ensuring the model's flexibility in generating virtual stained images that highly confirm to the target domain. Subsequently, the LT module transfers the implicit representations across different domains, effectively alleviating the bias introduced by direct connection and further enhancing the pathological consistency of PST-Diff. Furthermore, to maintain the structural consistency of the stain transfer, the conditional frequency guidance (CFG) module is proposed to precisely control image generation and preserve structural details according to the frequency recovery process. To conclude, the pathological and structural consistency constraints provide PST-Diff with effectiveness and superior generalization in generating stable and functionally pathological IHC images with the best evaluation score. In general, PST-Diff offers prospective application in clinical virtual staining and pathological image analysis.

AU Wang, Hongqiu Yang, Guang Zhang, Shichen Qin, Jing Guo, Yike Xu, Bo Jin, Yueming Zhu, Lei

Video-Instrument Synergistic Network for Referring Video Instrument Segmentation in Robotic Surgery.

Surgical instrument segmentation is fundamentally important for facilitating cognitive intelligence in robot-assisted surgery. Although existing methods have achieved accurate instrument segmentation results, they simultaneously generate segmentation masks of all instruments, which lack the capability to specify a target object and allow an interactive experience. This paper focuses on a novel and essential task in robotic surgery, i.e., Referring Surgical Video Instrument Segmentation (RSVIS), which aims to automatically identify and segment the target surgical instruments from each video frame, referred by a given language expression. This interactive feature offers enhanced user engagement and customized experiences, greatly benefiting the development of the next generation of surgical education systems. To achieve this, this paper constructs two surgery video datasets to promote the RSVIS research. Then, we devise a novel Video-Instrument Synergistic Network (VIS-Net) to learn both video-level and instrument-level knowledge to boost performance, while previous work only utilized video-level information. Meanwhile, we design a Graph-based Relation-aware Module (GRM) to model the correlation between multi-modal information (i.e., textual description and video frame) to facilitate the extraction of instrument-level information. Extensive experimental results on two RSVIS datasets exhibit that the VIS-Net can significantly outperform existing state-of-the-art referring segmentation methods. We will release our code and dataset for future research (Git).

AU Kreitner, Linus Paetzold, Johannes C. Rauch, Nikolaus Chen, Chen Hagag, Ahmed M. Fayed, Alaa E. Sivaprasad, Sobha Rausch, Sebastian Weichsel, Julian Menze, Bjoern H. Harders, Matthias Knier, Benjamin Rueckert, Daniel Menten, Martin J.

Synthetic Optical Coherence Tomography Angiographs for Detailed Retinal Vessel Segmentation Without Human Annotations

Optical coherence tomography angiography (OCTA) is a non-invasive imaging modality that can acquire high-resolution volumes of the retinal vasculature and aid the diagnosis of ocular, neurological and cardiac diseases. Segmenting the visible blood vessels is a common first step when extracting quantitative biomarkers from these images. Classical segmentation algorithms based on thresholding are strongly affected by image artifacts and limited signal-to-noise ratio. The use of modern, deep learning-based segmentation methods has been inhibited by a lack of large datasets with detailed annotations of the blood vessels. To address this issue, recent work has employed transfer learning, where a segmentation network is trained on synthetic OCTA images and is then applied to real data. However, the previously proposed simulations fail to faithfully model the retinal vasculature and do not provide effective domain adaptation. Because of this, current methods are unable to fully segment the retinal vasculature, in particular the smallest capillaries. In this work, we present a lightweight simulation of the retinal vascular network based on space colonization for faster and more realistic OCTA synthesis. We then introduce three contrast adaptation pipelines to decrease the domain gap between real and artificial images. We demonstrate the superior segmentation performance of our approach in extensive quantitative and qualitative experiments on three public datasets that compare our method to traditional computer vision algorithms and supervised training using human annotations. Finally, we make our entire pipeline publicly available, including the source code, pretrained models, and a large dataset of synthetic OCTA images.

AU Leynes, Andrew P. Deveshwar, Nikhil Nagarajan, Srikantan S. Larson, Peder E. Z.

Scan-Specific Self-Supervised Bayesian Deep Non-Linear Inversion for Undersampled MRI Reconstruction

Magnetic resonance imaging is subject to slow acquisition times due to the inherent limitations in data sampling. Recently, supervised deep learning has emerged as a promising technique for reconstructing sub-sampled MRI. However, supervised deep learning requires a large dataset of fully-sampled data. Although unsupervised or self-supervised deep learning methods have emerged to address the limitations of supervised deep learning approaches, they still require a database of images. In contrast, scan-specific deep learning methods learn and reconstruct using only the sub-sampled data from a single scan. Here, we introduce Scan-Specific Self-Supervised Bayesian Deep Non-Linear Inversion (DNLINV) that does not require an auto calibration scan region. DNLINV utilizes a Deep Image Prior-type generative modeling approach and relies on approximate Bayesian inference to regularize the deep convolutional neural network. We demonstrate our approach on several anatomies, contrasts, and sampling patterns and show improved performance over existing approaches in scan-specific calibrationless parallel imaging and compressed sensing.
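A minimal scan-specific, Deep-Image-Prior-style fitting loop (PyTorch) is sketched below, under our own toy assumptions: a masked 2-D FFT stands in for the undersampled MRI forward model, and a small untrained CNN generator is fitted to one scan's own measurements. DNLINV's approximate Bayesian inference over the network is omitted here.

```python
import torch

torch.manual_seed(0)
H = W = 32
x_true = torch.zeros(H, W)
x_true[8:24, 8:24] = 1.0                             # toy object
mask = (torch.rand(H, W) < 0.4).float()              # random k-space sampling pattern
y = torch.fft.fft2(x_true) * mask                    # measured, undersampled k-space

gen = torch.nn.Sequential(                           # tiny untrained generator f_theta
    torch.nn.Conv2d(1, 16, 3, padding=1), torch.nn.ReLU(),
    torch.nn.Conv2d(16, 16, 3, padding=1), torch.nn.ReLU(),
    torch.nn.Conv2d(16, 1, 3, padding=1))
z = torch.randn(1, 1, H, W)                          # fixed random code, never updated
opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
for _ in range(500):                                 # fit the single scan's own data
    x_hat = gen(z)[0, 0]
    loss = (torch.abs(torch.fft.fft2(x_hat) * mask - y) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```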

AU Wang, Haiqiao Ni, Dong Wang, Yi

Recursive Deformable Pyramid Network for Unsupervised Medical Image Registration

Complicated deformation problems are frequently encountered in medical image registration tasks. Although various advanced registration models have been proposed, accurate and efficient deformable registration remains challenging, especially for handling large volumetric deformations. To this end, we propose a novel recursive deformable pyramid (RDP) network for unsupervised non-rigid registration. Our network is a pure convolutional pyramid, which fully utilizes the advantages of the pyramid structure itself, but does not rely on any high-weight attentions or transformers. In particular, our network leverages a step-by-step recursion strategy with the integration of high-level semantics to predict the deformation field from coarse to fine, while ensuring the rationality of the deformation field. Meanwhile, due to the recursive pyramid strategy, our network can effectively attain deformable registration without separate affine pre-alignment. We compare the RDP network with several existing registration methods on three public brain magnetic resonance imaging (MRI) datasets, including LPBA, Mindboggle and IXI. Experimental results demonstrate that our network consistently outperforms the state of the art on the metrics of Dice score, average symmetric surface distance, Hausdorff distance, and Jacobian. Even for data without affine pre-alignment, our network maintains satisfactory performance in compensating for large deformations. The code is publicly available at https://github.com/ZAX130/RDP.
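The coarse-to-fine recursion can be sketched as displacement-field composition: upsample and rescale the coarse flow, warp it by the finer residual flow, and add the residual. Nearest-neighbour upsampling and this particular composition order are generic choices for illustration, not necessarily RDP's exact operators.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def upsample_flow(flow, factor=2):
    """Nearest-neighbour upsampling of a (2, H, W) displacement field; the
    displacement magnitudes are rescaled with the resolution."""
    return factor * flow.repeat(factor, axis=1).repeat(factor, axis=2)

def compose(coarse_up, residual):
    """total(x) = residual(x) + coarse(x + residual(x))."""
    H, W = residual.shape[1:]
    grid = np.mgrid[0:H, 0:W].astype(float)
    coords = grid + residual                 # where the residual flow samples from
    warped = np.stack([map_coordinates(coarse_up[d], coords, order=1, mode='nearest')
                       for d in range(2)])
    return warped + residual

coarse = np.random.default_rng(0).normal(size=(2, 8, 8))
residual = np.zeros((2, 16, 16))
flow = compose(upsample_flow(coarse), residual)
```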

AU Bayasi, Nourhan Hamarneh, Ghassan Garbi, Rafeef

GC2: Generalizable Continual Classification of Medical Images.

Deep learning models have achieved remarkable success in medical image classification. These models are typically trained once on the available annotated images and thus lack the ability of continually learning new tasks (i.e., new classes or data distributions) due to the problem of catastrophic forgetting. Recently, there has been more interest in designing continual learning methods to learn different tasks presented sequentially over time while preserving previously acquired knowledge. However, these methods focus mainly on preventing catastrophic forgetting and are tested under a closed-world assumption; i.e., assuming the test data is drawn from the same distribution as the training data. In this work, we advance the state-of-the-art in continual learning by proposing GC2 for medical image classification, which learns a sequence of tasks while simultaneously enhancing its out-of-distribution robustness. To alleviate forgetting, GC2 employs a gradual culpability-based network pruning to identify an optimal subnetwork for each task. To improve generalization, GC2 incorporates adversarial image augmentation and knowledge distillation approaches for learning generalized and robust representations for each subnetwork. Our extensive experiments on multiple benchmarks in a task-agnostic inference demonstrate that GC2 significantly outperforms baselines and other continual learning methods in reducing forgetting and enhancing generalization. Our code is publicly available at the following link: https://github.com/nourhanb/TMI2024-GC2.

AU Zhong, Yutian Zhang, Shuangyang Liu, Zhenyang Zhang, Xiaoming Mo, Zongxin Zhang, Yizhe Hu, Haoyu Chen, Wufan Qi, Li

Unsupervised Fusion of Misaligned PAT and MRI Images via Mutually Reinforcing Cross-Modality Image Generation and Registration

Photoacoustic tomography (PAT) and magnetic resonance imaging (MRI) are two advanced imaging techniques widely used in pre-clinical research. PAT offers high optical contrast and a large imaging depth but poor soft tissue contrast, whereas MRI provides excellent soft tissue information but poor temporal resolution. Despite recent advances in medical image fusion with pre-aligned multimodal data, PAT-MRI image fusion remains challenging due to image misalignment and spatial distortion. To address these issues, we propose an unsupervised multi-stage deep learning framework called PAMRFuse for fusing misaligned PAT and MRI images. PAMRFuse comprises a multimodal-to-unimodal registration network that accurately aligns the input PAT-MRI image pairs and a self-attentive fusion network that selects information-rich features for fusion. We employ an end-to-end mutually reinforcing mode in our registration network, which enables joint optimization of cross-modality image generation and registration. To the best of our knowledge, this is the first attempt at information fusion for misaligned PAT and MRI images. Qualitative and quantitative experimental results show the excellent performance of our method in fusing PAT-MRI images of small animals captured with commercial imaging systems.
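
A minimal sketch of the mutually reinforcing mode described above, assuming a PyTorch setting: a generator translates the MRI image into a PAT-like image so that registration can rely on a unimodal similarity loss, and one backward pass updates both networks. All names are illustrative (warp stands for any differentiable spatial transformer, e.g. the helper sketched earlier); this is not the released PAMRFuse code.

import torch
import torch.nn.functional as F

def joint_step(generator, reg_net, warp, pat, mri, optimizer, lam=1.0):
    fake_pat = generator(mri)                   # cross-modality image generation
    flow = reg_net(torch.cat([fake_pat, pat], dim=1))
    warped = warp(fake_pat, flow)               # align the generated image to PAT
    sim_loss = F.l1_loss(warped, pat)           # unimodal similarity is now valid
    # Penalize non-smooth fields via finite differences of the 2-D flow
    smooth = (flow[:, :, 1:] - flow[:, :, :-1]).abs().mean() + \
             (flow[:, :, :, 1:] - flow[:, :, :, :-1]).abs().mean()
    loss = sim_loss + lam * smooth
    optimizer.zero_grad()
    loss.backward()                             # gradients reach both networks,
    optimizer.step()                            # so generation and registration
    return loss.item()                          # reinforce each other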

AU Liu, Bingxue Wang, Yongchao Fomin-Thunemann, Natalie Thunemann, Martin Kilic, Kivilcim Devor, Anna Cheng, Xiaojun Tan, Jiyong Jiang, John Boas, David A. Tang, Jianbo

Time-Lagged Functional Ultrasound for Multi-Parametric Cerebral Hemodynamic Imaging

We introduce an ultrasound speckle decorrelation-based time-lagged functional ultrasound technique (tl-fUS) for the quantification of the relative changes in cerebral blood flow speed (rCBF$_{\text{speed}}$), cerebral blood volume (rCBV) and cerebral blood flow (rCBF) during functional stimulation. Numerical simulations, phantom validations, and in vivo mouse brain experiments were performed to test the capability of tl-fUS to parse out and quantify the relative changes of these hemodynamic parameters. The blood volume change was found to be more prominent in arterioles than in venules, and the peak blood flow change was around 2.5 times the peak blood volume change during brain activation, agreeing with previous observations in the literature. tl-fUS demonstrates the ability to distinguish the relative changes of rCBF$_{\text{speed}}$, rCBV, and rCBF, which can inform specific physiological interpretations of fUS measurements.
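
A hedged sketch of the speckle-decorrelation measurement that a time-lagged analysis builds on: the correlation between frames separated by a given lag decays faster where scatterers (blood cells) move faster, so correlation maps at several lags carry speed information. Function and variable names are illustrative assumptions, not the paper's code.

import numpy as np

def lagged_correlation(frames, lag):
    # frames: (T, H, W) stack of beamformed ultrasound frames; lag >= 1.
    # Returns the per-pixel Pearson correlation between frames t and t + lag,
    # averaged over all valid t; low values indicate fast decorrelation.
    assert lag >= 1
    a = frames[:-lag].astype(np.float64)
    b = frames[lag:].astype(np.float64)
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    num = (a * b).mean(axis=0)
    den = a.std(axis=0) * b.std(axis=0) + 1e-12
    return num / den                            # (H, W) correlation map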

AU Tan, Yubo Shen, Wen-Da Wu, Ming-Yuan Liu, Gui-Na Zhao, Shi-Xuan Chen, Yang Yang, Kai-Fu Li, Yong-Jie

Retinal Layer Segmentation in OCT Images With Boundary Regression and Feature Polarization

The geometry of retinal layers is an important imaging feature for the diagnosis of some ophthalmic diseases. In recent years, retinal layer segmentation methods for optical coherence tomography (OCT) images have emerged in rapid succession, and substantial progress has been achieved. However, existing methods still face challenges from interference factors such as noise, blurring, fundus effusion, and tissue artifacts, which manifest primarily as intra-layer false positives and inter-layer boundary deviation. To solve these problems, we propose a method called Tightly combined Cross-Convolution and Transformer with Boundary regression and feature Polarization (TCCT-BP). The method uses a hybrid architecture of a CNN and a lightweight Transformer to improve the perception of retinal layers. In addition, a feature grouping and sampling method and a corresponding polarization loss function are designed to maximize the differentiation of the feature vectors of different retinal layers, and a boundary regression loss function is devised to constrain the retinal boundary distribution for a better fit to the ground truth. Extensive experiments on four benchmark datasets demonstrate that the proposed method achieves state-of-the-art performance on the problems of false positives and boundary distortion. The proposed method ranked first in the OCT Layer Segmentation task of the GOALS challenge held at MICCAI 2022. The source code is available at https://www.github.com/tyb311/TCCT.
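
A hedged sketch of a feature-polarization objective in the spirit described above, assuming a PyTorch setting: pool the feature vectors belonging to each retinal layer, then penalize cosine similarity between the mean vectors of different layers so that the classes separate. This is an illustrative reading, not the released TCCT-BP loss.

import torch
import torch.nn.functional as F

def polarization_loss(features, labels, num_classes):
    # features: (N, C, H, W) decoder features; labels: (N, H, W) layer indices.
    feats = features.permute(0, 2, 3, 1).reshape(-1, features.shape[1])
    labs = labels.reshape(-1)
    centroids = []
    for c in range(num_classes):
        sel = feats[labs == c]
        if sel.shape[0] == 0:
            continue                            # layer absent from this batch
        centroids.append(F.normalize(sel.mean(dim=0), dim=0))
    if len(centroids) < 2:
        return features.new_zeros(())           # nothing to polarize
    centroids = torch.stack(centroids)          # (K, C) unit vectors
    cos = centroids @ centroids.t()             # pairwise cosine similarities
    off_diag = cos - torch.eye(len(centroids), device=cos.device)
    return off_diag.clamp(min=0).mean()         # push different layers apart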

EF