Abstract: |
The retroperitoneum presents a diverse array of pathologies, encompassing both rare benign tumors and malignant neoplasms, which can be either primary or metastatic. Diagnosing and treating these tumors is challenging due to their infrequency, late presentation, and close association with critical structures in the retroperitoneal space. Estimating the volume of retroperitoneal tumors is often difficult because of their large size and irregular shape. Because manual segmentation is time-consuming and tedious, automatic semantic segmentation of tumors is crucial for comprehensive medical image analysis and directly impacts accurate cancer diagnosis and treatment planning. U-Net and its variants, built on convolutional and Vision Transformer (ViT) designs, have delivered state-of-the-art results in 2D and 3D medical image segmentation tasks across diverse imaging modalities. ViTs excel at extracting global information, yet they face scalability issues due to high computational and memory costs, making them difficult to deploy in medical applications with hardware constraints. Recently, architectures such as the Mamba State Space Model (SSM) have been developed to address the quadratic computational demands of Transformers. Additionally, Extended Long Short-Term Memory (xLSTM) has emerged as a noteworthy successor to traditional LSTMs, offering a competitive edge in sequence modeling. Like SSMs, xLSTM excels at managing long-range dependencies while maintaining linear computational and memory complexity. This study evaluates U-Net-based modifications built on convolutional neural networks (CNNs), ViT, Mamba, and the new xLSTM components on a newly introduced in-house CT dataset as well as a publicly available organ segmentation dataset. Specifically, we introduce ViLU-Net, which incorporates Vi-blocks into the encoder-decoder framework to advance biomedical image segmentation. The results highlight the efficacy of xLSTM within the U-Net structure. The code is publicly available on GitHub. © 2025 SPIE.