
ADD-UNet: A Convolutional Neural Network For Semantic Segmentation of Medical Images With Atrous Spatial Pyramid Pooling, Attention-Guided Dense Up-Sampling, And Dense Context Local Convolution

M. Prasad* and Vijaya J


Medical imaging has revolutionized medical practice in tasks such as cancer diagnosis, treatment planning, and monitoring. With advances in medical imaging technologies, the volume of high-quality medical data is growing rapidly. Machine learning, and deep learning in particular, has opened new avenues for understanding and exploiting medical imaging data and extracting clinical information. Convolutional Neural Networks (CNNs) provide several methods for medical image segmentation that learn the desired features and decision functions. U-Net is one of the top-performing convolutional neural networks for segmentation of medical images. This paper proposes ADD-UNet, a model based on atrous spatial pyramid pooling, attention-guided dense up-sampling, and dense context local convolution for the semantic segmentation of medical images. We redesign the existing U-Net architecture to conserve the network’s spatial information and apply dense local contextual information methods to finely recover localization information. Atrous spatial pyramid pooling replaces the ordinary convolution filters in U-Net to extract multi-scale information in the encoder-decoder portion of the network; by inserting spaces between the taps of the convolution filters, it enlarges the receptive field while preserving image detail. Attention-guided dense up-sampling precisely restores the spatial resolution, making it possible to build intricate feature representations for dense prediction. By learning the up-sampling procedure, the low-level details of the feature representations can be recovered at the input resolution for pixel-wise classification, which is useful for precise border localization. Finally, a dense local context convolution is applied to the output feature representation to extract multi-level features of the segmented image.
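To illustrate the idea behind the atrous filters used in ASPP, the following is a minimal 1-D sketch (not the authors' implementation) showing how inserting spaces between kernel taps enlarges the receptive field without adding parameters; an ASPP block then concatenates the outputs of several such filters with different dilation rates.

```python
import numpy as np

def dilated_conv1d(x, w, dilation=1):
    """Valid-mode 1-D correlation with dilated (atrous) kernel taps.

    Inserting `dilation - 1` gaps between the taps of `w` widens the
    effective receptive field from len(w) to (len(w)-1)*dilation + 1
    while keeping the same number of weights.
    """
    k = len(w)
    span = (k - 1) * dilation + 1          # effective receptive field
    out = np.empty(len(x) - span + 1)
    for i in range(len(out)):
        out[i] = sum(w[j] * x[i + j * dilation] for j in range(k))
    return out

x = np.arange(10, dtype=float)
w = np.array([1.0, 1.0, 1.0])              # 3 weights in both cases

y1 = dilated_conv1d(x, w, dilation=1)      # receptive field 3
y2 = dilated_conv1d(x, w, dilation=2)      # receptive field 5, same weights
print(y1)                                  # [ 3.  6.  9. 12. 15. 18. 21. 24.]
print(y2)                                  # [ 6.  9. 12. 15. 18. 21.]
```

In a 2-D ASPP module the same principle is applied with several dilation rates in parallel, and the resulting multi-scale feature maps are concatenated before the next layer.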
We trained and validated the proposed network on the TCIA-TCGA Brain MRI dataset and achieved 81.0%, 89.4%, 80.5%, and 89.1% in training IoU, training Dice score, testing IoU, and testing Dice score, respectively.

Keywords: ADD-UNet, Atrous Spatial Pyramid Pooling, Attention-guided Dense Up-sampling, Dense Context Local Convolution, Medical Image Segmentation.

Posted in Volume 4, Issue No. 3 (July-September 2022)