Deep Learning in Medical Image Analysis




INTRODUCTION


Over the past few decades, medical imaging techniques, such as computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET), mammography, ultrasound, and X-ray, have been used for the early detection, diagnosis, and treatment of diseases (1). In the clinic, medical image interpretation has been performed mostly by human experts such as radiologists and physicians. However, given the wide variation in pathology and the potential fatigue of human experts, researchers and doctors have begun to benefit from computer-assisted interventions. Although progress in computational medical image analysis has not been as rapid as progress in medical imaging technologies, the situation is improving with the introduction of machine learning techniques.
In applying machine learning, finding or learning informative features that describe well the regularities or patterns inherent in data plays a pivotal role in various tasks in medical image analysis. Conventionally, meaningful or task-related features were designed mostly by human experts on the basis of their knowledge about the target domains, making it challenging for nonexperts to exploit machine learning techniques for their own studies. In the meantime, there have been efforts to learn sparse representations based on predefined dictionaries, possibly learned from training samples. Sparse representation is motivated by the principle of parsimony in many areas of science; that is, the simplest explanation of a given observation should be preferred over more complicated ones. Sparsity-inducing penalization and dictionary learning have demonstrated the validity of this approach for feature representation and feature selection in medical image analysis (2–6). It should be noted that the sparse representation or dictionary learning methods described in the literature still find informative patterns or regularities inherent in data with a shallow architecture, thus limiting their representational power. However, deep learning (7) has overcome this obstacle by incorporating the feature engineering step into the learning step. That is, instead of extracting features manually, deep learning requires only a set of data with minor preprocessing, if necessary, and then discovers the informative representations in a self-taught manner (8, 9). Therefore, the burden of feature engineering has shifted from humans to computers, allowing nonexperts in machine learning to effectively use deep learning for their own research and/or applications, especially in medical image analysis.
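To make the sparsity principle above concrete, the following is a minimal sketch of sparse coding against a fixed dictionary, assuming NumPy; the dictionary size, signal dimension, penalty weight, and the use of iterative soft-thresholding (ISTA) are illustrative choices, not details taken from the cited studies (2–6).

```python
import numpy as np

# Sparse coding sketch: find a sparse code a that minimizes
#   ||x - D a||^2 + lam * ||a||_1
# for a fixed dictionary D, via iterative soft-thresholding (ISTA).
# All sizes below are illustrative.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))      # dictionary: 64-dim signals, 256 atoms
D /= np.linalg.norm(D, axis=0)          # normalize atoms to unit norm
x = rng.standard_normal(64)             # an observed signal (e.g., an image patch)

lam = 0.1
step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / Lipschitz constant of the gradient
a = np.zeros(256)
for _ in range(200):
    grad = D.T @ (D @ a - x)            # gradient of the quadratic data term
    z = a - step * grad                 # gradient step
    a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold

# Only a handful of coefficients survive: the "simplest explanation" of x.
print("nonzero coefficients:", np.count_nonzero(a))
```

The soft-thresholding step is what induces sparsity: coefficients whose magnitude falls below the penalty are driven exactly to zero, which also underlies the feature-selection use noted above.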
The unprecedented success of deep learning is due mostly to the following factors: (a) advances in high-performance central processing units (CPUs) and graphics processing units (GPUs), (b) the availability of a huge amount of data (i.e., big data), and (c) developments in learning algorithms (10–14). Technically, deep learning can be regarded as an improvement over conventional artificial neural networks (15) in that it enables the construction of networks with multiple (more than two) layers. Deep neural networks can discover hierarchical feature representations such that higher-level features can be derived from lower-level features (9). Because these techniques enable hierarchical feature representations to be learned solely from data, deep learning has achieved record-breaking performance in a variety of artificial intelligence applications (16–23) and grand challenges (24, 25; see https://grand-challenge.org). In particular, improvements in computer vision prompted the use of deep learning in medical image analysis, such as image segmentation (26, 27), image registration (28), image fusion (29), image annotation (30), computer-aided diagnosis (CADx) and prognosis (31–33), lesion/landmark detection (34–36), and microscopic image analysis (37, 38).
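The layer-by-layer composition just described can be sketched in a few lines. The sketch below assumes PyTorch, and the input dimension and layer widths are illustrative rather than taken from any cited work.

```python
import torch
import torch.nn as nn

# Illustrative only: each layer consumes the previous layer's output, so
# higher-level representations are functions of lower-level ones.
levels = nn.ModuleList([
    nn.Sequential(nn.Linear(784, 256), nn.ReLU()),  # low-level features
    nn.Sequential(nn.Linear(256, 64), nn.ReLU()),   # mid-level features
    nn.Sequential(nn.Linear(64, 16), nn.ReLU()),    # high-level features
])

h = torch.randn(1, 784)  # e.g., a flattened 28x28 image patch
for i, level in enumerate(levels, start=1):
    h = level(h)
    print(f"level {i} representation shape: {tuple(h.shape)}")
```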
Deep learning methods are highly effective when the number of available samples during the training stage is large. For example, in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), more than one million annotated images were available (24). However, in most medical applications there are far fewer images (i.e., <1,000). Therefore, a primary challenge in applying deep learning to medical images is the limited number of training samples available to build deep models without suffering from overfitting. To overcome this challenge, research groups have devised various strategies, such as (a) taking either two-dimensional (2D) or three-dimensional (3D) image patches, rather than the full-sized images, as input (29, 39–45) in order to reduce input dimensionality and thus the number of model parameters; (b) expanding the data set by artificially generating samples via affine transformation (i.e., data augmentation), and then training the network from scratch with the augmented data set (39–42); (c) using deep models trained on a huge number of natural images in computer vision as “off-the-shelf” feature extractors, and then training the final classifier or output layer with the target-task samples (43, 45); (d) initializing model parameters with those of pretrained models from nonmedical or natural images, then fine-tuning the network parameters with the task-related samples (46, 47); and (e) using models trained with small-sized inputs for arbitrarily sized inputs by transforming weights in the fully connected layers into convolutional kernels (36, 48).
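Strategies (b) and (d) in particular are straightforward to sketch. The fragment below assumes PyTorch and torchvision; the ResNet-18 backbone, the binary label set, and the random tensors standing in for medical image patches are hypothetical stand-ins, not the setups used in References 39–47.

```python
import torch
import torch.nn as nn
from torchvision import models, transforms

# Strategy (b): data augmentation via random affine transformations.
augment = transforms.RandomAffine(degrees=10, translate=(0.05, 0.05),
                                  scale=(0.9, 1.1))

# Strategy (d): initialize from a model pretrained on natural images
# (ImageNet), then fine-tune on the task-related samples.
model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 2)  # hypothetical binary task

# Optionally freeze the earliest block so only higher layers adapt.
for p in model.layer1.parameters():
    p.requires_grad = False

optimizer = torch.optim.SGD((p for p in model.parameters() if p.requires_grad),
                            lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# One fine-tuning step on a placeholder batch (grayscale patches would be
# replicated to three channels to match the pretrained input format).
images = augment(torch.randn(8, 3, 224, 224))
labels = torch.randint(0, 2, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```

Freezing lower layers keeps the generic low-level filters learned from natural images intact while allowing the task-specific higher layers to adapt to the smaller medical data set.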
In terms of input types, we can categorize deep models as typical multilayer neural networks, which take vector-format (i.e., nonstructured) values as input, and convolutional networks, which take 2D or 3D (i.e., structured) values as input. Because of the structural characteristics of images (the structural or configural information contained in neighboring pixels or voxels is another important source of information), convolutional neural networks (CNNs) have attracted great interest in the field of medical image analysis (26, 35–37, 48–50). However, networks with vectorized inputs have also been used successfully in different medical applications (28, 29, 31, 33, 51–54). Along with deep neural networks, deep generative models (55), such as deep belief networks (DBNs) and deep Boltzmann machines (DBMs), which are probabilistic graphical models with multiple layers of hidden variables, have been successfully applied to brain disease diagnosis (29, 33, 47, 56), lesion segmentation (36, 49, 57, 58), cell segmentation (37, 38, 59, 60), image parsing (61–63), and tissue classification (26, 35, 48, 50).
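The two input regimes can be contrasted with a short sketch, again assuming PyTorch; the 32 x 32 patch size and the layer widths are illustrative.

```python
import torch
import torch.nn as nn

# (1) A multilayer network on vectorized (nonstructured) input: a 32x32
#     patch is flattened into a 1,024-dimensional vector, discarding the
#     spatial arrangement of its pixels.
mlp = nn.Sequential(
    nn.Linear(32 * 32, 256), nn.ReLU(),
    nn.Linear(256, 64), nn.ReLU(),
    nn.Linear(64, 2),
)

# (2) A CNN on the same patch kept in its 2D (structured) form, so each
#     kernel sees a local neighborhood of pixels.
cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 2),
)

patch = torch.randn(1, 1, 32, 32)           # one single-channel 2D patch
out_vec = mlp(patch.flatten(start_dim=1))   # vectorized route
out_img = cnn(patch)                        # structured route
```

The convolutional route preserves the configural information of neighboring pixels mentioned above; the vectorized route discards it at the flattening step.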
This review is organized as follows. In Section 2, we explain the computational theories of neural networks and deep models [e.g., stacked auto-encoders (SAEs), DBNs, DBMs, CNNs] and discuss how they extract high-level representations from data. In Section 3, we introduce recent studies that apply deep learning to medical image analysis.

