The applications of artificial intelligence (AI) to X-ray and CT scan image analysis, using convolutional neural network (CNN) architectures, generative adversarial networks (GANs), transfer learning, and data augmentation techniques, are discussed.
Currently, AI algorithms embedded in mobile X-ray and CT scan devices for automated diagnosis, measurement, case prioritization, and quality control are among the most popular research areas. More than 60,000 research articles have been published on the use of deep learning in healthcare and related applications. Established architectures, such as ResNet-50 or DenseNet-161 (with 50 and 161 representing the number of layers within the respective neural network), are easy to use. Integrating the AI modules with the drug systems and with the experts is the key issue in implementing AI systems in healthcare.
“The true AI systems are more than pattern matching or data mining. They make constant abstraction of the world they have encountered so far, enabling them to anticipate what is to come and to serve humanity in a better way.” - Amit Ray, Famous Scientist, Pioneer of Compassionate AI Movement.
Dr. Amit Ray, in his Compassionate AI Lab, used popular deep convolutional neural networks (CNNs) for AI-based automated mobile diagnosis and measurement. The lab used 21 different CNNs from five architecture families (ResNet, DenseNet, VGG, SqueezeNet, and AlexNet) for automated diagnosis on mobile X-ray devices.
The key architectures for radiology image analysis are AlexNet, Inception V3, Xception, ShuffleNet, Alexnet-152, ResNet-18, ResNet-34, ResNet-50, ResNet-101, ResNet-152, ResNet-179, SqueezeNet 1.0, VGG-16, VGG-19, DenseNet-40, DenseNet-50, DenseNet-121, DenseNet-161, DenseNet-169, DenseNet-201, and DenseNet-219. Transfer learning techniques are often used for CT scan image analysis.
The five most common CNNs are ResNet18, ResNet50, VGG, SqueezeNet, and DenseNet-16. Transfer learning techniques are also the backbone of modern healthcare research. Such models detect abnormal chest X-rays, then identify and localize 32 common abnormalities. They can also screen for tuberculosis and pneumonia, and can be used in public health screening programs.
Chest imaging, such as X-ray or computed tomography (CT), is a routine technique for diagnosing pneumonia. It can be easily performed and provides a quick, highly sensitive diagnosis.
CT scan and X-ray Medical Image Datasets
X-ray images are 2D, while CT scan images are 3D. A computed tomography (CT) scan is usually a series of X-rays taken from different angles and then assembled into a three-dimensional image. While an X-ray may show edges of soft tissues all stacked on top of each other, the computer used for a CT scan can figure out how those edges relate to each other in space, so the CT image is more useful for understanding blood vessels and soft tissue. A related modality, positron emission tomography (PET), is often combined with CT and relies on positron-emitting tracers; positrons are the antimatter counterparts of electrons. The National Institutes of Health’s Clinical Center has made a large-scale dataset of CT images publicly available to help the scientific community improve the accuracy of lesion detection.
Open-access medical image repositories hold huge collections of X-ray and CT scan images. Many commercial AI products are built on proprietary datasets or on specific hospital datasets that are not available because of concerns over patient privacy. There are, however, several datasets of radiological images and/or reports that are publicly available for artificial intelligence applications. The CoronaHack-Chest X-Ray-Dataset is also very popular among researchers.
Medical Image Data Augmentation Techniques
For reliable predictions, deep learning models often require a lot of training data, which is not always available. The existing data is therefore augmented to produce a model that generalizes better.
Data augmentation encompasses a suite of techniques that enhance the size and quality of training datasets so that better deep learning models can be built from them. The medical image augmentation algorithms in use include geometric transformations, horizontal flips, random crops, principal component analysis (PCA), mixing images, random erasing, feature space augmentation, adversarial training, color space augmentations, kernel filters, generative adversarial networks, neural style transfer, and meta-learning. These algorithms exploit various transformations of the original data, including affine image transformations, elastic transformations, pixel-level transformations, and other approaches such as generative adversarial networks (GANs).
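Three of the simpler techniques listed above (horizontal flips, random crops, and random erasing) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names, the 64 x 64 placeholder image, and the crop and patch sizes are assumptions made for the example, not settings taken from any particular study.

```python
import numpy as np


def horizontal_flip(image):
    """Mirror a 2-D grayscale image left to right."""
    return image[:, ::-1]


def random_crop(image, crop_height, crop_width, rng):
    """Cut a random crop_height x crop_width window out of the image."""
    height, width = image.shape
    top = rng.integers(0, height - crop_height + 1)
    left = rng.integers(0, width - crop_width + 1)
    return image[top:top + crop_height, left:left + crop_width]


def random_erase(image, patch_size, rng, fill_value=0.0):
    """Overwrite one random square patch with a constant value."""
    erased = image.copy()
    height, width = image.shape
    top = rng.integers(0, height - patch_size + 1)
    left = rng.integers(0, width - patch_size + 1)
    erased[top:top + patch_size, left:left + patch_size] = fill_value
    return erased


def augment(image, rng, crop_size=56, patch_size=8):
    """Apply a random flip, a random crop, and random erasing in sequence."""
    if rng.random() < 0.5:
        image = horizontal_flip(image)
    image = random_crop(image, crop_size, crop_size, rng)
    return random_erase(image, patch_size, rng)


rng = np.random.default_rng(seed=0)
scan = rng.random((64, 64))      # hypothetical stand-in for a normalized X-ray
augmented = augment(scan, rng)   # a new 56 x 56 training example
```

Calling `augment` repeatedly on the same image yields different training examples. Whether a given transformation is appropriate depends on the task: a horizontal flip, for instance, swaps left and right anatomy, which may or may not be acceptable for a particular diagnosis.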
Generative Adversarial Networks
Generative adversarial networks (GANs), originally introduced by Goodfellow et al., are being exploited to augment medical datasets. The main objective of a GAN is for a generator to produce new data examples that the discriminator cannot distinguish from real data. The generator competes with the discriminator, and the overall optimization mimics a minimax game. GAN architectures often use a coarse-to-fine generator whose objective is to capture the manifold of the training data and generate augmented examples.
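The minimax game can be made concrete through the two losses involved. In the original formulation the discriminator D maximizes E[log D(x)] + E[log(1 - D(G(z)))] while the generator G minimizes the same quantity. The sketch below evaluates both sides from discriminator outputs only; the function names and the example probabilities are assumptions for illustration, and no actual networks are trained.

```python
import numpy as np


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Loss the discriminator minimizes (the negated value function).

    d_real: discriminator outputs D(x) on real images, in (0, 1).
    d_fake: discriminator outputs D(G(z)) on generated images, in (0, 1).
    """
    return -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))


def generator_loss(d_fake, eps=1e-12):
    """Original minimax generator loss: the mean of log(1 - D(G(z)))."""
    return np.mean(np.log(1.0 - d_fake + eps))


# Hypothetical case 1: the discriminator separates real from generated images.
confident_real = np.array([0.9, 0.8])
confident_fake = np.array([0.1, 0.2])

# Hypothetical case 2: the discriminator is fooled and outputs 0.5 everywhere.
fooled_real = np.array([0.5, 0.5])
fooled_fake = np.array([0.5, 0.5])

loss_d_confident = discriminator_loss(confident_real, confident_fake)
loss_d_fooled = discriminator_loss(fooled_real, fooled_fake)
loss_g_confident = generator_loss(confident_fake)
loss_g_fooled = generator_loss(fooled_fake)
```

The discriminator's loss is lower when it separates the two sets well, and the generator's loss is lower when the discriminator is fooled, which is the competition described above. In practice, many implementations replace the generator loss with a non-saturating variant.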
Deep Convolutional Neural Networks (CNN)
In a traditional neural network, neurons are fully connected between different layers. Layers that sit between the input layer and the output layer are called hidden layers. Each hidden layer is made up of a number of neurons, where each neuron is fully connected to all neurons in the preceding layer. The problem with a fully connected neural network is that its densely connected architecture does not scale well to large images. For large images, the preferred approach is to use a convolutional neural network. A convolutional neural network is a deep neural network architecture designed to process data that has a known, grid-like topology. It typically comprises repeating sets of four sequential steps:
1. Convolution layer: The input (image) is convolved with numerous kernels, and each kernel produces a distinct feature map.
2. Pooling layer: Each feature map is downsized to a smaller matrix by pooling the values in adjacent pixels.
3. Non-linear activation unit: The activation of each neuron is computed by applying a non-linear function to the weighted sum of its inputs plus an additional bias term. This is what gives the neural network the ability to approximate almost any function.
4. Rectified Linear Unit (ReLU): The most popular activation unit is the rectified linear unit (ReLU). The convolution and pooling processes leave some pixels in the matrix with negative values. The rectified linear unit sets all negative values to zero.
These four steps are then repeated many times, with each convolution layer acting upon the pooled and rectified feature maps from the preceding layer. Convolutional and activation function layers are usually stacked together, followed by an optional pooling layer. A fully connected layer makes up the last layer of the network, and its output gives the class scores of the input image. In addition to the main layers mentioned above, a CNN may include optional layers such as a batch normalization layer to improve the training time and a dropout layer to address overfitting.
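One pass through the convolution, activation, and pooling steps can be sketched with plain NumPy. This is a minimal illustration for a single-channel image and one hand-picked kernel; the function names, the toy 6 x 6 image, and the vertical-edge kernel are assumptions for the example, and real networks learn many kernels and rely on optimized library routines.

```python
import numpy as np


def convolve2d(image, kernel):
    """Slide the kernel over the image with no padding and stride 1.

    This is cross-correlation, the operation deep learning libraries
    implement under the name convolution.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            feature_map[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return feature_map


def relu(feature_map):
    """Set every negative value to zero."""
    return np.maximum(feature_map, 0.0)


def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))


# Toy image: dark left half, bright right half (a vertical edge).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

feature_map = convolve2d(image, kernel)   # shape (4, 4)
activated = relu(feature_map)             # negative responses removed
pooled = max_pool(activated)              # shape (2, 2)
```

The sketch applies the activation before pooling, following the stacking order described in the paragraph above. With max pooling and ReLU the two orders give the same result.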
Results and Discussions
For training and testing the AI-based models, the original image dataset is divided into an 80% training set and a 20% external validation set.
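An 80/20 split of this kind can be sketched as a shuffled index split. The helper name, the dataset size of 1,000 images, and the fixed seed are assumptions for the example.

```python
import numpy as np


def train_validation_split(num_images, validation_fraction=0.2, seed=0):
    """Return shuffled (train_indices, validation_indices) for a dataset."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(num_images)
    num_validation = int(round(num_images * validation_fraction))
    return shuffled[num_validation:], shuffled[:num_validation]


# Hypothetical dataset of 1,000 images.
train_idx, val_idx = train_validation_split(num_images=1000)
```

When a patient contributes several images, it is generally safer to split by patient rather than by image, so that images from the same patient do not appear in both sets.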
The popular metrics AUC (AUROC) and average precision are used to understand how a classifier performs on balanced and on imbalanced data. Predictions on the validation dataset from the models of each network architecture were pooled so that the models could be evaluated as a consortium. For each individual prediction as well as the pooled predictions, receiver operating characteristic (ROC) curves and precision-recall curves (PRC) were plotted and the areas under each curve were calculated (AUROC and AUPRC). The area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC) were chosen because they enable a comparison of different models, independent of a chosen classification threshold.
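Both metrics can be computed directly from labels and scores, which also shows why no classification threshold is needed. The sketch below uses the rank interpretation of AUROC and the usual average precision summary of the precision-recall curve. The labels, the scores, and the choice to pool models by averaging their scores are assumptions made for the example; the text does not specify how predictions were pooled.

```python
import numpy as np


def auroc(labels, scores):
    """Probability that a random positive is scored above a random negative.

    Tied scores count as half a win.
    """
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    positives = scores[labels == 1]
    negatives = scores[labels == 0]
    diff = positives[:, None] - negatives[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size


def average_precision(labels, scores):
    """Mean of the precision at the rank of each positive.

    Assumes there are no tied scores.
    """
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores)              # highest score first
    sorted_labels = labels[order]
    hits = np.cumsum(sorted_labels)          # positives found so far
    precision_at_k = hits / np.arange(1, len(labels) + 1)
    return np.sum(precision_at_k * sorted_labels) / np.sum(sorted_labels)


# Hypothetical validation labels (1 = abnormal) and scores from two models.
labels = np.array([0, 0, 1, 1])
model_scores = np.array([[0.1, 0.4, 0.35, 0.8],
                         [0.2, 0.3, 0.60, 0.7]])

single_auroc = auroc(labels, model_scores[0])
single_ap = average_precision(labels, model_scores[0])

# Assumption: pool the models by averaging their scores per image.
pooled_scores = model_scores.mean(axis=0)
pooled_auroc = auroc(labels, pooled_scores)
```

On imbalanced data, average precision is usually the more informative of the two, because it is not inflated by a large number of true negatives.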
The applications of deep learning algorithms to X-ray and CT scan image analysis, using convolutional neural network architectures, generative adversarial networks, transfer learning, and data augmentation techniques, have been explained.
The sheer quantity of data that deep learning models need in order to learn is one of the approach's more serious weaknesses. The more complex the model, the more data it tends to need. Successful applications need to combine methods from many areas of AI.