Summary by Anmol Sharma 5 years ago
Skin cancer is one of the most common cancers in humans. The lesion is primarily diagnosed visually from a series of 2D color images of the affected area, which may be followed by dermoscopic analysis, a biopsy and histopathological examination. Automated classification of skin lesions from images is a challenging task owing to the fine-grained variability in the appearance of skin lesions.
To this end, Esteva et al. propose a deep learning based solution to automate the diagnosis of skin lesions into fine-grained categories. Specifically, they use the GoogleNet Inception v3 CNN architecture, an iteration on the GoogLeNet architecture that won the 2014 ImageNet Large Scale Visual Recognition Challenge. The method relies on transfer learning: a network pre-trained on a much larger dataset is fine-tuned on a related task, allowing it to reuse the convolutional filters it has already learned. Here, the Inception v3 CNN was first pre-trained on approximately 1.28 million images spanning about 1,000 classes from the 2014 ImageNet Large Scale Visual Recognition Challenge, and then fine-tuned on the dermatology dataset.
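As a rough illustration of this transfer-learning setup, the sketch below fine-tunes an ImageNet-pretrained Inception v3 with tf.keras. It is not the authors' code; the class count and the `train_ds`/`val_ds` dataset pipelines are hypothetical placeholders.

```python
# Minimal sketch (not the paper's code): fine-tuning an ImageNet-pretrained
# Inception v3 on a skin-lesion dataset using tf.keras.
import tensorflow as tf

NUM_CLASSES = 757  # placeholder for the number of fine-grained training classes

# Load Inception v3 with ImageNet weights, dropping the 1000-way classifier head.
base = tf.keras.applications.InceptionV3(include_top=False,
                                         weights="imagenet",
                                         pooling="avg",
                                         input_shape=(299, 299, 3))

# New classification head for the dermatology label space.
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(base.output)
model = tf.keras.Model(inputs=base.input, outputs=outputs)

# Fine-tune the whole network at a small learning rate.
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# `train_ds` / `val_ds` are assumed tf.data pipelines of (image, one-hot label):
# model.fit(train_ds, validation_data=val_ds, epochs=10)
```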
The dataset used in the study was assembled from open-access online repositories and the Stanford Medical Center. It consists of 127,463 training and validation images and a held-out set of 1,942 labelled test images. The labels are organized hierarchically in a tree-like structure, where each successive depth level represents a finer-grained classification of the disease.
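To make the hierarchical label space concrete, here is a toy sketch of a tree-structured taxonomy with a helper that maps a fine-grained leaf label back to a coarser ancestor. The node names and depths shown are illustrative, not the paper's actual taxonomy.

```python
# Illustrative fragment of a tree-structured disease taxonomy (hypothetical names).
TAXONOMY = {
    "skin disease": {
        "malignant": {
            "melanoma": {},
            "epidermal carcinoma": {},
        },
        "benign": {
            "melanocytic nevus": {},
            "seborrheic keratosis": {},
        },
        "non-neoplastic": {
            "dermatitis": {},
        },
    }
}

def path_to(label, tree=TAXONOMY, prefix=()):
    """Return the root-to-leaf path for `label`, or None if it is not in the tree."""
    for node, children in tree.items():
        here = prefix + (node,)
        if node == label:
            return here
        found = path_to(label, children, here)
        if found:
            return found
    return None

def ancestor_at_depth(label, depth):
    """Coarser-grained class of `label` at the given taxonomy depth (root = 0)."""
    path = path_to(label)
    return path[min(depth, len(path) - 1)] if path else None

print(ancestor_at_depth("melanoma", 1))  # -> "malignant"
```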
The network is evaluated on three tasks: i) classification over the first-level nodes of the taxonomy, which represent benign lesions, malignant lesions and non-neoplastic lesions; ii) classification over the nine-class disease partition formed by the second-level nodes, grouped so that the diseases within each class have similar medical treatment plans; and iii) on medically important use cases using only biopsy-proven images, distinguishing malignant from benign lesions of epidermal origin (keratinocyte carcinoma versus benign seborrheic keratosis) or melanocytic origin (malignant melanoma versus benign nevus), with the algorithm compared against dermatologists.

On the first task, the CNN achieves 72.1 $\pm$ 0.9\% (mean $\pm$ s.d.) overall accuracy (the average of individual inference class accuracies), while two dermatologists attain 65.56\% and 66.0\% accuracy on a subset of the validation set. On the second task, the CNN achieves 55.4 $\pm$ 1.7\% overall accuracy, whereas the same two dermatologists attain 53.3\% and 55.0\%. On the third task, the CNN outperforms the dermatologists, achieving an area under the curve (AUC) above 91\% for each case.
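For reference, below is a small sketch of how the two reported metrics could be computed: "overall accuracy" as the mean of per-class accuracies (as defined above), and ROC AUC for the binary malignant-versus-benign comparisons. The arrays and scores are toy placeholders, not the study's data.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def overall_accuracy(y_true, y_pred):
    """Average of per-class accuracies, matching the definition used above."""
    classes = np.unique(y_true)
    per_class = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.mean(per_class))

# Three-way task (benign / malignant / non-neoplastic), toy labels:
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
print(overall_accuracy(y_true, y_pred))  # ~0.67 for this toy example

# Binary task (e.g. melanoma vs. nevus): AUC from predicted malignancy scores.
y_bin = np.array([1, 1, 0, 0])
scores = np.array([0.9, 0.7, 0.4, 0.2])
print(roc_auc_score(y_bin, scores))  # 1.0 for this toy example
```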