
Deep Learning in Medical Image Classification and Object Detection: A Survey

Priyanka Gupta, Shikha Gupta


Deep learning methods have demonstrated superior performance in areas such as computer vision, speech recognition, natural language processing, and healthcare. Convolutional neural networks (CNNs) are a class of deep learning methods that can learn directly from raw data available as images, audio, or text. CNNs have become a powerful tool for a variety of pattern recognition tasks owing to the availability of abundant data and GPU-based training. A CNN is usually built from three kinds of layers: (1) convolutional layers, (2) pooling layers, and (3) fully connected layers. Convolutional layers use convolution filters to extract low-level features (such as edges and circles) and high-level features (such as objects and texture) from the input. Pooling layers are interleaved between the convolutional layers to reduce the dimensions of the input passed to subsequent layers. A fully connected layer takes the features extracted by the pooling or convolutional layers and maps them to the final output, such as the class scores in a classification task. In medical image analysis, deep learning methods are rapidly becoming state-of-the-art, achieving strong performance in many medical applications despite the scarcity of large medical datasets and the lack of annotated data. In the present work, we review the application of deep learning approaches in the domain of medical imaging. We highlight the impact of deep learning methods in two key areas, image classification and object detection, and give comprehensive summaries of findings in these areas. Future research directions and solutions are also explored.
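As a rough illustration of the three layer types described in the abstract, the following dependency-free Python sketch runs one forward pass through a convolution, a pooling step, and a fully connected layer. It is not taken from the survey: the 6x6 toy image, the hand-picked edge filter, the ReLU step, the two-class weights, and all function names are assumptions chosen for illustration. In a real CNN the filter and the weights would be learned during training.

```python
# Minimal sketch of a CNN forward pass: convolution -> pooling -> fully connected.
# All data, weights, and names here are hypothetical and for illustration only.

def conv2d(image, kernel):
    """'Valid' (no padding) 2-D cross-correlation of image with kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(fmap):
    """Element-wise rectified linear activation."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(fmap) // size
    out_w = len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def fully_connected(features, weights, biases):
    """Map a flat feature vector to one score per output class."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]

# Toy 6x6 "image": dark left half, bright right half (a vertical edge).
image = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(6)]

# Hand-picked vertical-edge filter (assumption; normally learned).
edge_kernel = [[-1.0, 0.0, 1.0] for _ in range(3)]

feature_map = relu(conv2d(image, edge_kernel))   # 4x4 map, responds at the edge
pooled = max_pool2d(feature_map, size=2)         # 2x2 map, reduced dimension
flat = [v for row in pooled for v in row]        # flatten for the dense layer

# Hypothetical weights for two output classes ("edge" vs. "no edge").
weights = [[0.5, 0.5, 0.5, 0.5], [-0.5, -0.5, -0.5, -0.5]]
biases = [0.0, 1.0]
logits = fully_connected(flat, weights, biases)

print(feature_map[0])  # [0.0, 3.0, 3.0, 0.0]
print(pooled)          # [[3.0, 3.0], [3.0, 3.0]]
print(logits)          # [6.0, -5.0]
```

The convolution responds only where the filter overlaps the edge, the pooling step halves each spatial dimension, and the fully connected layer turns the remaining four values into two class scores.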





