Comprehensive Image Classification using Hybrid CNN-LSTM Model with Advanced Feature Extraction on COCO Dataset

Zahraa Haimeed Rasool 1, Maha Adham Abdel Amir 2

Department of Computer Science, College of Science, Mustansiriyah University, Baghdad, Iraq.

Keywords: Object Detection, Principal Component Analysis, Gray-Level Co-Occurrence Matrix, Histogram of Oriented Gradients, Convolutional Neural Networks, Long Short-Term Memory.


Abstract

Object detection, which involves locating and classifying objects in an image or video, is a fundamental computer vision challenge with applications in autonomous vehicles, surveillance, and image analytics. In this study, a proposed hybrid CNN-LSTM model is used to classify categories in the COCO dataset, which span common everyday objects, animals, vehicles, and more. The workflow first enhances the image data and then extracts pertinent features. RGB images are converted to grayscale to simplify processing, then histogram equalization is applied to enhance contrast and a median blur to reduce noise. Principal Component Analysis (PCA), the Gray-Level Co-Occurrence Matrix (GLCM), and the Histogram of Oriented Gradients (HOG) are used for feature extraction. The architecture combines Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks, with the CNNs capturing spatial patterns and the LSTM networks capturing sequential patterns. The resulting hybrid network classifies images from the preprocessed data and the extracted features. On the COCO dataset, the model achieved an accuracy of 0.9917, precision of 0.991738, recall of 0.991695, and F1 score of 0.999949, supported by consistent loss reduction and accuracy improvement over its training history, which indicates strong pattern recognition ability.
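
The preprocessing stage described in the abstract (grayscale conversion, histogram equalization, median blur) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the BT.601 luma weights, the 3 x 3 median window, and the edge-replication padding are all assumptions chosen for illustration, and an image library may have been used in the actual study.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an H x W x 3 uint8 RGB image to an H x W uint8 grayscale image."""
    # Assumed luma weights (ITU-R BT.601); the paper does not state which were used.
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb.astype(np.float64) @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

def equalize_histogram(gray):
    """Stretch contrast by remapping intensities through the cumulative histogram."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # CDF value at the darkest intensity present
    total = gray.size
    if total == cdf_min:               # constant image: nothing to equalize
        return gray.copy()
    lut = np.rint((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]                   # apply the lookup table per pixel

def median_blur(gray, ksize=3):
    """Median filter with an odd ksize x ksize window and replicated edges."""
    pad = ksize // 2
    padded = np.pad(gray, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (ksize, ksize))
    return np.median(windows, axis=(-2, -1)).astype(np.uint8)

def preprocess(rgb):
    """Hypothetical helper chaining the three steps in the order the abstract lists."""
    return median_blur(equalize_histogram(to_grayscale(rgb)))
```

Calling `preprocess` on an H x W x 3 `uint8` array returns an H x W `uint8` array that could then be passed to the PCA, GLCM, and HOG feature extractors.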