Novel Machine Learning for Human Actions Classification Using Histogram of Oriented Gradients and Sparse Representation
Recognition of human actions is a trending research topic, as it supports crucial medical applications such as life care and healthcare. In this research, we propose a novel machine learning algorithm for the classification of human actions based on sparse representation theory. In the proposed framework, the input videos are first partitioned into temporal segments of a predefined length. From each temporal segment, key-cuboids are extracted at the locations with maximum variation in orientation, and Histogram of Oriented Gradients (HOG) features are computed from these key-cuboids. This descriptor captures the dynamic features of the action videos. During the training phase, these features are used to build a single shared dictionary from the videos of all classes using the K-Singular Value Decomposition (K-SVD) algorithm, so that the dictionary combines the features of every action class. During the testing phase, the features of a test video are classified by a novel Sparse Representation Modeling based Action Recognition (SRMAR) algorithm that applies Orthogonal Matching Pursuit (OMP) against the shared dictionary. The proposed framework was evaluated on popular benchmark action recognition datasets: the KTH, Olympic and Hollywood datasets. The results were represented as confusion matrices, from which the overall classification accuracy, specificity, precision, recall and F-score were computed. The system achieved high specificities of about 99.52%, 99.16% and 96.15% on the KTH, Olympic and Hollywood datasets, respectively.
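The abstract does not give the internals of the SRMAR algorithm, but the combination it names (a shared class-labeled dictionary, OMP sparse coding, and class assignment) is commonly realized as residual-based sparse representation classification. The sketch below is a minimal illustration under that assumption; the function names `omp` and `classify` and the per-atom label array `atom_labels` are hypothetical, not from the paper.

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal Matching Pursuit: greedily pick k atoms (columns of D)
    and least-squares fit their coefficients to approximate y."""
    residual = y.astype(float).copy()
    idx, x = [], np.zeros(D.shape[1])
    for _ in range(k):
        # Atom most correlated with the current residual.
        idx.append(int(np.argmax(np.abs(D.T @ residual))))
        coef, *_ = np.linalg.lstsq(D[:, idx], y, rcond=None)
        residual = y - D[:, idx] @ coef
    x[idx] = coef
    return x

def classify(D, atom_labels, y, k=5):
    """Assign y to the class whose dictionary atoms, used alone,
    reconstruct y with the smallest residual."""
    x = omp(D, y, k)
    best, best_err = None, np.inf
    for c in np.unique(atom_labels):
        xc = np.where(atom_labels == c, x, 0.0)  # keep only class-c coefficients
        err = np.linalg.norm(y - D @ xc)
        if err < best_err:
            best, best_err = c, err
    return best
```

Here each column of `D` would be a HOG feature vector (or K-SVD atom) associated with one training class, so the residual comparison directly exploits the shared dictionary that the framework builds.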
Similarly, the proposed framework attained precisions of 97.64%, 90.46% and 73.39%, and average recalls of 97.58%, 90.86% and 74.09%, on the KTH, Olympic and Hollywood datasets, respectively. The proposed machine learning algorithm also achieved outstanding results compared to the existing state-of-the-art human action recognition frameworks in the literature.
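All of the reported metrics follow from the confusion matrix by the standard one-vs-rest definitions. A short sketch of that computation (the multi-class averaging scheme used in the paper is not stated, so per-class values are returned):

```python
import numpy as np

def per_class_metrics(cm):
    """Per-class precision, recall, specificity and F-score from a
    confusion matrix cm, where cm[i, j] counts class-i samples
    predicted as class j."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)               # correct predictions per class
    fp = cm.sum(axis=0) - tp       # other classes predicted as this class
    fn = cm.sum(axis=1) - tp       # this class predicted as others
    tn = cm.sum() - tp - fp - fn   # everything else
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f_score = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum() # overall classification accuracy
    return accuracy, precision, recall, specificity, f_score
```

Averaging the per-class precision and recall over all action classes yields summary figures of the kind quoted above.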