Action recognition in video using a spatial-temporal graph-based feature representation
Jargalsaikhan, Iveel, Little, Suzanne (ORCID: 0000-0003-3281-3471), Trichet, Remi and O'Connor, Noel E. (ORCID: 0000-0002-4033-9135)
(2015)
Action recognition in video using a spatial-temporal graph-based feature representation.
In: 12th IEEE International Conference on Advanced Video and Signal based Surveillance (AVSS2015), 25-26 Aug 2015, Karlsruhe, Germany.
We propose a video-graph-based human action recognition framework. Given an input video sequence, we extract spatio-temporal local features and construct a video graph that incorporates appearance and motion constraints to reflect the spatio-temporal dependencies among the features. In particular, we extend the popular density-based clustering algorithm DBSCAN to form an intuitive video graph. During training, we estimate a linear SVM classifier using the standard Bag-of-Words method. During classification, we apply Graph-Cut optimization to find the most frequent action label in the constructed graph and assign this label to the test video sequence. The proposed approach achieves state-of-the-art performance on standard human action recognition benchmarks, namely the KTH and UCF-Sports datasets, and competitive results on the Hollywood (HOHA) dataset.
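To give a concrete sense of the density-based grouping step, the sketch below clusters synthetic spatio-temporal interest points with the standard DBSCAN algorithm from scikit-learn. This is a minimal illustration under assumed inputs: the (x, y, t) coordinates are randomly generated stand-ins for real detected features, and the paper's extension of DBSCAN into a video graph with appearance and motion constraints is not reproduced here.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical spatio-temporal interest points as (x, y, t) triples.
# Real features (e.g. STIP detections with descriptors) are richer;
# two synthetic dense groups stand in for two moving body parts.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=[10.0, 10.0, 5.0], scale=0.5, size=(30, 3))
group_b = rng.normal(loc=[40.0, 40.0, 20.0], scale=0.5, size=(30, 3))
points = np.vstack([group_a, group_b])

# Plain DBSCAN: points within eps of each other, in sufficiently dense
# neighbourhoods (min_samples), are merged into one cluster; the paper
# builds its video graph on top of this kind of density connectivity.
labels = DBSCAN(eps=2.0, min_samples=5).fit_predict(points)

print(sorted(set(labels)))  # two well-separated groups -> [0, 1]
```

Each cluster of space-time-adjacent features can then serve as a node neighbourhood when assembling the video graph.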