2013 6th International Congress on Image and Signal Processing, CISP 2013, Hangzhou, China, 16 - 18 December 2013, vol.1, pp.112-116
In this study, a global shape descriptor called Mixture of Poses (MoP) is proposed for the problem of human behavior understanding. First, a Shape Context Descriptor (SCD) is computed for each frame. The SCD is a low-level feature representing a single pose, that is, the shape in a single frame. PCA is used for dimensionality reduction when computing the SCDs. The SCDs collected from a video of a single action are clustered by a variant of the k-medoids algorithm. The Center Poses, which are the cluster medoids, are then used to initialize a mixture of Gaussians (MoG) that is trained with the expectation-maximization algorithm. An MoP is such a mixture of Gaussians, representing the distribution of the SCDs. The number of mixture components in each MoG is determined automatically by the system, since more clusters emerge for more complex actions that pass through a greater variety of poses. Experiments conducted on the Weizmann dataset yield encouraging results. © 2013 IEEE.
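The pipeline described above (PCA reduction, k-medoids clustering, and a Gaussian mixture initialized at the medoids and trained by EM) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and rests on several assumptions: random synthetic vectors stand in for the per-frame SCDs, a plain alternating k-medoids is used because the abstract does not say which variant was applied, the Gaussians have diagonal covariances, and the number of clusters `k` is fixed by hand rather than found automatically. The function names `pca_reduce`, `k_medoids`, and `fit_gmm` are hypothetical.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def k_medoids(X, k, n_iter=50, seed=0):
    """Plain alternating k-medoids (assumed variant); returns medoid row indices."""
    rng = np.random.default_rng(seed)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    medoids = rng.choice(len(X), size=k, replace=False)
    for _ in range(n_iter):
        labels = D[:, medoids].argmin(axis=1)  # assign each point to nearest medoid
        new = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size:
                # New medoid: the member with the smallest total distance to its cluster
                new[j] = members[D[np.ix_(members, members)].sum(axis=1).argmin()]
        if np.array_equal(new, medoids):
            break
        medoids = new
    return medoids

def fit_gmm(X, init_means, n_iter=100, reg=1e-6):
    """EM for a diagonal-covariance Gaussian mixture started at init_means."""
    k = len(init_means)
    means = np.array(init_means, dtype=float)
    variances = np.tile(X.var(axis=0) + reg, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities, computed in the log domain for stability
        diff = X[:, None, :] - means[None, :, :]
        log_p = -0.5 * (np.log(2 * np.pi * variances).sum(axis=1)
                        + (diff ** 2 / variances).sum(axis=2)) + np.log(weights)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and per-dimension variances
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        diff = X[:, None, :] - means[None, :, :]
        variances = (resp[:, :, None] * diff ** 2).sum(axis=0) / nk[:, None] + reg
    return weights, means, variances

# Synthetic stand-in for the SCDs of one action video: 3 pose groups in 20-D
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 20))
frames = np.vstack([c + rng.normal(size=(40, 20)) for c in centers])

reduced = pca_reduce(frames, n_components=5)        # PCA reduction
medoid_idx = k_medoids(reduced, k=3)                # center poses (medoids)
weights, means, variances = fit_gmm(reduced, reduced[medoid_idx])  # the mixture
```

Starting EM from medoids rather than from random points gives each Gaussian an actual observed pose as its initial mean, which is the role the abstract assigns to the Center Poses.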