Key body pose detection and movement assessment of fitness performances
thesisposted on 22.04.2015 by Pablo Fernandez de Dios
In order to distinguish essays and pre-prints from academic theses, we have a separate category. These are often much longer text based documents than a paper.
Motion segmentation plays an important role in human motion analysis. Understanding the intrinsic features of human activities represents a challenge for modern science. Current solutions usually involve computationally demanding processing and achieve the best results using expensive, intrusive motion capture devices. In this thesis, research has been carried out to develop a series of methods for affordable and effective human motion assessment in the context of stand-up physical exercises. The objective of the research was to tackle the needs for an autonomous system that could be deployed in nursing homes or elderly people's houses, as well as rehabilitation of high profile sport performers. Firstly, it has to be designed so that instructions on physical exercises, especially in the case of elderly people, can be delivered in an understandable way. Secondly, it has to deal with the problem that some individuals may find it difficult to keep up with the programme due to physical impediments. They may also be discouraged because the activities are not stimulating or the instructions are hard to follow. In this thesis, a series of methods for automatic assessment production, as a combination of worded feedback and motion visualisation, is presented. The methods comprise two major steps. First, a series of key body poses are identified upon a model built by a multi-class classifier from a set of frame-wise features extracted from the motion data. Second, motion alignment (or synchronisation) with a reference performance (the tutor) is established in order to produce a second assessment model. Numerical assessment, first, and textual feedback, after, are delivered to the user along with a 3D skeletal animation to enrich the assessment experience. This animation is produced after the demonstration of the expert is transformed to the current level of performance of the user, in order to help encourage them to engage with the programme. The key body pose identification stage follows a two-step approach: first, the principal components of the input motion data are calculated in order to reduce the dimensionality of the input. Then, candidates of key body poses are inferred using multi-class, supervised machine learning techniques from a set of training samples. Finally, cluster analysis is used to refine the result. Key body pose identification is guaranteed to be invariant to the repetitiveness and symmetry of the performance. Results show the effectiveness of the proposed approach by comparing it against Dynamic Time Warping and Hierarchical Aligned Cluster Analysis. The synchronisation sub-system takes advantage of the cyclic nature of the stretches that are part of the stand-up exercises subject to study in order to remove out-of-sequence identified key body poses (i.e., false positives). Two approaches are considered for performing cycle analysis: a sequential, trivial algorithm and a proposed Genetic Algorithm, with and without prior knowledge on cyclic sequence patterns. These two approaches are compared and the Genetic Algorithm with prior knowledge shows a lower rate of false positives, but also a higher false negative rate. The GAs are also evaluated with randomly generated periodic string sequences. The automatic assessment follows a similar approach to that of key body pose identification. A multi-class, multi-target machine learning classifier is trained with features extracted from previous motion alignment. The inferred numerical assessment levels (one per identified key body pose and involved body joint) are translated into human-understandable language via a highly-customisable, context-free grammar. Finally, visual feedback is produced in the form of a synchronised skeletal animation of both the user's performance and the tutor's. If the user's performance is well below a standard then an affine offset transformation of the skeletal motion data series to an in-between performance is performed, in order to prevent dis-encouragement from the user and still provide a reference for improvement. At the end of this thesis, a study of the limitations of the methods in real circumstances is explored. Issues like the gimbal lock in the angular motion data, lack of accuracy of the motion capture system and the escalation of the training set are discussed. Finally, some conclusions are drawn and future work is discussed.
- Computer Science