
Tracking object poses in the context of robust body pose estimates

journal contribution
posted on 09.02.2016, 14:18 by John Darby, Baihua Li, Nicholas Costen
This work focuses on tracking objects being used by humans. These objects are often small, fast-moving and heavily occluded by the user. Attempting to recover their 3D position and orientation over time is a challenging research problem. To make progress, we appeal to the fact that these objects are often used in a consistent way. The body poses of different people using the same object tend to have similarities, and, when considered relative to those body poses, so do the respective object poses. Our intuition is that, in the context of recent advances in body-pose tracking from RGB-D data, robust object-pose tracking during human-object interactions should also be possible. We propose a combined generative and discriminative tracking framework able to follow gradual changes in object-pose over time but also able to re-initialise object-pose upon recognising distinctive body-poses. The framework is able to predict object-pose relative to a set of independent coordinate systems, each one centred upon a different part of the body. We conduct a quantitative investigation into which body parts serve as the best predictors of object-pose over the course of different interactions. We find that while object-translation should be predicted from nearby body parts, object-rotation can be more robustly predicted by using a much wider range of body parts. Our main contribution is to provide the first object-tracking system able to estimate 3D translation and orientation from RGB-D observations of human-object interactions. By tracking precise changes in object-pose, our method opens up the possibility of more detailed computational reasoning about human-object interactions and their outcomes. For example, assistive living systems could go beyond just recognising the actions and objects involved in everyday tasks such as sweeping or drinking, to reasoning that a person has missed sweeping under the chair or not drunk enough water today. © 2014 Elsevier B.V. All rights reserved.



  • Science


  • Computer Science

Published in

Computer Vision and Image Understanding




Pages

57 - 72


DARBY, J., LI, B. and COSTEN, N., 2014. Tracking object poses in the context of robust body pose estimates. Computer Vision and Image Understanding, 127, pp.57-72.


© Elsevier


AM (Accepted Manuscript)

Publisher statement

This work is made available according to the conditions of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) licence. Full details of this licence are available at:
