Real-time Joint Tracking of a Hand Manipulating an Object from RGB-D Input

European Conference on Computer Vision (ECCV) 2016, Amsterdam, The Netherlands

Download Video (MP4, 720p, 99 MB)


Real-time simultaneous tracking of hands manipulating and interacting with external objects has many potential applications in augmented reality, tangible computing, and wearable computing. However, due to difficult occlusions, fast motions, and uniform hand appearance, jointly tracking hand and object pose is more challenging than tracking either of the two separately. Many previous approaches resort to complex multi-camera setups to remedy the occlusion problem and often employ expensive segmentation and optimization steps, which make real-time tracking impossible. In this paper, we propose a real-time solution that uses a single commodity RGB-D camera. The core of our approach is a 3D articulated Gaussian mixture alignment strategy tailored to hand-object tracking that allows fast pose optimization. The alignment energy uses novel regularizers to address occlusions and hand-object contacts. For added robustness, we guide the optimization with discriminative part classification of the hand and segmentation of the object. We conduct extensive experiments on several existing datasets and introduce a new annotated hand-object dataset. Quantitative and qualitative results show the key advantages of our method: speed, accuracy, and robustness.
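To give a rough intuition for Gaussian mixture alignment in general, the sketch below scores how well a set of isotropic 3D "model" Gaussians overlaps a set of "data" Gaussians, using the closed-form integral of the product of two unnormalized isotropic Gaussians. It is an illustrative assumption and not the energy used in the paper: it omits articulation, the occlusion and contact regularizers, and the classification and segmentation guidance, and all names (`gaussian_overlap`, `alignment_energy`, `translate`) are hypothetical.

```python
import math


def gaussian_overlap(mu_a, sigma_a, mu_b, sigma_b):
    """Integral over R^3 of the product of two unnormalized isotropic Gaussians.

    Each Gaussian has the form exp(-||x - mu||^2 / (2 * sigma^2)).
    """
    var_sum = sigma_a ** 2 + sigma_b ** 2
    dist_sq = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    scale = (2.0 * math.pi * sigma_a ** 2 * sigma_b ** 2 / var_sum) ** 1.5
    return scale * math.exp(-dist_sq / (2.0 * var_sum))


def alignment_energy(model, data):
    """Sum of pairwise overlaps; larger values mean better alignment.

    `model` and `data` are lists of (mean, sigma) pairs, where `mean`
    is a 3-tuple and `sigma` is a positive standard deviation.
    """
    return sum(
        gaussian_overlap(m_mu, m_sigma, d_mu, d_sigma)
        for m_mu, m_sigma in model
        for d_mu, d_sigma in data
    )


def translate(mixture, offset):
    """Rigidly shift every Gaussian mean in `mixture` by `offset`."""
    return [
        (tuple(c + o for c, o in zip(mu, offset)), sigma)
        for mu, sigma in mixture
    ]


# Toy example: the data mixture is the model mixture shifted by a known offset.
model = [((0.0, 0.0, 0.0), 1.0), ((3.0, 0.0, 0.0), 0.5)]
true_offset = (0.5, -0.2, 0.1)
data = translate(model, true_offset)

energy_unaligned = alignment_energy(model, data)
energy_aligned = alignment_energy(translate(model, true_offset), data)
print(energy_aligned > energy_unaligned)
```

Because the overlap is a smooth, closed-form function of the Gaussian means, an energy of this kind can be differentiated analytically with respect to pose parameters. In this toy setup a simple translation stands in for the pose.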



BibTeX, 1 KB

 author = {Sridhar, Srinath and Mueller, Franziska and Zollhoefer, Michael and Casas, Dan and Oulasvirta, Antti and Theobalt, Christian},
 title = {Real-time Joint Tracking of a Hand Manipulating an Object from RGB-D Input},
 booktitle = {Proceedings of European Conference on Computer Vision ({ECCV})},
 url = {},
 numpages = {17},
 month = {October},
 year = {2016}

Related Pages

  • Fast and Robust Hand Tracking Using Detection-Guided Optimization, CVPR 2015 (webpage)
  • Investigating the Dexterity of Multi-Finger Input for Mid-Air Text Entry, CHI 2015 (webpage)
  • Real-time Hand Tracking Using a Sum of Anisotropic Gaussians Model, 3DV 2014 (webpage)
  • Interactive Markerless Articulated Hand Motion Tracking Using RGB and Depth Data, ICCV 2013 (webpage)
  • HandSonor: A Customizable Vision-based Control Interface for Musical Expression, Extended Abstracts, CHI 2013 (webpage)


  • Dexter+Object Dataset, ECCV 2016 (webpage)
  • Dexter 1, ICCV 2013 (webpage)


This research was funded by the ERC Starting Grant projects CapReal (335545) and COMPUTED (637991), and the Academy of Finland. We would like to thank Christian Richardt.


Srinath Sridhar

Page last updated 07-Oct-2016.