RGB2Hands: Real-Time Tracking of 3D Hand Interactions from Monocular RGB Video
Abstract
Tracking and reconstructing the 3D pose and geometry of two hands in interaction is a challenging problem that is highly relevant to several human-computer interaction applications, including AR/VR, robotics, and sign language recognition. Existing works are either limited to simpler tracking settings (e.g., considering only a single hand or two spatially separated hands), or rely on less ubiquitous sensors, such as depth cameras. In contrast, in this work we present the first real-time method for motion capture of skeletal pose and 3D surface geometry of hands from a single RGB camera that explicitly considers close interactions. To address the inherent depth ambiguities in RGB data, we propose a novel multi-task CNN that regresses multiple complementary pieces of information, including segmentation, dense matchings to a 3D hand model, and 2D keypoint positions, together with newly proposed intra-hand relative depth and inter-hand distance maps. These predictions are subsequently used in a generative model fitting framework to estimate pose and shape parameters of a 3D hand model for both hands. We experimentally verify the individual components of our RGB two-hand tracking and 3D reconstruction pipeline through an extensive ablation study. Moreover, we demonstrate that our approach offers previously unseen two-hand tracking performance from RGB, and quantitatively and qualitatively outperforms existing RGB-based methods that were not explicitly designed for two-hand interactions. Furthermore, our method even performs on par with depth-based real-time methods.
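To illustrate the multi-task prediction stage described above, the following is a minimal sketch of a CNN head that regresses the complementary per-pixel quantities mentioned in the abstract (segmentation, dense correspondences to a hand model, 2D keypoint heatmaps, intra-hand relative depth, and inter-hand distance maps) from shared image features. All layer sizes, channel counts, and names are assumptions for illustration and do not reflect the authors' actual network.

# Illustrative sketch only: hypothetical channel counts and branch names,
# not the architecture used in the paper.
import torch
import torch.nn as nn

class TwoHandMultiTaskHead(nn.Module):
    def __init__(self, in_channels: int = 256, num_keypoints: int = 21):
        super().__init__()
        # One lightweight 1x1 conv branch per output modality, all sharing
        # the same backbone features.
        self.segmentation = nn.Conv2d(in_channels, 3, 1)               # left / right / background
        self.correspondence = nn.Conv2d(in_channels, 3, 1)             # dense matching to the 3D hand model
        self.keypoints = nn.Conv2d(in_channels, 2 * num_keypoints, 1)  # 2D keypoint heatmaps for both hands
        self.intra_depth = nn.Conv2d(in_channels, 2, 1)                # intra-hand relative depth per hand
        self.inter_distance = nn.Conv2d(in_channels, 1, 1)             # inter-hand distance map

    def forward(self, features: torch.Tensor) -> dict:
        return {
            "segmentation": self.segmentation(features),
            "correspondence": self.correspondence(features),
            "keypoint_heatmaps": self.keypoints(features),
            "intra_hand_depth": self.intra_depth(features),
            "inter_hand_distance": self.inter_distance(features),
        }

if __name__ == "__main__":
    # Fake backbone features: batch of 1, 256 channels, 32x32 spatial grid.
    feats = torch.randn(1, 256, 32, 32)
    outputs = TwoHandMultiTaskHead()(feats)
    for name, tensor in outputs.items():
        print(name, tuple(tensor.shape))

In the paper's pipeline, such dense predictions would then serve as data terms in the generative model-fitting stage that optimizes the pose and shape parameters of both hands.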
Downloads
Citation
@article{wang_SIGAsia2020,
  title     = {{RGB2Hands: Real-Time Tracking of 3D Hand Interactions from Monocular RGB Video}},
  author    = {Wang, Jiayi and Mueller, Franziska and Bernard, Florian and Sorli, Suzanne and Sotnychenko, Oleksandr and Qian, Neng and Otaduy, Miguel A. and Casas, Dan and Theobalt, Christian},
  journal   = {ACM Transactions on Graphics (TOG)},
  volume    = {39},
  number    = {6},
  article   = {218},
  month     = {12},
  year      = {2020},
  publisher = {ACM}
}
Acknowledgments
This work was supported by the ERC Consolidator Grants 4DRepLy (770784) and TouchDesign (772738), and by the Spanish Ministry of Science (RTI2018-098694-B-I00 VizLearning).