SynthHands

A Dataset for Hand Pose Estimation from Depth and Color

F. Mueller, D. Mehta, O. Sotnychenko, S. Sridhar, D. Casas, C. Theobalt
Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor
International Conference on Computer Vision (ICCV) 2017, Venice, Italy.

This dataset accompanies the ICCV 2017 paper, Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor. SynthHands is a dataset for training and evaluating algorithms for hand pose estimation from depth and color data. The dataset contains data for male and female hands, both with and without interaction with objects. While the hand and foreground object are synthetically generated using Unity, the motion was obtained from real performances as described in the accompanying paper. In addition, real object textures and background images (depth and color) were used. Ground truth 3D positions are provided for 21 keypoints of the hand. The following variations are contained in the data:

  • Pose: 63,530 frames of real hand motion, sampled every 5th frame
  • Wrist+Arm Rotation: sampled from a 70 (wrist) / 180 (arm) degree range
  • Shape: x,y,z scale sampled uniformly in [0.8,1.2]; female and male mesh
  • Skin Color: 2 x 6 hand textures (female/male)
  • Camera Viewpoints: 5 egocentric viewpoints
  • Object Shapes: 7 objects
  • Object Textures: 145 textures
  • Background Clutter: 10,000 real images, uniform random u,v offset in [-100,100]

License

This dataset can only be used for scientific/non-commercial purposes. Please refer to the detailed license which is also enclosed in the download file. If you use this dataset, you are required to cite the following paper.

@inproceedings{OccludedHands_ICCV2017,
 author = {Mueller, Franziska and Mehta, Dushyant and Sotnychenko, Oleksandr and Sridhar, Srinath and Casas, Dan and Theobalt, Christian},
 title = {Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor},
 booktitle = {Proceedings of International Conference on Computer Vision ({ICCV})},
 url = {http://handtracker.mpi-inf.mpg.de/projects/OccludedHands/},
 numpages = {10},
 month = {October},
 year = {2017}
}

Downloads

  • Compressed Zip: Single file (zip, 48.5 GB), SHA-256:
    ba6eeaf6d251c983a6e7e0f20057b6b7034dbaf1e1f9e8a7ae1436b918d5a0b4
  • Browse: here
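
Since the archive is large, it may be worth checking the download against the SHA-256 value listed above before unpacking. The following is a minimal sketch using only the Python standard library; the function names and the file name `SynthHands.zip` are assumptions made for illustration, not names taken from the dataset.

```python
import hashlib

# SHA-256 value as listed in the Downloads section above.
EXPECTED_SHA256 = "ba6eeaf6d251c983a6e7e0f20057b6b7034dbaf1e1f9e8a7ae1436b918d5a0b4"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks
    so that a 48.5 GB archive never has to fit into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_SHA256):
    """Return True if the file at `path` matches the expected digest."""
    return sha256_of_file(path) == expected.lower()


# Example call (the file name is hypothetical):
#   verify_download("SynthHands.zip")
```

A mismatch most likely indicates an incomplete or corrupted download.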

Data

  • Depth: rendered using Unity (mimicking Intel RealSense SR300 @640x480 px), augmented with real background
  • Color on Depth: rendered using Unity (mimicking Intel RealSense SR300 @640x480 px), augmented with real background
  • Color: rendered using Unity (mimicking Intel RealSense SR300 @640x480 px), chroma-keyed background
  • Annotation: full 3D positions for 21 keypoints of the hand
  • Camera Calibration: for mapping the 3D positions onto the image plane and mapping between coordinate systems

FAQ

  • How can I get 2D joint positions from the provided 3D annotations?
    The camera calibration file provides three matrices: depth_intrinsics, color_intrinsics and color_extrinsics. The 3D positions are given in the depth camera space. To project the 3D point [x,y,z] onto the depth image plane, compute [u,v,w] = depth_intrinsics * [x,y,z,1], where [u/w, v/w] is the final 2D image location. To project onto the color image plane, first transform the point from depth camera space to color camera space: [x', y', z'] = color_extrinsics * [x,y,z,1], where color_extrinsics is a 3 x 4 matrix with the translation in the last column. Then project onto the image plane as before: [u',v',w'] = color_intrinsics * [x', y', z', 1], with [u'/w', v'/w'] being the final 2D location.
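
The projection steps in the answer above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' code: both intrinsics matrices are assumed to be 3 x 4 (as the multiplication with [x,y,z,1] suggests), the helper names are made up for this example, and all matrix values are placeholders. The real values must be read from the camera calibration file shipped with the dataset.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def project(intrinsics, point_3d):
    """Project a 3D camera-space point onto the image plane.

    `intrinsics` is assumed to be 3 x 4, so the point is extended to
    homogeneous coordinates [x, y, z, 1] before multiplying.
    """
    x, y, z = point_3d
    u, v, w = mat_vec(intrinsics, [x, y, z, 1.0])
    return (u / w, v / w)


def depth_to_color_space(color_extrinsics, point_3d):
    """Map a point from depth camera space to color camera space
    using the 3 x 4 extrinsics (translation in the last column)."""
    x, y, z = point_3d
    return tuple(mat_vec(color_extrinsics, [x, y, z, 1.0]))


# Placeholder matrices for illustration only (NOT the dataset's calibration).
DEPTH_INTRINSICS = [[475.0, 0.0, 320.0, 0.0],
                    [0.0, 475.0, 240.0, 0.0],
                    [0.0, 0.0, 1.0, 0.0]]
COLOR_INTRINSICS = [[600.0, 0.0, 320.0, 0.0],
                    [0.0, 600.0, 240.0, 0.0],
                    [0.0, 0.0, 1.0, 0.0]]
COLOR_EXTRINSICS = [[1.0, 0.0, 0.0, 25.0],
                    [0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0, 0.0]]

joint = (50.0, -20.0, 400.0)  # hypothetical keypoint in depth camera space

depth_uv = project(DEPTH_INTRINSICS, joint)
color_uv = project(COLOR_INTRINSICS,
                   depth_to_color_space(COLOR_EXTRINSICS, joint))

print(depth_uv)  # (379.375, 216.25)
print(color_uv)  # (432.5, 210.0)
```

Applying `project` to each of the 21 annotated keypoints yields the 2D joint positions for a frame. If the calibration file stores the intrinsics as 3 x 3 matrices instead, the appended 1 would simply be dropped before the multiplication.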

Contact

Franziska Mueller
frmueller@mpi-inf.mpg.de
