A Dataset for Evaluation of Joint Hand+Object Tracking

S. Sridhar, F. Mueller, M. Zollhöfer, D. Casas, A. Oulasvirta, C. Theobalt
Real-time Joint Tracking of a Hand Manipulating an Object from RGB-D Input
European Conference on Computer Vision (ECCV) 2016, Amsterdam, The Netherlands.

Dexter+Object is a dataset for evaluating algorithms for joint hand and object tracking. It consists of 6 sequences recorded with 2 actors (1 female) performing varying interactions with a simple object shape (a cuboid). Fingertip positions and cuboid corners were manually annotated for all sequences. This dataset accompanies the ECCV 2016 paper "Real-time Joint Tracking of a Hand Manipulating an Object from RGB-D Input".


If you use this dataset, you are required to cite the following paper.

 author = {Sridhar, Srinath and Mueller, Franziska and Zollhoefer, Michael and Casas, Dan and Oulasvirta, Antti and Theobalt, Christian},
 title = {Real-time Joint Tracking of a Hand Manipulating an Object from RGB-D Input},
 booktitle = {Proceedings of European Conference on Computer Vision ({ECCV})},
 url = {},
 numpages = {17},
 month = {October},
 year = {2016}


  • Compressed Zip: Single file (zip, 1.2 GB), SHA-256:
  • Browse: Link


  • RGB: Creative Senz3D color camera
  • Depth: Creative Senz3D close range TOF depth camera co-located with the color camera
  • Ground Truth: 3D fingertip positions and 3 object (cuboid) corners, manually annotated on the depth data
  • Depth Camera Intrinsics: Can be used to backproject depth image to create a 3D point cloud
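
The backprojection mentioned above can be sketched as follows. This is a minimal illustration that assumes a standard pinhole camera model without lens distortion; the function name backproject, the parameter names fx, fy, cx, cy, and the numeric values are hypothetical and are not taken from the dataset. The actual intrinsics should be read from the calibration directory, and the axis conventions and depth units should be checked against the dataset's own documentation.

```python
import numpy as np


def backproject(depth, fx, fy, cx, cy, valid=None):
    """Backproject an H x W depth map into an N x 3 point cloud.

    Assumes a pinhole model: pixel (u, v) with depth z maps to
    x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    """
    depth = np.asarray(depth, dtype=np.float64)
    height, width = depth.shape
    # u holds the column index and v the row index of every pixel.
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    if valid is None:
        # Assumption: non-positive depth carries no measurement.
        valid = depth > 0
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)


# Hypothetical intrinsics and a tiny synthetic depth map, for illustration only.
fx, fy, cx, cy = 2.0, 2.0, 0.5, 0.5
depth = np.full((2, 2), 4.0)
points = backproject(depth, fx, fy, cx, cy)
print(points.shape)  # (4, 3)
```

The resulting points are in the same unit as the input depth values. Background pixels should be excluded through the valid argument before backprojection so that they do not appear as spurious far-away points.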

Evaluation Metric

Please see the supplementary document for the definition of the error measure. We recommend using the same measure to facilitate direct comparison with other methods.

Sequence Details

The dataset contains the following six sequences. A preview video of each sequence is available in the preview directory.
  1. Grasp1: User grasping a small cuboid.
  2. Grasp2: User grasping a big cuboid.
  3. Pinch: User pinching a small cuboid.
  4. Rigid: User moving rigidly while holding a small cuboid.
  5. Rotate: User holding and rotating a small cuboid.
  6. Occlusion: A small cuboid occluding the user's hand.

Dataset Structure

The root directory (containing this file) consists of 4 sub-directories.
  1. data: All the data resides here.
      • color: Color images in BMP format.
      • depth: Depth map as 16-bit PNG. Background/invalid pixels have a value of 32001.
      • annotations: Contains manually annotated data for fingertip positions and 3 cuboid corners. See README.txt inside the directory for more details.
  2. preview: Preview videos of all the sequences in the dataset and a montage of results from our tracker.
  3. scripts: Bash scripts used to make previews.
  4. calibration: Intrinsic calibration matrix for depth map backprojection.
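
Since background/invalid depth pixels are stored with the value 32001, they usually need to be masked before any further processing. The sketch below shows one possible way to do this on a depth map that has already been loaded as a 16-bit array. The helper names valid_depth_mask and depth_to_float are hypothetical, and the synthetic array stands in for a real frame; a real depth PNG would have to be read with a loader that preserves 16-bit values (for example OpenCV's cv2.imread with the cv2.IMREAD_UNCHANGED flag), which is not shown here.

```python
import numpy as np

# Background/invalid marker as stated in the dataset description.
INVALID_DEPTH = 32001


def valid_depth_mask(depth_raw):
    """Return a boolean mask that is True where the depth value is usable."""
    return np.asarray(depth_raw) != INVALID_DEPTH


def depth_to_float(depth_raw):
    """Convert raw 16-bit depth to float32, with invalid pixels set to NaN."""
    depth = np.asarray(depth_raw).astype(np.float32)
    depth[~valid_depth_mask(depth_raw)] = np.nan
    return depth


# Synthetic 2 x 2 stand-in for a depth frame (values are illustrative only).
raw = np.array([[500, 32001],
                [32001, 620]], dtype=np.uint16)
mask = valid_depth_mask(raw)
depth = depth_to_float(raw)
print(int(mask.sum()), "valid pixels")
```

Using NaN for invalid pixels makes accidental use of background values easy to spot, and functions such as np.nanmean can then compute statistics over the valid pixels only.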


We thank Perttu Lähteenlahti for helping with data annotation.

Page last updated Sept-2016.