Depth-Based 3D Hand Pose Estimation
This task builds on the BigHand2.2M dataset and follows a format similar to the HANDS 2017 challenge. Hands appear in both third-person and egocentric viewpoints. No objects are present in this task.
- Training set: Contains images from 5 different subjects. Some hand articulations and viewpoints are strategically excluded.
- Test set: Contains images from 10 different subjects. 5 subjects overlap with the training set. Exhaustive coverage of viewpoints and articulations.
The following performance scores (each reported as mean joint error) will be evaluated:
- Interpolation (INTERP.): performance on test samples that have shape, viewpoints and articulations present in the training set.
- Extrapolation:
  - Total (EXTRAP.): performance on test samples that have hand shapes, viewpoints and articulations not present in the training set.
  - Shape (SHAPE): performance on test samples that have hand shapes not present in the training set. Viewpoints and articulations are present in the training set.
  - Articulation (ARTIC.): performance on test samples that have articulations not present in the training set. Shapes and viewpoints are present in the training set.
  - Viewpoint (VIEWP.): performance on test samples that have viewpoints not present in the training set. Shapes and articulations are present in the training set. Viewpoint is defined by the elevation and azimuth angles of the hand with respect to the camera; both angles are analyzed independently.
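The two quantities above can be sketched in code. The mean joint error follows directly from the metric's name (mean Euclidean distance over joints and samples); the viewpoint helper uses one common axis convention (camera z forward, y up), which is an assumption here, not the challenge's official definition.

```python
import numpy as np

def mean_joint_error(pred, gt):
    """Mean Euclidean joint error.

    pred, gt: arrays of shape (num_samples, num_joints, 3) holding
    3D joint locations in camera coordinates.
    """
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def viewpoint_angles(direction):
    """Azimuth and elevation (degrees) of a hand direction vector
    relative to the camera. Axis convention (z forward, y up) is an
    assumption for illustration only.
    """
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    azimuth = np.degrees(np.arctan2(d[0], d[2]))
    elevation = np.degrees(np.arcsin(d[1]))
    return azimuth, elevation
```

For example, predictions offset from ground truth by 1 unit along x for every joint yield a mean joint error of exactly 1, and a direction pointing straight down the camera axis gives zero azimuth and elevation.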
- Images are captured with an Intel RealSense SR300 camera at 640 × 480 pixel resolution.
- Use of training data from the HANDS 2017 challenge is not allowed, as some images may overlap with the test set.
- Use of other labelled datasets (either real or synthetic) is not allowed. Use of a fitted MANO model for synthesizing data is encouraged. Use of external unlabelled data is allowed (e.g. for self-supervised and unsupervised methods).
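Since the task operates on 640 × 480 depth maps, a typical preprocessing step is back-projecting depth pixels into camera-space 3D points with a pinhole model. A minimal sketch, assuming depth in millimetres and placeholder intrinsics (the actual SR300 calibration values are not given here):

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map to camera-space 3D points.

    depth: (H, W) array of depth values (e.g. mm); 0 = missing.
    fx, fy, cx, cy: pinhole intrinsics; the values depend on the
    specific camera calibration and are assumptions here.
    Returns an (H, W, 3) array of (x, y, z) points.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)
```

With the principal point at the image centre, the centre pixel maps to a point on the optical axis (x = y = 0, z = depth), which is a quick sanity check for the intrinsics.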