End-to-End 3D Hand Pose Estimation from Stereo Cameras

Yuncheng Li (Snap Inc.), Zehao Xue (Snap Inc.), Yingying Wang (Snap Inc.), Liuhao Ge (Nanyang Technological University), Zhou Ren (Wormpex AI Research), Jonathan Rodriguez (Snap Inc.)

Abstract
This work proposes an end-to-end approach to estimating full 3D hand pose from stereo cameras. Most existing methods for estimating hand pose from stereo cameras first apply stereo matching to obtain a depth map and then use a depth-based solution to estimate the hand pose. In contrast, we propose to bypass stereo matching and estimate the 3D hand pose directly from the stereo image pairs. The proposed neural network architecture extends any keypoint predictor to estimate the sparse disparity of the hand joints. To train the model effectively, we propose a large-scale synthetic dataset composed of stereo image pairs and ground-truth 3D hand pose annotations. Experiments show that the proposed approach outperforms existing methods based on stereo depth.
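
As a rough illustration of why a per-joint (sparse) disparity is enough to recover 3D joint positions, the sketch below applies the standard rectified-stereo relation Z = f * B / d to each 2D keypoint. It is not the paper's network or code. It assumes rectified pinhole cameras, and the function name `joints_from_disparity` and its parameters (`focal_px`, `baseline_m`, `cx`, `cy`) are hypothetical names chosen for this example.

```python
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def joints_from_disparity(
    keypoints: Sequence[Point2D],
    disparities: Sequence[float],
    focal_px: float,
    baseline_m: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Back-project 2D joints with per-joint disparity into 3D camera coordinates.

    Assumes a rectified stereo pair with a shared focal length (in pixels),
    a horizontal baseline (in metres), and principal point (cx, cy).
    These are illustrative assumptions, not details taken from the paper.
    """
    if len(keypoints) != len(disparities):
        raise ValueError("need exactly one disparity per keypoint")

    points: List[Point3D] = []
    for (u, v), d in zip(keypoints, disparities):
        if d <= 0:
            raise ValueError("disparity must be positive")
        z = focal_px * baseline_m / d   # depth from disparity
        x = (u - cx) * z / focal_px     # pinhole back-projection, horizontal
        y = (v - cy) * z / focal_px     # pinhole back-projection, vertical
        points.append((x, y, z))
    return points


# Hypothetical values: two joints seen by a camera with f = 500 px and a 10 cm baseline.
hand_joints_3d = joints_from_disparity(
    keypoints=[(320.0, 240.0), (420.0, 240.0)],
    disparities=[50.0, 25.0],
    focal_px=500.0,
    baseline_m=0.1,
    cx=320.0,
    cy=240.0,
)
```

Because only one disparity value per joint is needed, a dense depth map for the whole image is not required in this formulation.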

DOI
10.5244/C.33.38
https://dx.doi.org/10.5244/C.33.38

Files
Paper (PDF)

BibTeX
@inproceedings{BMVC2019,
title={End-to-End 3D Hand Pose Estimation from Stereo Cameras},
author={Yuncheng Li and Zehao Xue and Yingying Wang and Liuhao Ge and Zhou Ren and Jonathan Rodriguez},
year={2019},
month={September},
pages={38.1--38.13},
articleno={38},
numpages={13},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Kirill Sidorov and Yulia Hicks},
doi={10.5244/C.33.38},
url={https://dx.doi.org/10.5244/C.33.38}
}