Unified 2D and 3D Hand Pose Estimation from a Single Visible or X-ray Image

Akila Pemasiri (Queensland University of Technology), Kien Nguyen Thanh (Queensland University of Technology), Sridha Sridharan (Queensland University of Technology), Clinton Fookes (Queensland University of Technology)

Robust detection of the keypoints of the human hand from a single 2D image is a crucial step in many applications, including medical image processing, where X-ray images play a vital role. In this paper, we address the challenging problem of 2D and 3D hand pose estimation from a single hand image, where the image can be either in the visible spectrum or an X-ray. In contrast to state-of-the-art methods for hand pose estimation on visible images, we do not incorporate depth images into the training model, making the approach applicable in situations where depth images are not available. Moreover, by training a unified model for both X-ray and visible images, where each modality captures complementary information, we improve the accuracy of the overall model. We present a cascaded network architecture that uses a template mesh to estimate deformations from the 2D image, with the estimate refined at successive cascade levels to increase accuracy.
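The cascaded template-deformation idea described in the abstract can be sketched as follows. This is only an illustrative NumPy skeleton, not the paper's implementation: the class name `CascadedMeshRegressor`, the linear stand-ins for the per-stage regressors, and all dimensions are hypothetical. Each cascade level predicts a residual deformation of a template hand mesh, conditioned on the image features and the current mesh estimate, so that later stages refine earlier ones:

```python
import numpy as np

class CascadedMeshRegressor:
    """Illustrative cascade: each stage predicts a residual deformation
    of a template hand mesh from image features plus the current estimate.
    (Hypothetical sketch; the actual model uses learned network heads.)"""

    def __init__(self, template_vertices, feat_dim, num_stages=3, seed=0):
        self.template = np.asarray(template_vertices, dtype=np.float64)  # (V, 3)
        v = self.template.shape[0]
        rng = np.random.default_rng(seed)
        # Each stage sees the image features concatenated with the
        # flattened current mesh, and outputs a per-vertex 3D offset.
        in_dim = feat_dim + v * 3
        self.stages = [rng.normal(scale=1e-3, size=(in_dim, v * 3))
                       for _ in range(num_stages)]

    def __call__(self, features):
        feats = np.asarray(features, dtype=np.float64).ravel()
        mesh = self.template.copy()
        for w in self.stages:
            x = np.concatenate([feats, mesh.ravel()])
            mesh = mesh + (x @ w).reshape(mesh.shape)  # residual refinement
        return mesh
```

In the paper's setting the random linear maps would be replaced by learned regression heads on CNN features, but the control flow, propagating and refining a deformation estimate through cascade levels, is the same.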


Paper (PDF)

@inproceedings{Pemasiri_Unified_BMVC,
  title={Unified 2D and 3D Hand Pose Estimation from a Single Visible or X-ray Image},
  author={Akila Pemasiri and Kien Nguyen Thanh and Sridha Sridharan and Clinton Fookes},
  booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
  publisher={BMVA Press},
  editor={Kirill Sidorov and Yulia Hicks}
}