Single Image 3D Hand Reconstruction with Mesh Convolutions

Dominik Kulon (Imperial College London), Haoyang Wang (Imperial College London), Alp Guler (Ariel AI, Imperial College London), Michael Bronstein (Imperial College London), Stefanos Zafeiriou (Imperial College London)

Abstract
Monocular 3D reconstruction of deformable objects, such as human body parts, has typically been approached by predicting the parameters of heavyweight linear models. In this paper, we demonstrate an alternative solution based on encoding images into a latent non-linear representation of meshes. The prior on 3D hand shapes is learned by training an autoencoder with intrinsic graph convolutions performed in the spectral domain. The pre-trained decoder acts as a non-linear statistical deformable model. The latent parameters that reconstruct the shape and articulated pose of the hand in the image are predicted by an image encoder. We show that our system reconstructs plausible meshes and operates in real time. We evaluate the quality of the mesh reconstructions produced by the decoder on a new dataset and show latent space interpolation results. Our code, data, and models will be made publicly available.
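
To make "graph convolutions performed in the spectral domain" concrete, the following is a minimal NumPy sketch of one common formulation, a Chebyshev polynomial filter applied to per-vertex features. The abstract does not say which spectral formulation the authors use, so the choice of Chebyshev filters, the function names `scaled_laplacian` and `cheb_conv`, and the toy four-vertex graph are all assumptions made for illustration, not the paper's implementation.

```python
import numpy as np

def scaled_laplacian(adj):
    """Symmetric normalized Laplacian, rescaled so its eigenvalues lie in [-1, 1]."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    lam_max = np.linalg.eigvalsh(lap).max()
    return 2.0 * lap / lam_max - np.eye(len(adj))

def cheb_conv(x, lap_scaled, weights):
    """Chebyshev spectral filter (illustrative, hypothetical helper).

    x:        (V, F_in) features per vertex, e.g. 3D coordinates.
    weights:  (K, F_in, F_out), one weight matrix per polynomial order.
    Computes sum_k T_k(L) x W_k using the recurrence
    T_k(L) x = 2 L T_{k-1}(L) x - T_{k-2}(L) x, so no eigendecomposition is needed.
    """
    t_prev, t_curr = x, lap_scaled @ x
    out = t_prev @ weights[0]
    if len(weights) > 1:
        out = out + t_curr @ weights[1]
    for k in range(2, len(weights)):
        t_prev, t_curr = t_curr, 2.0 * lap_scaled @ t_curr - t_prev
        out = out + t_curr @ weights[k]
    return out

# Toy example: a 4-vertex cycle standing in for a mesh graph (assumption).
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
vertices = rng.normal(size=(4, 3))          # per-vertex xyz coordinates
weights = rng.normal(size=(3, 3, 8)) * 0.1  # K=3 orders, 3 -> 8 features
features = cheb_conv(vertices, scaled_laplacian(adj), weights)
print(features.shape)
```

The recurrence evaluates the same filter that would be obtained by transforming the signal into the Laplacian eigenbasis, scaling each frequency by a polynomial of its eigenvalue, and transforming back, while only ever multiplying by the Laplacian itself.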

DOI
10.5244/C.33.131
https://dx.doi.org/10.5244/C.33.131

Files
Paper (PDF)

BibTeX
@inproceedings{BMVC2019,
title={Single Image 3D Hand Reconstruction with Mesh Convolutions},
author={Dominik Kulon and Haoyang Wang and Alp Guler and Michael Bronstein and Stefanos Zafeiriou},
year={2019},
month={September},
pages={131.1--131.14},
articleno={131},
numpages={14},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Kirill Sidorov and Yulia Hicks},
doi={10.5244/C.33.131},
url={https://dx.doi.org/10.5244/C.33.131}
}