Learning Depth-aware Heatmaps for 3D Human Pose Estimation in the Wild

Zerui Chen (Chinese Academy of Sciences), Yiru Guo (Beihang University), Yan Huang (Institute of Automation, Chinese Academy of Sciences), Liang Wang (NLPR, China)

Abstract
In this paper, we explore how to determine the 3D human pose directly from a monocular image. While current state-of-the-art approaches employ a volumetric representation to predict a per-voxel likelihood for each human joint, this output is memory-intensive, making such networks hard to deploy on mobile devices. To reduce the output dimensionality, we decompose the volumetric representation into 2D depth-aware heatmaps and per-joint depth estimation. We propose to learn depth-aware 2D heatmaps via associative embeddings, which preserve the connection between each 2D joint location and its corresponding depth. Our approach achieves a good trade-off between model complexity and accuracy. We conduct extensive experiments on the popular Human3.6M benchmark and advance the state-of-the-art accuracy for 3D human pose estimation in the wild.
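
One plausible inference-time decoding under the decomposition the abstract describes is sketched below in NumPy: the 2D joint location comes from the argmax of its heatmap, and the joint depth is read from a spatially aligned depth map at that location. This is an illustrative assumption, not the authors' released code; the function and array names are hypothetical.

import numpy as np

def decode_depth_aware_heatmaps(heatmaps, depth_maps):
    # heatmaps:   (J, H, W) per-joint 2D likelihoods.
    # depth_maps: (J, H, W) per-joint depth predictions, spatially
    #             aligned with the heatmaps so each pixel carries the
    #             depth associated with that 2D location.
    # Returns a (J, 3) array of (x, y, z) joint estimates.
    num_joints, height, width = heatmaps.shape
    joints = np.zeros((num_joints, 3), dtype=np.float32)
    for j in range(num_joints):
        # 2D location: peak of the joint's heatmap.
        flat_idx = np.argmax(heatmaps[j])
        y, x = divmod(flat_idx, width)
        # Depth: sample the aligned depth map at the detected peak,
        # which keeps the 2D location and its depth associated.
        joints[j] = (x, y, depth_maps[j, y, x])
    return joints

# Toy usage: 17 joints on a 64x64 output grid.
heatmaps = np.random.rand(17, 64, 64)
depth_maps = np.random.rand(17, 64, 64)
print(decode_depth_aware_heatmaps(heatmaps, depth_maps).shape)  # (17, 3)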

DOI
10.5244/C.33.226
https://dx.doi.org/10.5244/C.33.226

BibTeX
@inproceedings{Chen_2019_BMVC,
  title={Learning Depth-aware Heatmaps for 3D Human Pose Estimation in the Wild},
  author={Zerui Chen and Yiru Guo and Yan Huang and Liang Wang},
  booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
  editor={Kirill Sidorov and Yulia Hicks},
  publisher={BMVA Press},
  year={2019},
  month={September},
  articleno={226},
  pages={226.1--226.13},
  numpages={13},
  doi={10.5244/C.33.226},
  url={https://dx.doi.org/10.5244/C.33.226}
}