Weakly-Supervised 3D Pose Estimation from a Single Image using Multi-View Consistency

Guillaume Rochette (University of Surrey), Chris Russell (University of Surrey), Richard Bowden (University of Surrey)

Abstract
We present a novel data-driven regularizer for weakly-supervised learning of 3D human pose estimation that eliminates the drift problem that affects existing approaches. We do this by moving the stereo reconstruction problem into the loss of the network itself. This avoids the need to reconstruct data prior to training and, unlike previous semi-supervised approaches, avoids the need for a warm-up period of supervised training. The conceptual and implementational simplicity of our approach is fundamental to its appeal. Not only is it straightforward to augment many weakly-supervised approaches with our additional re-projection based loss, but it is obvious how it shapes reconstructions and prevents drift. As such, we believe it will be a valuable tool for any researcher working in weakly-supervised 3D reconstruction. Evaluating on Panoptic, the largest multi-camera and markerless dataset available, we obtain an accuracy that is essentially indistinguishable from a fully supervised approach making full use of 3D ground truth in training.
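
To make the idea of placing multi-view consistency inside the loss more concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes each view yields a 3D pose in its own camera frame and that the camera extrinsics are known, with the convention x_cam = R x_world + t. The function names (`cam_to_world`, `world_to_cam`, `multiview_consistency_loss`) are hypothetical, and the sketch penalizes 3D disagreement in a shared world frame rather than a 2D re-projection error.

```python
from itertools import combinations


def cam_to_world(R, t, joints_cam):
    """Invert x_cam = R @ x_world + t for each joint (R is a 3x3 rotation)."""
    out = []
    for x in joints_cam:
        d = [x[i] - t[i] for i in range(3)]
        # Multiply by R transposed: component j uses column j of R.
        out.append([sum(R[i][j] * d[i] for i in range(3)) for j in range(3)])
    return out


def world_to_cam(R, t, joints_world):
    """Apply x_cam = R @ x_world + t to each joint."""
    return [
        [sum(R[i][j] * x[j] for j in range(3)) + t[i] for i in range(3)]
        for x in joints_world
    ]


def multiview_consistency_loss(poses_cam, cameras):
    """Mean squared joint distance between every pair of per-view predictions,
    after mapping each prediction into the shared world frame.

    poses_cam: list of poses, one per view, each a list of [x, y, z] joints.
    cameras:   list of (R, t) extrinsics, one per view (assumed known).
    """
    world = [cam_to_world(R, t, p) for p, (R, t) in zip(poses_cam, cameras)]
    total, count = 0.0, 0
    for pose_a, pose_b in combinations(world, 2):
        for ja, jb in zip(pose_a, pose_b):
            total += sum((u - v) ** 2 for u, v in zip(ja, jb))
            count += 1
    return total / count if count else 0.0


# Illustrative data only: two cameras observing a two-joint "pose".
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
quarter_turn_z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
cameras = [(identity, [0, 0, 0]), (quarter_turn_z, [1, 2, 3])]
pose_world = [[0.0, 0.0, 5.0], [1.0, 0.0, 5.0]]
predictions = [world_to_cam(R, t, pose_world) for R, t in cameras]
loss = multiview_consistency_loss(predictions, cameras)
```

In this toy setup the per-view predictions agree exactly, so the loss vanishes; perturbing a joint in one view makes it positive. In a training setting the same quantity would be computed on network outputs with a differentiable tensor library, which this plain-Python sketch does not attempt.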

DOI
10.5244/C.33.107
https://dx.doi.org/10.5244/C.33.107

BibTeX
@inproceedings{BMVC2019,
title={Weakly-Supervised 3D Pose Estimation from a Single Image using Multi-View Consistency},
author={Guillaume Rochette and Chris Russell and Richard Bowden},
year={2019},
month={September},
pages={107.1--107.14},
articleno={107},
numpages={14},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Kirill Sidorov and Yulia Hicks},
doi={10.5244/C.33.107},
url={https://dx.doi.org/10.5244/C.33.107}
}