Adversarial View-Consistent Learning for Monocular Depth Estimation

Yixuan Liu (Tsinghua University), Yuwang Wang (Microsoft Research), Shengjin Wang (Tsinghua University)

Abstract
This paper addresses the problem of Monocular Depth Estimation (MDE). Existing approaches to MDE usually model it as a pixel-level regression problem, ignoring the underlying geometric properties. We empirically find that this may result in sub-optimal solutions: while the predicted depth map yields a small loss value in one specific view, it may exhibit a large loss when viewed from other directions. In this paper, inspired by multi-view stereo (MVS), we propose an Adversarial View-Consistent Learning (AVCL) framework that forces the estimated depth map to remain reasonable when viewed from multiple views. To this end, we first design a differentiable, end-to-end trainable depth map warping operation, and then propose a pose generator that produces novel views for a given image in an adversarial manner. Together with the differentiable depth map warping operation, the pose generator encourages the depth estimation network to learn from hard views and hence produce view-consistent depth maps. We evaluate our method on the NYU Depth V2 dataset, and the experimental results show promising performance gains over state-of-the-art MDE approaches.
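
To illustrate the kind of geometry the abstract refers to, below is a minimal sketch (not the authors' implementation) of a differentiable view-warping operation in PyTorch, assuming a pinhole camera model. The function name inverse_warp, and the choice to synthesize the reference view by sampling the source image with grid_sample, are illustrative assumptions rather than details taken from the paper.

# Minimal sketch of differentiable view warping (assumed implementation, not the authors' code).
import torch
import torch.nn.functional as F

def inverse_warp(src_img, ref_depth, K, R, t):
    """Synthesize the reference view from src_img via the predicted depth.

    src_img:   (B, C, H, W) image observed from the source (novel) view
    ref_depth: (B, 1, H, W) depth predicted for the reference view
    K:         (B, 3, 3)    camera intrinsics
    R, t:      (B, 3, 3), (B, 3, 1) relative pose, reference -> source
    """
    B, _, H, W = ref_depth.shape
    device = ref_depth.device

    # Homogeneous pixel grid of the reference view: (B, 3, H*W).
    ys, xs = torch.meshgrid(
        torch.arange(H, device=device, dtype=torch.float32),
        torch.arange(W, device=device, dtype=torch.float32),
        indexing="ij",
    )
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0)
    pix = pix.view(1, 3, -1).expand(B, -1, -1)

    # Back-project to 3D with the predicted depth, move into the source frame,
    # and re-project with the intrinsics.
    cam_pts = (torch.inverse(K) @ pix) * ref_depth.view(B, 1, -1)
    proj = K @ (R @ cam_pts + t)
    uv = proj[:, :2] / proj[:, 2:3].clamp(min=1e-6)

    # Normalize pixel coordinates to [-1, 1] and sample the source image;
    # grid_sample keeps the whole operation differentiable w.r.t. ref_depth.
    u = 2.0 * uv[:, 0] / (W - 1) - 1.0
    v = 2.0 * uv[:, 1] / (H - 1) - 1.0
    grid = torch.stack([u, v], dim=-1).view(B, H, W, 2)
    return F.grid_sample(src_img, grid, padding_mode="zeros", align_corners=True)

In a sketch like this, an adversarially trained pose generator could supply the (R, t) that maximizes the discrepancy between the warped result and the observed view while the depth network minimizes it, which is one plausible way the "hard views" mentioned in the abstract could be realized.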

DOI
10.5244/C.33.3
https://dx.doi.org/10.5244/C.33.3

Files
Paper (PDF)

BibTeX
@inproceedings{BMVC2019,
title={Adversarial View-Consistent Learning for Monocular Depth Estimation},
author={Yixuan Liu and Yuwang Wang and Shengjin Wang},
year={2019},
month={September},
pages={3.1--3.12},
articleno={3},
numpages={12},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Kirill Sidorov and Yulia Hicks},
doi={10.5244/C.33.3},
url={https://dx.doi.org/10.5244/C.33.3}
}