High Frequency Residual Learning for Multi-Scale Image Classification

Bowen Cheng (UIUC), Rong Xiao (Ping An), Jianfeng Wang (Microsoft Research), Thomas Huang (UIUC), Lei Zhang (Microsoft)

Abstract
We present a novel high frequency residual learning framework, which leads to a highly efficient multi-scale network (MSNet) architecture for mobile and embedded vision problems. The architecture utilizes two networks: a low resolution network to efficiently approximate low frequency components and a high resolution network to learn high frequency residuals by reusing the upsampled low resolution features. With a classifier calibration module, MSNet can dynamically allocate computational resources during inference to achieve a better speed and accuracy trade-off. We evaluate our method on the challenging ImageNet-1k dataset and observe consistent improvements over different base networks. On ResNet-18 and MobileNet with alpha=1.0, MSNet gains 1.5% over both architectures without increasing computation. On the more efficient MobileNet with alpha=0.25, our method gains 3.8% with the same amount of computation.
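
The dynamic allocation described in the abstract can be pictured as a two-stage cascade. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the decision rule, so a simple confidence threshold on the low resolution prediction is assumed here, and the names cascade_predict, toy_low_res_net, toy_high_res_net, and threshold are hypothetical. The toy networks stand in for real models; the high resolution stand-in simply adds a residual correction to the reused low resolution output.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cascade_predict(image, low_res_net, high_res_net, threshold=0.8):
    """Return (predicted_class, used_high_res).

    Assumed rule (not taken from the paper): accept the low resolution
    prediction when its top probability reaches `threshold`; otherwise
    run the high resolution network, which reuses the low resolution
    features.
    """
    low_logits, low_features = low_res_net(image)
    probs = softmax(low_logits)
    confidence = max(probs)
    if confidence >= threshold:
        return probs.index(confidence), False

    high_logits = high_res_net(image, low_features)
    probs = softmax(high_logits)
    return probs.index(max(probs)), True


# Hypothetical stand-ins for the two networks, for illustration only.
def toy_low_res_net(image):
    # Returns (logits, features); here the features are just the logits.
    return image["low_logits"], image["low_logits"]


def toy_high_res_net(image, low_features):
    # Adds a residual correction on top of the reused low resolution output.
    return [f + r for f, r in zip(low_features, image["residual"])]


easy = {"low_logits": [4.0, 0.0, 0.0], "residual": [0.0, 0.0, 0.0]}
hard = {"low_logits": [1.0, 0.9, 0.0], "residual": [0.0, 2.0, 0.0]}

easy_result = cascade_predict(easy, toy_low_res_net, toy_high_res_net)
hard_result = cascade_predict(hard, toy_low_res_net, toy_high_res_net)
```

In this toy setup the confident input stops after the low resolution stage, while the ambiguous input falls through to the high resolution stage, so the extra computation is spent only where the cheap prediction is uncertain.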

DOI
10.5244/C.33.20
https://dx.doi.org/10.5244/C.33.20

Files
Paper (PDF)
Supplementary material (PDF)

BibTeX
@inproceedings{BMVC2019,
title={High Frequency Residual Learning for Multi-Scale Image Classification},
author={Bowen Cheng and Rong Xiao and Jianfeng Wang and Thomas Huang and Lei Zhang},
year={2019},
month={September},
pages={20.1--20.13},
articleno={20},
numpages={13},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Kirill Sidorov and Yulia Hicks},
doi={10.5244/C.33.20},
url={https://dx.doi.org/10.5244/C.33.20}
}