Fast-SCNN: Fast Semantic Segmentation Network

Rudra Poudel (Toshiba Research Europe, Ltd.), Stephan Liwicki (Toshiba Research Europe, Ltd.), Roberto Cipolla (University of Cambridge)

Abstract
The encoder-decoder framework is state-of-the-art for offline semantic image segmentation. With the rise of autonomous systems, real-time computation is increasingly desirable. In this paper, we introduce the fast segmentation convolutional neural network (Fast-SCNN), an above real-time semantic segmentation model for high-resolution image data (1024x2048px), suited to efficient computation on embedded devices with low memory and power. We introduce a 'learning to downsample' module which computes low-level features for multiple resolution branches simultaneously. Our network combines spatial detail at high resolution with deep features extracted at lower resolution, yielding an accuracy of 68.0% mean intersection over union at 123.5 frames per second on full-scale Cityscapes images. We also show that large-scale pre-training is unnecessary. We thoroughly validate our experiments with ImageNet pre-training and the coarsely labeled data of Cityscapes. Finally, we show even faster computation with competitive results on subsampled inputs, without any network modifications.
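
For readers unfamiliar with the accuracy metric quoted above, the following is a minimal sketch of mean intersection over union (mIoU) on flattened label sequences. It is an illustration only, not the authors' evaluation code: the function name `mean_iou` and its arguments are hypothetical, and skipping classes that appear in neither the prediction nor the ground truth is an assumption of this sketch. Benchmark evaluation typically accumulates counts over a whole dataset rather than a single image.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over classes present in the prediction or the ground truth.

    Hypothetical helper for illustration; `pred` and `target` are flattened
    per-pixel class indices in the range [0, num_classes).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where both agree on class c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labeled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: skip classes absent from both inputs
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy example with two classes and four pixels.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. As a separate point of reference, the quoted 123.5 frames per second corresponds to a per-frame budget of roughly 8.1 ms (1000 / 123.5).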

DOI
10.5244/C.33.187
https://dx.doi.org/10.5244/C.33.187

Files
Paper (PDF)

BibTeX
@inproceedings{BMVC2019,
title={Fast-SCNN: Fast Semantic Segmentation Network},
author={Rudra Poudel and Stephan Liwicki and Roberto Cipolla},
year={2019},
month={September},
pages={187.1--187.12},
articleno={187},
numpages={12},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Kirill Sidorov and Yulia Hicks},
doi={10.5244/C.33.187},
url={https://dx.doi.org/10.5244/C.33.187}
}