Joint Learning of Attended Zero-Shot Features and Visual-Semantic Mapping

Yanan Li (Zhejiang Lab), Donghui Wang (Zhejiang University)

Abstract
Zero-shot learning (ZSL) aims to recognize unseen categories by associating image features with semantic embeddings of class labels. Its performance can be improved progressively by learning better features and a visual-semantic mapping (V-S mapping) that generalizes better to unseen classes. Current methods typically learn the feature extractor and the V-S mapping independently. In this work, we propose a simple but effective joint learning framework with a fused autoencoder (AE) paradigm, which simultaneously learns features specific to the ZSL task and a V-S mapping that is inseparable from feature learning. In particular, the encoder in the AE not only transfers semantic knowledge to the feature space, but also achieves semantics-guided attended feature learning. At the same time, the decoder in the AE serves as the V-S mapping, which further improves generalization to unseen classes. Extensive experiments show that the proposed approach achieves promising results.
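The fused-AE idea in the abstract can be illustrated with a minimal NumPy sketch: the encoder uses a class's semantic embedding to attend over region-level visual features, and the decoder maps the attended feature back into semantic space, doubling as the V-S mapping. All names, dimensions, and the linear/softmax choices below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: visual features, semantic embeddings, image regions.
d_v, d_s, n_regions = 512, 85, 9

# Hypothetical parameters of the fused autoencoder.
W_att = rng.normal(scale=0.1, size=(d_s, d_v))  # encoder: semantics -> attention query
W_dec = rng.normal(scale=0.1, size=(d_v, d_s))  # decoder: visual feature -> semantic space

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def forward(regions, class_emb):
    """Encoder: semantics-guided attention over region features;
    decoder: attended feature back to the semantic space (the V-S mapping)."""
    query = class_emb @ W_att        # project semantics into the visual space
    scores = regions @ query         # relevance of each region, shape (n_regions,)
    alpha = softmax(scores)          # attention weights
    feat = alpha @ regions           # attended ZSL feature, shape (d_v,)
    recon = feat @ W_dec             # decoder output in semantic space, shape (d_s,)
    return feat, recon

regions = rng.normal(size=(n_regions, d_v))  # stand-in for CNN region features
class_emb = rng.normal(size=(d_s,))          # stand-in for a class attribute vector

feat, recon = forward(regions, class_emb)
# Joint training would minimize a reconstruction loss ||recon - class_emb||^2
# together with a classification loss on `feat`, so feature learning and the
# V-S mapping are optimized in one objective.
loss = float(np.sum((recon - class_emb) ** 2))
```

Because both losses share the encoder, gradients from the semantic reconstruction shape the attended features themselves, which is the "inseparable" joint learning the abstract refers to.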

DOI
10.5244/C.33.138
https://dx.doi.org/10.5244/C.33.138

Files
Paper (PDF)

BibTeX
@inproceedings{BMVC2019,
title={Joint Learning of Attended Zero-Shot Features and Visual-Semantic Mapping},
author={Yanan Li and Donghui Wang},
year={2019},
month={September},
pages={138.1--138.12},
articleno={138},
numpages={12},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Kirill Sidorov and Yulia Hicks},
doi={10.5244/C.33.138},
url={https://dx.doi.org/10.5244/C.33.138}
}