End-to-End Information Extraction by Character-Level Embedding and Multi-Stage Attentional U-Net
Tuan Anh Nguyen Dang (Cinnamon), Dat Nguyen Thanh (Cinnamon)
Abstract
Information extraction from document images has received a lot of attention recently, owing to the need to digitize large volumes of unstructured documents such as invoices, receipts, and bank transfers. In this paper, we propose a novel deep learning architecture for end-to-end information extraction on the 2D character-grid embedding of the document, namely the "Multi-Stage Attentional U-Net". To effectively capture the textual and spatial relations between 2D elements, our model leverages a specialized multi-stage encoder-decoder design, in conjunction with efficient use of the self-attention mechanism and the box convolution. Experimental results on different datasets show that our model outperforms the baseline U-Net architecture by a large margin while using 40% fewer parameters. Moreover, it also significantly improves on the baseline in erroneous-OCR and limited-training-data scenarios, making it practical for real-world applications.
DOI
10.5244/C.33.170
https://dx.doi.org/10.5244/C.33.170
BibTeX
@inproceedings{BMVC2019,
title={End-to-End Information Extraction by Character-Level Embedding and Multi-Stage Attentional U-Net},
author={Tuan Anh Nguyen Dang and Dat Nguyen Thanh},
year={2019},
month={September},
pages={170.1--170.13},
articleno={170},
numpages={13},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Kirill Sidorov and Yulia Hicks},
doi={10.5244/C.33.170},
url={https://dx.doi.org/10.5244/C.33.170}
}