Multimodal grid features and cell pointers for scene text visual question answering

Authors

Keywords

Deep learning, Scene text, Visual question answering, Multi-modal learning; MSC: 41A05, 41A10, 65D05, 65D17

Journal

Volume 150, Pages 242-249

Publisher

Elsevier BV

Online

2021-07-20

DOI

10.1016/j.patrec.2021.06.026
