Visual Question Answering With Enhanced Question-Answer Diversity

Visser, Jelle (2020) Visual Question Answering With Enhanced Question-Answer Diversity. Master's Thesis / Essay, Artificial Intelligence.

Abstract

Visual Question Answering (VQA) is a multi-modal Machine Learning task that takes two inputs, an image and a natural language question about that image, and must produce an answer. VQA models are often quite complex and generalize poorly to previously unseen questions. Additionally, data collection requires intensive human labor, and the natural language data is hard to augment. This research proposes a method of enhancing the question-answer (QA) input data that uses the extensive image annotations of the Visual Genome dataset to automatically generate new QA-pairs. We construct two baseline VQA-models, one that chooses an answer from a pre-defined list and one that generates an answer word-by-word with an LSTM, and train them on four datasets with different degrees of data augmentation. We compare the models using several metrics, testing on a holdout test set from the original dataset. Additionally, we conduct experiments that measure how adding noise to the question embedding affects the performance of both baseline models, as an indication of robustness to uncertainty in the question input. We find that training on augmented datasets slightly decreases performance on the holdout test set for both baseline models. All models, however, prove to be highly resistant to noise on the question embedding. Additionally, models trained on the augmented datasets appear to be more resistant to noise than models trained on the original dataset.
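
The noise experiment mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not state the noise distribution, so zero-mean Gaussian noise is assumed here, and the names `add_gaussian_noise`, `accuracy_under_noise`, and `toy_predict` are hypothetical. The image input is left out, and the toy classifier stands in for a trained VQA model.

```python
import random


def add_gaussian_noise(embedding, sigma, rng):
    """Return a copy of `embedding` with zero-mean Gaussian noise
    (standard deviation `sigma`) added to every component.
    Gaussian noise is an assumption; the abstract does not name a distribution."""
    return [x + rng.gauss(0.0, sigma) for x in embedding]


def accuracy_under_noise(predict, examples, sigma, seed=0):
    """Fraction of (question_embedding, answer) pairs that `predict`
    still answers correctly after the embedding is perturbed."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    correct = 0
    for embedding, answer in examples:
        noisy = add_gaussian_noise(embedding, sigma, rng)
        if predict(noisy) == answer:
            correct += 1
    return correct / len(examples)


def toy_predict(embedding):
    """Hypothetical stand-in for a VQA model: answers by the sign
    of the first embedding component."""
    return "yes" if embedding[0] >= 0 else "no"


examples = [([1.0, 0.0], "yes"), ([-1.0, 0.0], "no")]

# Sweep the noise level; a robust model keeps its accuracy as sigma grows.
for sigma in (0.0, 0.5, 2.0):
    print(sigma, accuracy_under_noise(toy_predict, examples, sigma))
```

Plotting accuracy against `sigma` for models trained on each dataset would give a comparison of the kind the abstract describes, with a flatter curve indicating greater resistance to noise.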

Item Type: Thesis (Master's Thesis / Essay)
Supervisor name: Schomaker, L.R.B. and Maillette de Buij Wenniger, G.E.
Degree programme: Artificial Intelligence
Thesis type: Master's Thesis / Essay
Language: English
Date Deposited: 24 Aug 2020 13:31
Last Modified: 01 Sep 2020 12:17
URI: https://fse.studenttheses.ub.rug.nl/id/eprint/23181
