Rogaar, D. (2017) The effect of multimodality on the performance of a convolutional neural network. Bachelor's Thesis, Artificial Intelligence.
Text: AI_BA_2017_s2393344.pdf - Published Version (551kB)
Abstract
A classical convolutional neural network is connected to two inputs of different modality (such as language, images, or music) to observe the effect on accuracy. The first modality consists of iconographic images, each representing a concept. The second consists of titles for the same concepts, converted to images. Each modality is processed by a convolutional neural network and a fully connected representation layer, with different parameters. Connecting the convolutional networks to a fully connected classification layer allows the resulting bimodal classifier to be compared with control-condition networks. Initial results show that overfitted representations prevented the transfer learning that was required to connect the modalities. After the perceived overfitting was counteracted using dropout, the bimodal network, when given both inputs, achieved significantly higher accuracy (98.6%) than the control conditions (93.6% on images and 93.0% on words).
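The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes that each modality has already been reduced to a representation vector by its own convolutional stage (omitted here), that the two vectors are joined by simple concatenation, and that inverted dropout with a rate of 0.5 is applied to the representations. The function names, vector sizes, weights, and dropout rate are all hypothetical and chosen only for illustration.

```python
import math
import random


def dropout(vector, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(vector)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in vector]


def dense(vector, weights, biases):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bimodal_classify(image_repr, word_repr, weights, biases,
                     rate=0.5, rng=None, training=False):
    """Concatenate two modality representations and classify them jointly.

    Assumption: concatenation is the fusion method; the abstract only says
    both networks are connected to one fully connected classification layer.
    """
    rng = rng or random.Random(0)
    fused = (dropout(image_repr, rate, rng, training)
             + dropout(word_repr, rate, rng, training))
    return softmax(dense(fused, weights, biases))


# Toy values standing in for the outputs of the two representation layers.
rng = random.Random(0)
image_repr = [0.2, -0.1, 0.7]   # hypothetical image representation
word_repr = [0.5, 0.3]          # hypothetical word-image representation
n_classes = 4
weights = [[rng.uniform(-1, 1) for _ in range(len(image_repr) + len(word_repr))]
           for _ in range(n_classes)]
biases = [0.0] * n_classes

probs = bimodal_classify(image_repr, word_repr, weights, biases)
print(probs)  # one probability per class
```

A single-modality control condition could be approximated in this sketch by passing only one representation through its own `dense` layer, which is one way to read the comparison the abstract reports.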
Item Type: | Thesis (Bachelor's Thesis) |
---|---|
Degree programme: | Artificial Intelligence |
Thesis type: | Bachelor's Thesis |
Language: | English |
Date Deposited: | 15 Feb 2018 08:31 |
Last Modified: | 15 Feb 2018 08:31 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/15776 |