Monne, Lucas (2022) Understanding Vision Transformers Through Transfer Learning. Bachelor's Thesis, Artificial Intelligence.
Abstract
Recently, transformers, originally developed for natural language processing, have begun demonstrating strong potential to not only compete with but also outperform convolutional neural networks (CNNs) in machine vision tasks. This thesis investigates the transfer learning potential of vision transformers (ViTs) in differing contexts, such as small sample sizes and low- and high-degree differences between the source and target domains. When compared to state-of-the-art CNNs, the ViT significantly outperforms them on the great majority of the experiments carried out. In particular, ViTs transfer with ease to depth-prediction tasks regardless of sample size. The results align with previous research, raise new questions regarding the architecture and a possible trade-off between performance and training time, and suggest real-world use cases in the biomedical, materials, and processing industries where the conditions match the experimental environment used.
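The thesis's exact experimental setup is not detailed on this page. Purely as an illustration of the transfer-learning procedure the abstract refers to, a minimal PyTorch sketch of fine-tuning a pretrained ViT on a small target dataset might look as follows; the model choice (torchvision's vit_b_16 with ImageNet weights), the number of target classes, and the frozen-backbone strategy are assumptions for the example, not the author's configuration.

```python
# Hypothetical transfer-learning sketch (not the thesis's actual setup):
# reuse a ViT pretrained on a source domain (ImageNet), replace its
# classification head, and train only the new head on the target task.
import torch
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights

num_classes = 10  # assumed size of the target task's label set

# Load a ViT-B/16 pretrained on ImageNet (the source domain).
model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)

# Freeze the pretrained backbone so only the new head is trained.
for param in model.parameters():
    param.requires_grad = False

# Replace the classification head for the target task.
model.heads.head = nn.Linear(model.heads.head.in_features, num_classes)

optimizer = torch.optim.Adam(model.heads.head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One training step on a dummy batch (stand-in for target-domain data).
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))
logits = model(images)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()
```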
| Item Type: | Thesis (Bachelor's Thesis) |
|---|---|
| Supervisor name: | Sabatelli, M. |
| Degree programme: | Artificial Intelligence |
| Thesis type: | Bachelor's Thesis |
| Language: | English |
| Date Deposited: | 07 Feb 2022 08:43 |
| Last Modified: | 07 Feb 2022 08:43 |
| URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/26543 |