Dongen, T.A. van (2021) Quality Prediction of Scientific Documents Using Textual and Visual Content. Master's Thesis / Essay, Artificial Intelligence.
Abstract
This thesis proposes several methods to improve scholarly document quality prediction (SDQP). Two sub-tasks serve as measures of quality: accept/reject prediction and citation count prediction. Automatic quality prediction for scholarly documents is important because the growing number of submissions to scientific journals and venues makes it hard for them to find enough adequate reviewers. The proposed models rely solely on the textual and visual content of documents. A textual model called SChuBERT is proposed, which uses a chunking method to apply BERT to long documents. A visual model called INCEPTIONgu is proposed, a modified version of an existing model that uses gradual unfreezing to improve performance. These two models are combined in a model called SChuBERTjoint. Accept/reject prediction is evaluated on the PeerRead dataset, while for citation count prediction a new dataset called ACL-BiblioMetry is proposed. Extensive experiments are performed to find the best method of concatenating the textual and visual embeddings. Further experiments evaluate whether multi-task learning can improve results, which is not found to be the case. The SChuBERTjoint model significantly improves performance on both tasks compared to previous baselines.
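As a rough illustration of the chunking idea mentioned in the abstract, the sketch below splits a long token sequence into fixed-size windows that could each be passed to BERT separately. It is not taken from the thesis: the function name `chunk_tokens`, the default `chunk_size` of 512 (BERT's usual input limit) and the optional `overlap` parameter are all assumptions made for this example, and the abstract does not say how SChuBERT builds or combines its chunks.

```python
from typing import List, Sequence


def chunk_tokens(
    tokens: Sequence[int],
    chunk_size: int = 512,  # assumed: BERT's usual maximum input length
    overlap: int = 0,       # assumed: optional overlap between windows
) -> List[List[int]]:
    """Split a token sequence into consecutive windows of at most chunk_size."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks: List[List[int]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        # Stop once a window has reached the end of the sequence,
        # so overlapping windows do not produce a redundant tail chunk.
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

In a real pipeline each window would also need room for BERT's special tokens, so the usable chunk size would be slightly smaller than the model's input limit.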
Item Type: | Thesis (Master's Thesis / Essay) |
---|---|
Supervisor name: | Schomaker, L.R.B. and Maillette de Buij Wenniger, G.E. |
Degree programme: | Artificial Intelligence |
Thesis type: | Master's Thesis / Essay |
Language: | English |
Date Deposited: | 25 Mar 2021 11:27 |
Last Modified: | 25 Mar 2021 11:27 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/24117 |