David, Roy (2023) Probing Pre-trained Large Language Models for Narrative Coherence. Master's Thesis / Essay, Computational Cognitive Science.
Text: mCCS_2023_DavidRA.pdf (2MB)
Text: toestemming.pdf (135kB, restricted to registered users)
Abstract
The extent to which pre-trained large language models (PTLLMs) capture narrative coherence, given (coherent) sequences of text and a set of possible ending sequences, in a zero-shot, multilingual setting has not yet been explored. This research presents an extensive study of the abilities of six PTLLMs to encode narrative coherence across sixteen datasets with varying narrativity types and coherence complexity. In addition, we introduce a small language-specific dataset for Dutch. Our results show that these PTLLMs can capture narrative coherence mostly when they have access to the full text and in simple cases, namely when the possible follow-up sequences do not present subtle linguistic differences and do not require complex commonsense reasoning. In most of these instances, the higher layers (8-12) yield the best performance. When the data consists of short, coherent sentences with subtle linguistic differences between the possible ending sequences, the models' performance drops (≈0.2 points) compared to the simpler cases, while still capturing some coherence. The models fail to capture coherence, however, when the data consists of longer-format sentences and subtle linguistic differences are present between the possible follow-up sequences. At the same time, simple probes show competitive results when compared to state-of-the-art systems on the same task and outperform all our baselines.
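The layer-wise probing approach the abstract describes can be sketched as follows. This is a minimal illustrative example, not the thesis's actual setup: the models, datasets, probe architecture, and the injected "signal" are all stand-ins (random vectors replace real PTLLM hidden states, and a logistic-regression probe stands in for whatever probe the thesis uses). It only shows the general technique of training one linear probe per layer and comparing their accuracies.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_examples, n_layers, hidden = 400, 12, 64

# Stand-in hidden states: one vector per (example, layer).
# Labels mark whether the candidate ending is the coherent one.
states = rng.normal(size=(n_examples, n_layers, hidden))
labels = rng.integers(0, 2, size=n_examples)

# Inject a weak synthetic signal into the upper layers (8-11) so the
# probes have something to find, mimicking the reported pattern that
# higher layers encode coherence best. This is purely illustrative.
for layer in range(8, n_layers):
    states[:, layer, 0] += 2.0 * (labels - 0.5)

def probe_layer(layer: int) -> float:
    """Train a linear probe on one layer's representations; return test accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        states[:, layer, :], labels, test_size=0.25, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return clf.score(X_te, y_te)

# One probe per layer; the best-scoring layer indicates where the
# representation most linearly separates coherent from incoherent endings.
accuracies = {layer: probe_layer(layer) for layer in range(n_layers)}
best_layer = max(accuracies, key=accuracies.get)
```

With the synthetic signal placed in layers 8-11, the probes on those layers score well above chance while the lower layers hover near 50%, which is the shape of result the abstract reports for the simpler datasets.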
Item Type: | Thesis (Master's Thesis / Essay) |
---|---|
Supervisor name: | Jones, S.M. and Caselli, T. |
Degree programme: | Computational Cognitive Science |
Thesis type: | Master's Thesis / Essay |
Language: | English |
Date Deposited: | 21 Aug 2023 10:25 |
Last Modified: | 21 Aug 2023 10:25 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/31225 |