How Do You Stop a Computer From Being Rude? Assessing Formality in Machine Translation through Interpretability Methods

Sickert, Ludwig (2023) How Do You Stop a Computer From Being Rude? Assessing Formality in Machine Translation through Interpretability Methods. Master's Thesis / Essay, Artificial Intelligence.

Abstract

Formality is an essential aspect of many languages, and choosing the correct level of formality for a given situation is crucial to avoid misunderstandings. However, many machine translation models struggle to generate sentences with the appropriate formality and to keep formality consistent across their translations. Previous studies attempted to address this issue with special formality labels. Since many languages already have grammatical features that denote formality, such as formal and informal personal pronouns, it remains unclear whether a well-trained model can generate formality-appropriate output from these features alone. Moreover, existing research focuses almost exclusively on English-centric translation, where such a pattern cannot emerge because formality is inherently underspecified in English. The present thesis addresses this knowledge gap by quantifying the issue of formality in the non-English-centric setting of translating between German and Korean. Using feature attribution methods, the thesis analyses whether the model learns any internal representation of formality from the grammatical features inherent to both languages. A novel analysis technique called saliency map interpolation is introduced as a side result. The results show that current machine translation models struggle to generate translations of an appropriate formality even for languages with grammatical formality features, and that the chosen model has not learned any internal representation of formality for those languages. A potential issue with COMET scores in the presence of off-target translations is also uncovered.

Item Type: Thesis (Master's Thesis / Essay)
Supervisor name: Rij-Tange, J.C. van and Bisazza, A. and Sarti, G.
Degree programme: Artificial Intelligence
Thesis type: Master's Thesis / Essay
Language: English
Date Deposited: 27 Jul 2023 14:35
Last Modified: 10 Oct 2024 10:14
URI: https://fse.studenttheses.ub.rug.nl/id/eprint/30337
