
Dutch factuality classification: Using machine translation to create a Dutch version of FactBank

Alkemade, H.C. (2016) Dutch factuality classification: Using machine translation to create a Dutch version of FactBank. Bachelor's Thesis, Artificial Intelligence.

Abstract

In texts, people refer to events that may or may not have happened. Information about how the writer presents an event is called event factuality. Factuality is split into certainty and polarity. FactBank is an English corpus consisting of events and their corresponding factuality values. No such corpus exists for Dutch, even though this information would be useful to have. In this project, TectoMT and Google Translate are used to create a Dutch version of FactBank. Sentences are represented by a word vector that combines frequency information with syntactic distance to represent scope. A stochastic gradient learning routine is used to train a classifier for Dutch. The classifier is tested on a small Dutch corpus annotated with factuality values. The results show that certainty classification does not perform better than the majority-class baseline. Polarity classification, however, does perform better: an F-measure of 0.98 is achieved.
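
The two evaluation notions mentioned in the abstract, the majority-class baseline and the F-measure, can be illustrated with a minimal Python sketch. This is not the thesis code: the function names, the label names (`POS`, `NEG`), and the toy data are all hypothetical, and the abstract does not say whether the reported F-measure is per class or averaged, so the sketch computes it for a single chosen class.

```python
from collections import Counter


def majority_baseline(train_labels, n_test):
    """Predict the most frequent training label for every test item."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return [most_common_label] * n_test


def f_measure(gold, predicted, positive):
    """Balanced F-measure (F1) for one class, treated as the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy polarity labels, invented purely for illustration.
train = ["POS", "POS", "POS", "NEG"]
gold = ["POS", "NEG", "POS", "POS"]

baseline = majority_baseline(train, len(gold))
print(f_measure(gold, baseline, positive="POS"))
```

A classifier is only informative if it beats this baseline, which is the comparison the abstract reports for certainty and polarity.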

Item Type: Thesis (Bachelor's Thesis)
Degree programme: Artificial Intelligence
Thesis type: Bachelor's Thesis
Language: English
Date Deposited: 15 Feb 2018 08:10
Last Modified: 15 Feb 2018 08:10
URI: https://fse.studenttheses.ub.rug.nl/id/eprint/13669
