Groot, W. de (2013) Adding semantic similarity features to Dutch coreference resolution. Bachelor's Thesis, Artificial Intelligence.
Abstract
In automatic coreference resolution, the objective is to identify when two noun phrases refer to the same entity in the world. In this paper I use the Dutch-language Knack-2002 coreference-annotated corpus and example-based supervised machine learning to experiment with adding semantic similarity features to a standard set of linguistic features used in previous work. I use the Cornetto database to add features for WordNet semantic classes and three semantic similarity metrics based on work by Lin (1998), Jiang and Conrath (1997), and Resnik (1995), respectively. Performance is tested using TiMBL's k-nearest neighbors algorithm on data split into sets with common noun, proper noun, and pronoun anaphors; I find that the metric from Jiang and Conrath improves resolution for all noun types. All features combined improve resolution substantially: 21.1% over baseline for common nouns, 2.7% for pronouns, and 11.4% for proper nouns.
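For readers unfamiliar with the three metrics named in the abstract, the following is a minimal sketch of their commonly cited information-content definitions. It is not the thesis implementation: the function names are hypothetical, the probabilities are made-up illustrative values rather than Cornetto or Knack-2002 data, and whether the thesis uses the Jiang and Conrath measure as a distance or converts it to a similarity is not stated here.

```python
import math


def information_content(p):
    """Information content of a concept with corpus probability p (0 < p <= 1)."""
    return -math.log(p)


def resnik_similarity(p_lcs):
    """Resnik (1995): information content of the lowest common subsumer (LCS)."""
    return information_content(p_lcs)


def lin_similarity(p1, p2, p_lcs):
    """Lin (1998): shared information relative to the total information of both concepts."""
    total = information_content(p1) + information_content(p2)
    if total == 0:
        # Both concepts have probability 1; the ratio is undefined.
        # Returning 0.0 is an arbitrary convention chosen for this sketch.
        return 0.0
    return 2 * information_content(p_lcs) / total


def jiang_conrath_distance(p1, p2, p_lcs):
    """Jiang and Conrath (1997): a distance, so smaller values mean more similar."""
    return (information_content(p1) + information_content(p2)
            - 2 * information_content(p_lcs))


# Illustrative, invented probabilities for two concepts and their LCS.
p_a, p_b, p_shared = 0.01, 0.02, 0.1
scores = {
    "resnik": resnik_similarity(p_shared),
    "lin": lin_similarity(p_a, p_b, p_shared),
    "jiang_conrath_distance": jiang_conrath_distance(p_a, p_b, p_shared),
}
```

In a setup like the one the abstract describes, values of this kind would be computed for each candidate anaphor and antecedent pair and added as extra features alongside the standard linguistic ones.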
Item Type: | Thesis (Bachelor's Thesis) |
---|---|
Degree programme: | Artificial Intelligence |
Thesis type: | Bachelor's Thesis |
Language: | English |
Date Deposited: | 15 Feb 2018 07:52 |
Last Modified: | 15 Feb 2018 07:52 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/10925 |