Noord, R.I.K. van (2013) Improving coreference resolution by using noun-phrase clusters. Bachelor's Thesis, Artificial Intelligence.
Abstract
Coreference resolution is the task of determining whether two noun phrases refer to the same entity in the real world. In this study we use automatically generated noun clusters as a source of semantic information in a supervised machine learning approach to (Dutch) coreference resolution, as was done in Hendrickx et al. (2008), using the clustering method of van der Cruys (2005). We investigate the effect of cluster size (the average number of words per cluster) and test the effect of the clusters on a dataset containing only common nouns. We hypothesize that the optimal average cluster size is higher than the average size (10) of the clusters used in Hendrickx et al. (2008). Adding cluster features yielded a 27.9% higher F-score than our baseline. Testing different cluster sizes, we found an optimal size of 10 words per cluster, with F-scores decreasing slightly as the average size moves away from 10. Contrary to our hypothesis, the results suggest that an average size of 10 words per cluster is optimal for the task of coreference resolution.
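As a rough illustration of how noun clusters can act as a semantic feature for a pair of noun phrases, the sketch below looks up the head noun of each phrase in a cluster table and reports whether both fall in the same cluster. It is not the implementation used in the thesis: the toy cluster table, the feature names, and the functions `cluster_features` and `average_cluster_size` are all assumptions made for this example.

```python
from collections import Counter
from typing import Dict, Optional

# Hypothetical toy clusters: head noun -> cluster id.
# Real clusters would be generated automatically from a corpus.
NOUN_CLUSTERS: Dict[str, int] = {
    "auto": 0, "fiets": 0, "trein": 0,
    "dokter": 1, "arts": 1, "chirurg": 1,
}


def cluster_id(head_noun: str) -> Optional[int]:
    """Return the cluster id of a head noun, or None if it is unclustered."""
    return NOUN_CLUSTERS.get(head_noun.lower())


def cluster_features(head_a: str, head_b: str) -> Dict[str, bool]:
    """Build illustrative cluster features for a pair of noun-phrase heads."""
    id_a, id_b = cluster_id(head_a), cluster_id(head_b)
    both_known = id_a is not None and id_b is not None
    return {
        "both_in_clusters": both_known,
        "same_cluster": both_known and id_a == id_b,
    }


def average_cluster_size(clusters: Dict[str, int]) -> float:
    """Average number of words per cluster (the 'cluster size' varied in the study)."""
    sizes = Counter(clusters.values())
    if not sizes:
        return 0.0
    return sum(sizes.values()) / len(sizes)
```

In a supervised setup, features like these would be added to the other features of each candidate noun-phrase pair before training the classifier; coarser or finer clusterings change `average_cluster_size` and, with it, how often `same_cluster` fires.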
Item Type: | Thesis (Bachelor's Thesis) |
---|---|
Degree programme: | Artificial Intelligence |
Thesis type: | Bachelor's Thesis |
Language: | English |
Date Deposited: | 15 Feb 2018 07:54 |
Last Modified: | 15 Feb 2018 07:54 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/11318 |