Zijlstra, Rogchert (2022) Feature relevance bounds in classification problems. Master's Thesis / Essay, Computing Science.
Abstract
Generalized Matrix Learning Vector Quantization (GMLVQ) is a powerful tool for prototype-based classification. Its results can be interpreted relatively easily through the feature relevances it yields. However, these relevances do not show the full picture: correlated features can cause results to vary considerably between training runs, and ambiguous relevances can appear, which limits the reliability of the interpretation. In this thesis, we examine the interpretation of relevances. We introduce a method for computing relevance bounds, which show the range of values a relevance can take while classification accuracy is retained. In addition, we introduce a new dimensionality reduction technique that significantly reduces the number of features during training. This allows us to calculate the relevance bounds for bigger, more complicated data sets. We apply our methods to several data sets. A set of mock data sets is used to examine the advantages and challenges of the method. Finally, we apply the method to a real-world data set for classifying merger and non-merger galaxies. This data set is large enough to require dimensionality reduction. The results show that the method can effectively improve the interpretability of the relevances.
Item Type: | Thesis (Master's Thesis / Essay) |
---|---|
Supervisor name: | Biehl, M. and Bunte, K. and Nolte, A.F. |
Degree programme: | Computing Science |
Thesis type: | Master's Thesis / Essay |
Language: | English |
Date Deposited: | 29 Mar 2022 12:47 |
Last Modified: | 29 Mar 2022 12:47 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/26780 |