Feature relevance bounds in classification problems

Zijlstra, Rogchert (2022) Feature relevance bounds in classification problems. Master's Thesis / Essay, Computing Science.

Abstract

Generalized Matrix Learning Vector Quantization (GMLVQ) is a powerful tool for prototype-based classification. Its results can be interpreted relatively easily through the feature relevances it learns. However, these relevances do not show the full picture: correlated features can cause results to vary considerably between training runs, and ambiguous relevances can appear, which limits the reliability of the interpretation. In this thesis, we examine the interpretation of relevances. We introduce a method for computing relevance bounds, which show the range of values a relevance can take while classification accuracy is retained. In addition, we introduce a new dimensionality reduction technique that significantly reduces the number of features during training. This allows us to calculate the relevance bounds for larger, more complicated data sets. We apply our methods to several data sets. A set of mock data sets is used to examine the advantages and challenges of the method. Finally, we apply the method to a real-world data set for classifying merger and non-merger galaxies. This data set is large enough to require dimensionality reduction. The results show that the method can effectively improve the interpretability of the relevances.
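
The ambiguity mentioned in the abstract can be illustrated with a small sketch. It assumes the standard GMLVQ formulation, in which the distance is (x - w)^T Λ (x - w) with Λ = Ω^T Ω normalised to unit trace, and the diagonal of Λ is read as the feature relevances. The function names and toy data are hypothetical and are not taken from the thesis; the sketch does not implement the relevance bounds method itself, only the effect that motivates it.

```python
import numpy as np

def relevance_matrix(omega):
    """Return Lambda = Omega^T Omega, normalised so that trace(Lambda) = 1."""
    lam = omega.T @ omega
    return lam / np.trace(lam)

def gmlvq_distance(x, w, lam):
    """Adaptive squared distance (x - w)^T Lambda (x - w)."""
    diff = x - w
    return float(diff @ lam @ diff)

# Toy data (assumption): feature 1 is an exact copy of feature 0.
rng = np.random.default_rng(0)
base = rng.normal(size=(5, 1))
X = np.hstack([base, base])
w = np.zeros(2)  # a single hypothetical prototype at the origin

# Two hypothetical transformation matrices that use different features.
lam_a = relevance_matrix(np.array([[1.0, 0.0]]))  # all weight on feature 0
lam_b = relevance_matrix(np.array([[0.0, 1.0]]))  # all weight on feature 1

d_a = np.array([gmlvq_distance(x, w, lam_a) for x in X])
d_b = np.array([gmlvq_distance(x, w, lam_b) for x in X])

rel_a = np.diag(lam_a)  # relevance profile under the first solution
rel_b = np.diag(lam_b)  # relevance profile under the second solution

# Distances agree for every sample although the relevance profiles differ.
same_distances = bool(np.allclose(d_a, d_b))
```

Both matrices give identical distances on this data, so a classifier built on either would behave the same, yet one assigns all relevance to feature 0 and the other to feature 1. A single training run therefore reports only one point in a range of equally valid relevance values.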

Item Type:	Thesis (Master's Thesis / Essay)
Supervisor name:	Biehl, M. and Bunte, K. and Nolte, A.F.
Degree programme:	Computing Science
Thesis type:	Master's Thesis / Essay
Language:	English
Date Deposited:	29 Mar 2022 12:47
Last Modified:	29 Mar 2022 12:47
URI:	https://fse.studenttheses.ub.rug.nl/id/eprint/26780
