The paradox of overfitting

Nannen, V. (2003) The paradox of overfitting. Master's Thesis / Essay, Artificial Intelligence.

scriptie-volker-nannen.pdf - Published Version


Model selection plays an important part in machine learning and in artificial intelligence in general. A central problem in model selection is overfitting, which can be measured as the generalization error on test samples. Minimizing the generalization error is not the only definition of a good model: resemblance to the original function and minimum randomness deficiency are others. While we are not interested in resemblance to the original function, we do want to know whether minimum randomness deficiency and minimum generalization error can be combined. Kolmogorov complexity is a powerful mathematical tool that has given rise to the theory of minimum description length (MDL) for model selection. MDL selects the model that minimizes the combined complexity of model and data, known as the two-part code. It has been proven that MDL minimizes randomness deficiency [VV02]. To efficiently map the performance of MDL on as broad a selection of problems as possible, we introduced our own application, the Statistical Data Viewer. While intended for the advanced scientist, it has an interface that is easy enough to let uninitiated students explore and comprehend the problems of model selection. Its interactive plots and sophisticated editor allow fast and efficient setup and execution of controlled experiments. All the relevant aspects of model selection are implemented. The problem space consists of two-dimensional regression problems, and the models are polynomials. To measure the generalization error objectively, we use i.i.d. test samples.
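The setup described in the abstract (polynomial models, a two-part code, and i.i.d. test samples) can be illustrated with a minimal Python sketch. It is not the thesis's implementation and has no connection to the Statistical Data Viewer. The target function `sin(3x)`, the noise level, the sample sizes, and all function names are hypothetical choices made for illustration. The code length is a crude BIC-style approximation of a two-part code, assuming Gaussian noise and dropping constant terms, so only differences between degrees are meaningful.

```python
import math
import random


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients ordered from the constant term upward.
    """
    m = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def evaluate(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))


def mean_squared_error(coeffs, xs, ys):
    return sum((evaluate(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def two_part_code_length(coeffs, xs, ys):
    """Approximate code length in bits, up to an additive constant.

    Assumption: (k/2) * log2(n) bits for k parameters, plus
    (n/2) * log2(MSE) bits for the data given the model.
    """
    n = len(xs)
    mse = max(mean_squared_error(coeffs, xs, ys), 1e-12)  # avoid log(0)
    model_bits = 0.5 * len(coeffs) * math.log2(n)
    data_bits = 0.5 * n * math.log2(mse)
    return model_bits + data_bits


def sample(n, rng, noise=0.2):
    """Draw n i.i.d. points from a hypothetical noisy target function."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [math.sin(3.0 * x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


rng = random.Random(0)
train_x, train_y = sample(40, rng)
test_x, test_y = sample(1000, rng)  # independent sample from the same source

max_degree = 6
results = []
for degree in range(max_degree + 1):
    coeffs = fit_polynomial(train_x, train_y, degree)
    bits = two_part_code_length(coeffs, train_x, train_y)
    test_error = mean_squared_error(coeffs, test_x, test_y)
    results.append((degree, bits, test_error))
    print(f"degree={degree}  code length={bits:8.2f}  test MSE={test_error:.4f}")

best_degree = min(results, key=lambda r: r[1])[0]
print("degree with the shortest approximate code:", best_degree)
```

Comparing the degree with the shortest approximate code against the degree with the lowest test error gives a rough picture of the question the abstract raises: whether the model preferred by a two-part code also generalizes well.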

Item Type: Thesis (Master's Thesis / Essay)
Degree programme: Artificial Intelligence
Thesis type: Master's Thesis / Essay
Language: English
Date Deposited: 15 Feb 2018 07:29
Last Modified: 15 Feb 2018 07:29
