Shahkhan, Mohammad Siam (2023) The Impact of Class-Based Noise on Bayesian and Frequentist Logistic Regression. Bachelor's Thesis, Mathematics.
Abstract
Real-world data often contain noise, so in classification tasks it is important that the algorithms used are robust to it. The aim of this research is to compare the performance of frequentist and Bayesian methods on this fundamental issue: handling noisy data sets. For the frequentist approach, we consider stepwise GLM model selection using the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). For the Bayesian approach, Metropolis-Hastings MCMC and Reversible Jump MCMC (RJMCMC) are evaluated. The data set used is a benchmark data set with binary class labels, the Breast Cancer Wisconsin Diagnostic data set. The algorithms are trained and tested on this data set, and their accuracy is compared under increasing levels of class-based noise in the training data. In addition, the classification threshold is varied to observe its effect. The study shows that under increasing class-based noise, RJMCMC achieves the best accuracy. A significant drawback of RJMCMC is its computational cost compared with the stepwise GLM AIC and BIC procedures. Because the current study focused on specific noise levels and a single data set, future work could explore different noise structures.
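The experimental setup described in the abstract can be illustrated with a minimal sketch. The abstract does not define "class-based noise" precisely, so the code below assumes it means flipping a chosen fraction of the training labels within one class of a binary problem. The function names `flip_labels_in_class`, `classify`, and `accuracy` are hypothetical and are not taken from the thesis.

```python
import random


def flip_labels_in_class(labels, target_class, noise_rate, seed=0):
    """Flip a fraction `noise_rate` of the 0/1 labels equal to `target_class`.

    Assumption: this is one possible reading of "class-based noise";
    the thesis may define the noise mechanism differently.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must lie in [0, 1]")
    rng = random.Random(seed)  # fixed seed for reproducibility
    # Indices of the observations that belong to the class being corrupted.
    candidates = [i for i, y in enumerate(labels) if y == target_class]
    n_flip = round(noise_rate * len(candidates))
    flipped = set(rng.sample(candidates, n_flip))
    return [1 - y if i in flipped else y for i, y in enumerate(labels)]


def classify(probabilities, threshold=0.5):
    """Turn predicted probabilities into 0/1 labels at a given threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    clean = [0] * 6 + [1] * 4
    # Corrupt half of the class-1 training labels.
    noisy = flip_labels_in_class(clean, target_class=1, noise_rate=0.5)
    print("noisy training labels:", noisy)
    # Hypothetical predicted probabilities from any fitted classifier.
    probs = [0.1, 0.4, 0.55, 0.8]
    for threshold in (0.3, 0.5, 0.7):
        print(threshold, classify(probs, threshold))
```

In an experiment of the kind the abstract describes, the noisy labels would be used only for training, while accuracy would be measured against clean test labels at each noise level and threshold.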
Item Type: | Thesis (Bachelor's Thesis) |
---|---|
Supervisor name: | Grzegorczyk, M.A. and Krijnen, W.P. |
Degree programme: | Mathematics |
Thesis type: | Bachelor's Thesis |
Language: | English |
Date Deposited: | 28 Aug 2023 07:57 |
Last Modified: | 28 Aug 2023 07:58 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/31242 |