Exploring Improvements for Gradient Descent Optimization Algorithms in Deep Learning

Elderman, Richard (2019) Exploring Improvements for Gradient Descent Optimization Algorithms in Deep Learning. Master's Thesis / Essay, Artificial Intelligence.

Files:
- Richard Elderman Master Thesis Final V4.pdf (15MB)
- toestemming.pdf (120kB, restricted to registered users only)
Abstract

The choice of optimization algorithm for gradient descent in a deep learning system can have a large influence on the learning performance of the classifier. Two families of optimization algorithms are currently in common use: pure stochastic gradient descent (SGD) with some form of momentum, and algorithms with adaptive learning rates, of which Adam is the best known. This thesis investigates possible improvements to these state-of-the-art gradient descent optimization algorithms for deep learning. Five new methods are proposed: one incorporates a braking mechanism into the SGD with momentum algorithm, while the other four use alternative gradient history collection (GHC) methods for the adaptive learning rate algorithms. These new techniques are compared to the state-of-the-art algorithms in 16 experiments. One experiment is dedicated to a pure optimization setting on a simple convex function, while all other experiments use logistic regression, a standard multi-layer perceptron, or a convolutional neural network as the classifier. The results show that, in general, the adaptation of the SGD with momentum algorithm performs worse than standard SGD with momentum. Two of the four alternative GHC methods for adaptive learning rate algorithms also perform worse than the standard version, but the other two perform better. These two alternative GHC methods can therefore be proposed as potential improvements to state-of-the-art algorithms such as Adam.
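For reference, the update rules of the two baseline families named in the abstract are sketched below in NumPy. This is a minimal sketch of the textbook SGD-with-momentum and Adam updates only; the thesis's proposed braking mechanism and alternative GHC methods are described in the thesis itself and are not reproduced here. Function and parameter names are illustrative choices, not taken from the thesis.

```python
import numpy as np

def sgd_momentum_step(theta, grad, velocity, lr=0.01, mu=0.9):
    """One SGD-with-momentum update: accumulate a velocity from past
    gradients, then move the parameters along it."""
    velocity = mu * velocity - lr * grad
    return theta + velocity, velocity

def adam_step(theta, grad, m, v, t, lr=0.001,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m)
    and its elementwise square (v) yield a per-parameter adaptive step.
    t is the 1-based iteration count used for bias correction."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)   # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

In Adam, the moving average of squared gradients (v) is the standard gradient history collection; the thesis's GHC variants replace this accumulation scheme with alternatives.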

Item Type: Thesis (Master's Thesis / Essay)
Supervisor name: Wiering, M.A. and Schomaker, L.R.B.
Degree programme: Artificial Intelligence
Thesis type: Master's Thesis / Essay
Language: English
Date Deposited: 11 Mar 2019
Last Modified: 13 Mar 2019 08:51
URI: https://fse.studenttheses.ub.rug.nl/id/eprint/19251
