Learning of single-layer neural networks: ReLU vs. sigmoidal activation

Oostwal, Elisa (2019) Learning of single-layer neural networks: ReLU vs. sigmoidal activation. Master's Internship Report, Computing Science.

Full text: Learning of single-layer neural networks - ReLU vs. sigmoidal activation.pdf (438kB)
Permission form: toestemming.pdf (179kB), restricted to registered users

Abstract

Due to great advances in hardware and the availability of large amounts of data, artificial neural networks have regained the interest of scientists. Conventionally, sigmoidal activation is used in the hidden units of these networks. However, so-called rectified linear units (ReLU) have been proposed as a better alternative, owing to the activation function's computational simplicity and the faster training it allows compared to sigmoidal activation. These claims are mainly based on empirical evidence, though, so a theoretical approach is needed to understand the fundamental differences between sigmoidal and ReLU activation, if there are any at all. In this study we investigate why ReLU might perform better than sigmoidal units by examining their fundamental differences in the context of off-line learning, using a statistical physics approach. We restrict ourselves to shallow networks with a single hidden layer as a first model system. We find that, while sigmoidal networks undergo a first-order phase transition for three hidden units, ReLU networks still experience a second-order phase transition in this case, which is beneficial for the performance. This provides theoretical evidence that ReLU indeed performs better than sigmoidal activation, at least for this small number of hidden units.
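
As a concrete illustration (not taken from the report itself), the following is a minimal Python sketch of the single-hidden-layer model class being compared: a network with K hidden units and either ReLU or sigmoidal (error-function) activation. The 1/sqrt(N) scaling of the preactivations and the fixed unit hidden-to-output weights are assumptions made for this sketch; the report may use a different normalization or setup.

    import numpy as np
    from scipy.special import erf

    def single_layer_output(xi, W, activation="relu"):
        """Output of a shallow network with one hidden layer.

        xi : input vector of dimension N
        W  : (K, N) matrix of input-to-hidden weights
        Hidden-to-output weights are fixed to 1 (an assumption of this sketch).
        """
        h = W @ xi / np.sqrt(xi.shape[0])   # hidden-unit preactivations, 1/sqrt(N) scaling assumed
        if activation == "relu":
            g = np.maximum(0.0, h)          # rectified linear units
        else:
            g = erf(h / np.sqrt(2.0))       # sigmoidal (error-function) activation
        return g.sum()

    # Example: compare the two activations on a random input.
    rng = np.random.default_rng(0)
    N, K = 100, 3                           # K = 3 hidden units, the case discussed in the abstract
    W = rng.standard_normal((K, N))
    xi = rng.standard_normal(N)
    print(single_layer_output(xi, W, "relu"))
    print(single_layer_output(xi, W, "sigmoidal"))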

Item Type: Thesis (Master's Internship Report)
Supervisor name: Biehl, M. and Straat, M.J.C.
Degree programme: Computing Science
Thesis type: Master's Internship Report
Language: English
Date Deposited: 27 Nov 2019
Last Modified: 29 Nov 2019 11:51
URI: https://fse.studenttheses.ub.rug.nl/id/eprint/21258
