Oostwal, Elisa (2019) Learning of single-layer neural networks: ReLU vs. sigmoidal activation. Master's Internship Report, Computing Science.
Text: Learning of single-layer neural networks - ReLU vs. sigmoidal activation.pdf (438 kB)
Text: toestemming.pdf (179 kB, restricted to registered users only)
Abstract
Due to great advancements in hardware and the availability of large amounts of data, the topic of artificial neural networks has regained the interest of researchers. Conventionally, sigmoidal activation is used in the hidden units of these networks. However, so-called rectified linear units (ReLU) have been proposed as a better alternative, owing to the activation function's computational simplicity and the faster training it allows compared to sigmoidal activation. These claims are, however, mainly based on empirical evidence, and thus a theoretical approach is needed to understand the fundamental differences between sigmoidal and ReLU activation, if there are any at all. In this study we have investigated why ReLU might perform better than sigmoidal units by examining their fundamental differences in the context of off-line learning, using a statistical physics approach. We have restricted ourselves to shallow networks with a single hidden layer as a first model system. We found that, while networks with sigmoidal activation undergo a first-order phase transition for three hidden units, networks with ReLU activation still exhibit a second-order phase transition in this case, which is beneficial for the learning performance. This provides theoretical evidence that ReLU indeed performs better than sigmoidal activation, at least for this small number of hidden units.
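For readers unfamiliar with the model system, the following is a minimal sketch of the kind of single-hidden-layer ("shallow") network whose two activation choices are compared in the report. The soft-committee-style readout (unweighted sum over hidden units), the erf form of the sigmoidal activation, and the 1/sqrt(N) scaling of the pre-activations are assumptions chosen here for illustration; they are not specified in the abstract itself.

```python
import numpy as np
from scipy.special import erf


def relu(x):
    """Rectified linear unit activation."""
    return np.maximum(0.0, x)


def sigmoidal(x):
    """erf-type sigmoidal activation (assumption: the report may use a different sigmoid)."""
    return erf(x / np.sqrt(2.0))


def shallow_network_output(xi, W, activation):
    """Output of a single-hidden-layer network with K hidden units.

    xi : input vector of dimension N
    W  : (K, N) matrix of input-to-hidden weights
    activation : hidden-unit activation function (relu or sigmoidal)
    """
    # Pre-activations ("local fields") of the K hidden units; the 1/sqrt(N)
    # scaling is a common convention in this literature, assumed here.
    local_fields = W @ xi / np.sqrt(xi.shape[0])
    # Soft-committee-style readout: unweighted sum over the hidden units
    # (again an assumption made for illustration).
    return np.sum(activation(local_fields))


# Example with K = 3 hidden units, the case highlighted in the abstract.
rng = np.random.default_rng(0)
N, K = 100, 3
W = rng.standard_normal((K, N))
xi = rng.standard_normal(N)

print("ReLU output:     ", shallow_network_output(xi, W, relu))
print("Sigmoidal output:", shallow_network_output(xi, W, sigmoidal))
```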
| Item Type: | Thesis (Master's Internship Report) |
|---|---|
| Supervisor name: | Biehl, M. and Straat, M.J.C. |
| Degree programme: | Computing Science |
| Thesis type: | Master's Internship Report |
| Language: | English |
| Date Deposited: | 27 Nov 2019 |
| Last Modified: | 29 Nov 2019 11:51 |
| URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/21258 |