Order parameter based study of Hidden unit specialization in Neural Networks for ReLU and Sigmoidal activation functions.

Bari Tamboli, Neha Rajendra (2020) Order parameter based study of Hidden unit specialization in Neural Networks for ReLU and Sigmoidal activation functions. Master's Internship Report, Computing Science.

Abstract

In this work we study the generalization properties of a shallow neural network as a function of the size of the dataset. We compare our results with theoretical predictions based on equilibrium statistical mechanics and the central limit theorem. We consider a student-teacher training scenario in which we train the student network and observe the overlap between its weights and the weights of the teacher network. To understand the learning behaviour of the student network, we use methods inspired by statistical mechanics: we track order parameters to detect a phase transition in the network. These order parameters let us follow the learning behaviour without tracking the individual weights, i.e. they give us a macroscopic view of learning. The activation functions used in this experiment are ReLU and the shifted sigmoid. We analyze systems of two different sizes and observe differences in the student's learning behaviour with respect to hidden unit specialization for these activation functions. For both sigmoid and ReLU, specialized network configurations exist in which the plateau states, which represent poor generalization, are overcome as the size of the dataset increases. With the ReLU activation function, however, larger datasets lead to either specialization or anti-specialization of the hidden units, depending on the initial conditions. The anti-specialized state generalizes well, but worse than the specialized state.
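As a rough illustration of the order parameters mentioned in the abstract, the following sketch computes student-teacher overlaps R and student-student overlaps Q from weight matrices. It is not taken from the thesis: the function name `order_parameters`, the 1/N normalization, the teacher norm convention and the toy sizes are all assumptions chosen for illustration, and the conventions used in the report may differ.

```python
import numpy as np


def order_parameters(student_w, teacher_w):
    """Return (R, Q) for weight matrices of shape (units, N).

    R[k, m] = w_k . B_m / N  (student-teacher overlap)
    Q[k, l] = w_k . w_l / N  (student-student overlap)
    The 1/N normalization is an assumed convention.
    """
    n = student_w.shape[1]
    R = student_w @ teacher_w.T / n
    Q = student_w @ student_w.T / n
    return R, Q


rng = np.random.default_rng(0)
N, K = 50, 2  # hypothetical input dimension and number of hidden units

# Random teacher vectors, rescaled so that |B_m|^2 = N (assumed convention).
teacher_w = rng.standard_normal((K, N))
teacher_w *= np.sqrt(N) / np.linalg.norm(teacher_w, axis=1, keepdims=True)

# A toy "specialized" student: each hidden unit sits close to one teacher unit.
student_w = teacher_w + 0.1 * rng.standard_normal((K, N))

R, Q = order_parameters(student_w, teacher_w)
print("R =\n", R)
print("Q =\n", Q)
```

In this toy setup each student unit overlaps strongly with exactly one teacher unit, so the diagonal of R dominates. An unspecialized (plateau) student would instead show roughly equal overlaps with all teacher units.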

Item Type: Thesis (Master's Internship Report)
Supervisor name: Biehl, M. and Straat, M.J.C.
Degree programme: Computing Science
Thesis type: Master's Internship Report
Language: English
Date Deposited: 07 Dec 2020 11:21
Last Modified: 07 Dec 2020 11:21
URI: https://fse.studenttheses.ub.rug.nl/id/eprint/23669
