Comparing graphical models with pseudo-likelihood methods and their Bayesian counterparts on gene regulatory networks.

Hekkelman, G. (2017) Comparing graphical models with pseudo-likelihood methods and their Bayesian counterparts on gene regulatory networks. Master's Thesis / Essay, Mathematics.

Full text: thesis.pdf (Published Version)

Abstract

In biology, a set of regulatory genes can have different interactions with each other. By measuring gene expression data, it becomes possible to infer a network from it. Such a network is called a gene regulatory network. In this thesis, the principles of pseudo-likelihood methods and graphical models, and the relation between them, are discussed and applied to graphical data. For both methods, different models on homogeneous and mixture data are used. An algorithm to estimate the different components of mixture data is the expectation-maximization (EM) algorithm. Its properties and the numerical difficulties that may arise are discussed. Simulation studies are also performed for these models on homogeneous and mixture data, where the mixture data consists of two components. Here, the simulated mixture data has components whose parameters are similar. For this data, the methods are used to reconstruct an underlying network. Simulations of this type usually take days, so only the case is considered in which the different components cannot be clearly distinguished by an exploratory analysis. Synthetic biology allows for new constructions of a gene regulatory network to seed new functions within a cell. Combining the inferences with the new constructions yields the problem of network reconstruction. An example of this is the yeast synthetic network data set (Cantone et al., 2009). For this data set, both aforementioned methods, which are frequentist statistical methods, are compared with their Bayesian counterparts. Partial F-tests and t-tests are used to test whether the data is homogeneous or mixture distributed. The results show that, even if the data has a mixture distribution, the homogeneous model is often able to explain the behavior of the underlying network. All R code is available at https://github.com/gerard1911/msc-thesis.
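
As a rough illustration of the EM algorithm mentioned in the abstract, the following is a minimal sketch for a two-component univariate Gaussian mixture. It is not taken from the thesis (whose code is in R) and does not reproduce its models: the function names `normal_pdf` and `em_two_gaussians`, the initialisation by splitting the sorted data, and the `min_sd` floor are all assumptions made for this example. The floor on the standard deviation is one simple way to guard against a numerical difficulty of the kind the abstract alludes to, namely a component collapsing onto a single observation.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_two_gaussians(data, n_iter=200, min_sd=1e-3):
    """Fit a two-component univariate Gaussian mixture with EM.

    Returns (weights, means, sds), each a list of length two.
    Hypothetical helper for illustration only.
    """
    data = sorted(data)
    n = len(data)

    # Crude initialisation (an assumption): lower and upper half of the data.
    half = n // 2
    means = [sum(data[:half]) / half, sum(data[half:]) / (n - half)]
    overall_mean = sum(data) / n
    overall_sd = math.sqrt(sum((x - overall_mean) ** 2 for x in data) / n)
    sds = [overall_sd, overall_sd]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component 0.
        resp = []
        for x in data:
            p0 = weights[0] * normal_pdf(x, means[0], sds[0])
            p1 = weights[1] * normal_pdf(x, means[1], sds[1])
            total = p0 + p1
            resp.append(p0 / total if total > 0.0 else 0.5)

        # M-step: responsibility-weighted updates of each component.
        for k in (0, 1):
            r = resp if k == 0 else [1.0 - v for v in resp]
            nk = sum(r)
            if nk < 1e-12:
                continue  # component is empty; keep its previous parameters
            weights[k] = nk / n
            means[k] = sum(ri * x for ri, x in zip(r, data)) / nk
            var = sum(ri * (x - means[k]) ** 2 for ri, x in zip(r, data)) / nk
            # Floor the sd so a component cannot collapse onto one point.
            sds[k] = max(math.sqrt(var), min_sd)

    return weights, means, sds
```

On well-separated components this converges quickly. When the component parameters are similar, as in the simulated data described above, the responsibilities stay close to 0.5 and the estimates can be slow to settle and sensitive to the starting values.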

Item Type:	Thesis (Master's Thesis / Essay)
Degree programme:	Mathematics
Thesis type:	Master's Thesis / Essay
Language:	English
Date Deposited:	15 Feb 2018 08:26
Last Modified:	15 Feb 2018 08:26
URI:	https://fse.studenttheses.ub.rug.nl/id/eprint/14918
