Pre-Processing of LC-MS data in Proteogenomics

Kragh Kristensen, Sebastian (2019) Pre-Processing of LC-MS data in Proteogenomics. Master's Thesis / Essay, Molecular Biology and Biotechnology (2016-2019).

Abstract

The omics fields, in particular genomics, transcriptomics and proteomics, are a fundamental source of knowledge with countless applications in the life sciences. These high-throughput strategies have converged into a new field, proteogenomics. Proteogenomics is still at a relatively early stage compared to other omics technologies, and new approaches incorporating proteogenomics strategies are continuously being developed. By generating customized protein databases instead of relying on canonical public databases, researchers are successfully identifying novel proteins and protein-coding loci, thereby refining gene models and exposing potential drug targets. However, data generation and analysis must still be optimized before proteogenomics can contribute to standardized personalized medicine. This literature study reviews current approaches for overcoming challenges associated with proteogenomics. Key topics include: (I) Customized protein database generation, from nucleotide sequencing through the analysis pipeline that yields predicted proteins. (II) Database search and false discovery rate (FDR). The FDR depends on the peptide score cutoff threshold specified by the researcher and can be estimated using the target/decoy strategy. (III) Approaches for improving proteome coverage. When analyzing highly complex protein samples, full proteome coverage is difficult to reach, since a fraction of proteins may be masked. Several strategies can increase the number of protein identifications. Multidimensional chromatography reduces sample complexity through extensive peptide separation prior to MS. Multiple digestion methods can produce complementary data containing peptides that would not be seen with a single digestion method, such as tryptic digestion. Continued research in proteogenomics will extend the use of these tools, and their potential will grow as technology advances.
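
As an illustration of topic (II), the relationship between the score cutoff and the FDR under the target/decoy strategy can be sketched as follows. This is a minimal sketch and not taken from the thesis: it assumes the simple estimator FDR = decoy hits / target hits above the cutoff (other estimators exist), and the function names and example scores are hypothetical.

```python
# Hypothetical sketch of target/decoy FDR estimation (not the thesis's code).
# Assumption: each peptide-spectrum match (PSM) is a (score, is_decoy) pair,
# and FDR is estimated as decoy hits / target hits at or above the cutoff.

def estimate_fdr(psms, cutoff):
    """Estimate the FDR among PSMs scoring at or above `cutoff`."""
    targets = sum(1 for score, is_decoy in psms if score >= cutoff and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= cutoff and is_decoy)
    if targets == 0:
        return 0.0  # no target hits accepted, so nothing can be a false discovery
    return decoys / targets


def cutoff_for_fdr(psms, max_fdr):
    """Return the lowest observed score whose estimated FDR is <= max_fdr.

    Returns None if no cutoff satisfies the limit.
    """
    best = None
    # Walk candidate cutoffs from strictest to most permissive; the last
    # one that still satisfies the limit is the lowest acceptable cutoff.
    for score in sorted({s for s, _ in psms}, reverse=True):
        if estimate_fdr(psms, score) <= max_fdr:
            best = score
    return best


# Made-up example scores: True marks a hit against the decoy database.
example_psms = [
    (9.1, False), (8.7, False), (8.2, False), (7.9, True), (7.5, False),
    (6.8, False), (6.1, True), (5.4, False), (4.9, True), (4.2, True),
]

if __name__ == "__main__":
    for cutoff in (8.0, 7.0, 4.0):
        print(cutoff, round(estimate_fdr(example_psms, cutoff), 3))
    print("cutoff for FDR <= 0.25:", cutoff_for_fdr(example_psms, 0.25))
```

Lowering the cutoff accepts more identifications but also more decoy hits, which is why the estimated FDR generally rises as the threshold is relaxed.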

Item Type:	Thesis (Master's Thesis / Essay)
Supervisor name:	Horvatovich, P.L. and Kok, J.
Degree programme:	Molecular Biology and Biotechnology (2016-2019)
Thesis type:	Master's Thesis / Essay
Language:	English
Date Deposited:	20 Aug 2019
Last Modified:	20 Aug 2019 10:49
URI:	https://fse.studenttheses.ub.rug.nl/id/eprint/20716
