Boneschansker, Maarten (2022) RiPP precursor prediction using machine learning on conservation patterns. Master's Research Project 2, Biology.
Abstract
Genome mining holds great promise for a new 'Golden Age' in natural product discovery. Given the developments in bioinformatics and the current torrent of genome data, the future looks bright for the field. Such a revival is also necessary: a decline in the discovery of new compounds, caused in part by a lack of interest in natural product discovery, coincides with a steep rise in antimicrobial resistance, with as many as 50,000,000 deaths a year projected worldwide in 2050. Many classes of natural products can be mined from genomes directly. One class particularly well suited to this is the ribosomally synthesized and post-translationally modified peptides (RiPPs), and new tools based on machine learning methods have shown their value in predicting RiPPs. One such tool is decRiPPter, which predicts RiPP biosynthetic gene clusters using an SVM classifier. However, as an exploratory tool decRiPPter prizes novelty over accuracy, and a large number of false positives is thus expected. RiPP precursor peptides have a unique leader-core structure which has been shown to be differentially conserved. Presented here is a bioinformatic pipeline that predicts RiPPs in genomic data using a random forest model trained on conservation patterns. The presented model achieves high accuracy, most notably with a very low false positive rate in held-out validation. Cross-validation experiments show that the model is also able to distinguish negative from positive training data well.
Item Type: | Thesis (Master's Research Project 2) |
---|---|
Supervisor name: | Doorn, G.S. van |
Degree programme: | Biology |
Thesis type: | Master's Research Project 2 |
Language: | English |
Date Deposited: | 30 Nov 2022 08:46 |
Last Modified: | 30 Nov 2022 08:46 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/29005 |