Giagkoulas, Panagiotis (2020) Predicting online article popularity based on text-generated features. Master's Thesis / Essay, Artificial Intelligence.
Abstract
Predicting the popularity of online content can be of great value to all parties involved, from content creators and editors to technical and marketing personnel. Many methods have been developed; they either rely on the first indicators of popularity available right after publication, or rely solely on the content itself to make a prediction well before publication. The latter approach is called cold-start prediction, and it is the focus of this thesis. In terms of content, we focus on online news articles. We employ embedding techniques, which have not yet been investigated thoroughly in cold-start popularity prediction, to encode the main text of the articles. We use these encodings to train simple predictive models for both binary classification and regression tasks. Our main aim is to develop a proof of concept for the suitability of our chosen embedding methods for this task. To make this proof of concept reliable and informative, we experiment with three methods for representing the article texts: tf-idf indexes, Word2Vector averaged word embeddings and Document2Vector document embeddings. As predictive models, we experiment with Logistic Regression, Support Vector Machines for classification (SVCs) and regression (SVRs), and Linear Regression. Our system shows promise in the binary classification task but performs poorly on the regression task. The best performing models are Document2Vector for encoding
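To make the cold-start setup in the abstract concrete, the following is a minimal sketch of one of the encoder and classifier pairings it names: tf-idf features feeding a logistic regression for binary popularity classification. It is not the thesis's implementation. The whitespace tokenisation, the smoothed idf formula, the learning rate, the epoch count, and the four-sentence corpus with its labels are all assumptions made up for illustration. The Word2Vector and Document2Vector encodings are not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, tf-idf vectors) for whitespace-tokenised documents."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenised for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenised for tok in set(toks))
    # Smoothed idf (an assumed variant), so no term gets a zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenised:
        counts = Counter(toks)
        # Term frequency is the count normalised by document length.
        vectors.append([counts[t] / len(toks) * idf[t] for t in vocab])
    return vocab, vectors


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(x, w, b):
    """Predicted probability that the article is 'popular' (label 1)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logreg(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = score(xi, w, b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    return [1 if score(xi, w, b) >= 0.5 else 0 for xi in X]


# Hypothetical toy corpus: 1 = popular, 0 = not popular (labels are made up).
docs = [
    "election results shock the nation",
    "nation votes in historic election",
    "local council updates parking rules",
    "parking rules change for council staff",
]
labels = [1, 1, 0, 0]

vocab, X = tfidf_vectors(docs)
w, b = train_logreg(X, labels)
preds = predict(X, w, b)
print(preds)
```

Only the article text enters the model, which is what makes this a cold-start prediction: no post-publication signals such as early view counts are used. A realistic experiment would evaluate on held-out articles rather than on the training set, as this toy example does.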
Item Type: | Thesis (Master's Thesis / Essay) |
---|---|
Supervisor name: | Wiering, M.A. |
Degree programme: | Artificial Intelligence |
Thesis type: | Master's Thesis / Essay |
Language: | English |
Date Deposited: | 23 Jan 2020 11:39 |
Last Modified: | 23 Jan 2020 11:39 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/21436 |