Predicting online article popularity based on text-generated features

Giagkoulas, Panagiotis (2020) Predicting online article popularity based on text-generated features. Master's Thesis / Essay, Artificial Intelligence.

Abstract

Predicting the popularity of online content is valuable to everyone involved, from content creators and editors to technical and marketing personnel. Existing methods either rely on the first indicators of popularity available right after publication or rely solely on the content itself to make a prediction well before publication. The latter approach is called cold-start prediction, and it is the focus of this thesis. In terms of content, we focus on online news articles. We employ embedding techniques that have not yet been investigated thoroughly in cold-start popularity prediction to encode the main text of the articles, and use these encodings to train simple predictive models for both binary classification and regression tasks. Our main aim is to develop a proof of concept for the suitability of the chosen embedding methods for this task. To make the proof of concept reliable and informative, we experiment with three encoding methods for representing the article texts: tf-idf indexes, Word2Vector averaged word embeddings and Document2Vector document embeddings. As predictive models, we experiment with Logistic Regression and Linear Regression, and with Support Vector Machines for classification (SVCs) and regression (SVRs). Our system shows promise in the binary classification task, while it performs poorly on the regression task. The best performing models are Document2Vector for encoding
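
As a rough illustration of the encoding step described in the abstract, the sketch below computes tf-idf vectors and averaged word embeddings for a few toy documents using only the Python standard library. It is not the thesis's implementation: the tokenizer, the smoothed idf formula, the example sentences, the two-dimensional `toy_embeddings` table, and all function names are assumptions made for illustration. In practice the word vectors would come from a trained embedding model, and the resulting document vectors would be passed to a classifier or regressor.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (vocabulary, tf-idf vectors), with one vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed idf (an assumed variant), so every term keeps a positive weight.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


def average_embedding(text, embeddings, dim):
    """Average the vectors of known tokens; return zeros if none are known."""
    known = [embeddings[tok] for tok in tokenize(text) if tok in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Toy data, invented for illustration only.
docs = [
    "markets rally as rates fall",
    "local team wins final",
    "rates rise again",
]
vocab, vectors = tfidf_vectors(docs)

# Hypothetical 2-dimensional word vectors standing in for a trained model.
toy_embeddings = {"rates": [1.0, 0.0], "fall": [0.0, 1.0], "team": [0.5, 0.5]}
doc_vec = average_embedding(docs[0], toy_embeddings, dim=2)
```

Each row of `vectors` and each `doc_vec` is a fixed-length representation of one article, which is the form that models such as Logistic Regression or an SVM expect as input. Terms that appear in fewer documents receive a higher tf-idf weight than terms shared across documents.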

Item Type: Thesis (Master's Thesis / Essay)
Supervisor name: Wiering, M.A.
Degree programme: Artificial Intelligence
Thesis type: Master's Thesis / Essay
Language: English
Date Deposited: 23 Jan 2020 11:39
Last Modified: 23 Jan 2020 11:39
URI: https://fse.studenttheses.ub.rug.nl/id/eprint/21436
