Noord, R.I.K. (2016) A supervised approach to categorizing Dutch Twitter events. Master's Thesis / Essay, Human-Machine Communication.
Abstract
In this thesis we applied a supervised machine learning approach to automatically categorize Dutch Twitter events. One of the ten categories used is social action, which aims to predict civil unrest. Reliably detecting such events could have great practical value, since it would allow the authorities to be alerted before a (possibly violent) social action takes place. We employ the existing event set of Kunneman and Van den Bosch (2015), who used explicit future time expressions to identify events. We show that it is difficult to categorize all events automatically, since the classifications are biased towards the dominant category, public event. However, our general categorization system offers performance comparable to the best known approach in the literature, and our results suggest that it even outperforms that approach when categorizing the full event set of 93,901 events. We find that our final categorization system is very precise in its predictions for non-dominant categories, but that it does not make those predictions very often. We obtained 80% precision for detecting social action events, but a low estimated recall. Because of this low recall, and because we were limited to Twitter data, our approach was outperformed by the best known approach to predicting civil unrest. However, a follow-up approach that uses the ranking of the Bayesian probabilities increased the recall of social action events by 232%, while decreasing the precision by only 14%.
Item Type: | Thesis (Master's Thesis / Essay) |
---|---|
Degree programme: | Human-Machine Communication |
Thesis type: | Master's Thesis / Essay |
Language: | English |
Date Deposited: | 15 Feb 2018 08:11 |
Last Modified: | 15 Feb 2018 08:11 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/13784 |