Gankema, T. (2013) Mining of words from undersegmented word images using holistic matching and tree-based search. Bachelor's Thesis, Artificial Intelligence.
Abstract
To recognize handwritten text, the text must first be segmented. Because segmentation takes place without any recognition, errors are unavoidable, and some of the resulting images contain multiple words. Such images do not contribute to better word models, so it is desirable to split them. In this study we looked at undersegmented word images that had already been transcribed. With a number of basic constraints it was possible to 'mine' new word instances in cases where one of the words in a multiple-word image did not yet have a model. This was achieved by building a segmentation graph containing all possible combinations of connected components that could lead to the pattern of the undersegmented image. A graph search was then used to find the most likely sequence of word zones in the undersegmented image. The study focused both on developing this method and on creating a heuristic for handling new words.
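The search described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the connected components are ordered left to right, that each word zone is a contiguous run of components, and that the known transcription fixes the number and order of words. The function `best_word_zones`, the `score` callback, and the fixed `unknown_score` penalty for words without a model are all hypothetical names chosen for illustration. The recursion walks an implicit segmentation graph whose nodes are (component index, word index) pairs and keeps the highest-scoring path.

```python
from functools import lru_cache
from typing import Callable, List, Optional, Sequence, Tuple

Zone = Tuple[int, int]  # half-open range [start, end) of component indices


def best_word_zones(
    n_components: int,
    words: Sequence[str],
    score: Callable[[str, int, int], Optional[float]],
    unknown_score: float = -5.0,
) -> Tuple[float, List[Zone]]:
    """Split components 0..n_components-1 into len(words) contiguous zones.

    score(word, start, end) returns a log-likelihood-style value for matching
    `word` to components [start, end), or None when the word has no model.
    Words without a model receive the fixed `unknown_score` (an assumed
    stand-in for a new-word heuristic).
    """
    k = len(words)
    if k == 0 or k > n_components:
        raise ValueError("need between 1 and n_components words")

    @lru_cache(maxsize=None)
    def search(start: int, word_idx: int) -> Tuple[float, Tuple[Zone, ...]]:
        # A path is valid only if every word is placed and every component used.
        if word_idx == k:
            return (0.0, ()) if start == n_components else (float("-inf"), ())
        best: Tuple[float, Tuple[Zone, ...]] = (float("-inf"), ())
        remaining = k - word_idx - 1  # words still to place after this one
        # Leave at least one component for each remaining word.
        for end in range(start + 1, n_components - remaining + 1):
            s = score(words[word_idx], start, end)
            if s is None:
                s = unknown_score
            tail_score, tail = search(end, word_idx + 1)
            total = s + tail_score
            if total > best[0]:
                best = (total, ((start, end),) + tail)
        return best

    total, zones = search(0, 0)
    return total, list(zones)


def toy_score(word: str, start: int, end: int) -> Optional[float]:
    # Toy model: "the" is known and fits best on two components;
    # every other word has no model yet.
    if word == "the":
        return -abs((end - start) - 2)
    return None


total, zones = best_word_zones(5, ["the", "cat"], toy_score)
```

In this toy run the known word anchors the split, and the remaining components are assigned to the unmodelled word, which is the situation in which a new word instance could be mined.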
Item Type: | Thesis (Bachelor's Thesis) |
---|---|
Degree programme: | Artificial Intelligence |
Thesis type: | Bachelor's Thesis |
Language: | English |
Date Deposited: | 15 Feb 2018 07:52 |
Last Modified: | 15 Feb 2018 07:52 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/10914 |