DateFinder: detecting date regions on handwritten document images based on positional expectancy

Zhenwei, Shi (2016) DateFinder: detecting date regions on handwritten document images based on positional expectancy. Master's Thesis / Essay, Artificial Intelligence.

Abstract

Optical Character Recognition (OCR) technology is used for many document types, such as cheques, passports, bank statements and receipts, and there is growing interest in modelling the occurrence and location of visual items. However, this source of information receives much less attention in general OCR. For instance, there are many specific visual items (e.g., dates, writer signatures, calligraphy, author markings, schematic drawings, glyphs and even graffiti) that can be used to explain the underlying meaning and origin of documents. Among these visual items, dates play a very important role in many documents (e.g., bank cheques, letters, postal mail, bills and diaries), where they provide time-related clues for readers. The date is also central to many administrative applications such as document indexing, translation and retrieval. In this thesis, we propose a method called DateFinder for detecting date regions on handwritten document images based on a four-step processing sequence. Firstly, we perform pre-processing operations on the original scanned images to extract candidate date blocks. Secondly, a positional expectancy model for ‘date’ text blocks measures how similar an unknown region is to a date region based on its position. Thirdly, feature representation and classification techniques are used to extract features from an extracted block and compute the probability that this block is a date region. Finally, we combine the scores of the positional expectancy model and the classifier to determine whether an extracted block is a date region. In the experiments, we obtained encouraging results for detecting date regions in our dataset. However, there are still ample opportunities to improve the proposed DateFinder method, which can be considered for future work.
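
The final step of the sequence, combining a position-based score with a classifier probability, can be illustrated with a minimal sketch. The abstract does not state the form of the positional expectancy model or the combination rule, so everything here is an assumption made for illustration: the Gaussian prior, its `mean` and `sigma` values, the weighted-average fusion, the `threshold`, and all names (`Block`, `positional_score`, `combined_score`, `is_date_region`). It should not be read as the thesis's implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class Block:
    """A candidate text block; coordinates are normalised to [0, 1] per page."""
    cx: float        # horizontal centre of the block
    cy: float        # vertical centre of the block (0 = top of page)
    clf_prob: float  # classifier's probability that the block is a date


def positional_score(block, mean=(0.8, 0.1), sigma=(0.15, 0.1)):
    """Hypothetical positional expectancy: an unnormalised 2D Gaussian.

    Returns 1.0 when the block centre sits at `mean` (assumed here to be
    the upper right of the page) and decays towards 0 further away.
    """
    dx = (block.cx - mean[0]) / sigma[0]
    dy = (block.cy - mean[1]) / sigma[1]
    return math.exp(-0.5 * (dx * dx + dy * dy))


def combined_score(block, weight=0.5):
    """Assumed fusion rule: weighted average of position and classifier scores."""
    return weight * positional_score(block) + (1.0 - weight) * block.clf_prob


def is_date_region(block, threshold=0.5):
    """Accept the block as a date region if the fused score reaches the threshold."""
    return combined_score(block) >= threshold


if __name__ == "__main__":
    top_right = Block(cx=0.8, cy=0.1, clf_prob=0.9)
    bottom_left = Block(cx=0.2, cy=0.9, clf_prob=0.9)
    print(is_date_region(top_right), is_date_region(bottom_left))
```

With these assumed parameters, two blocks with the same classifier probability can receive different decisions purely because of where they sit on the page, which is the intuition behind using position as a second source of evidence.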

Item Type: Thesis (Master's Thesis / Essay)
Supervisor name: xx, xx
Degree programme: Artificial Intelligence
Thesis type: Master's Thesis / Essay
Language: English
Date Deposited: 15 Feb 2018 08:12
Last Modified: 02 May 2019 09:35
URI: https://fse.studenttheses.ub.rug.nl/id/eprint/13900