International Conference on Information Technology and Digital Applications 2018
Digital conversion model for hand-filled forms using optical character recognition (OCR)
Computer science; Radio electronics
Adriano, J.E.M.^1; Calma, K.A.S.^1; Lopez, N.T.^1; Parado, J.A.^1; Rabago, L.W.^1; Cabardo, J.M.^1
Asia Pacific College, 3 Humabon Place, Magallanes, Makati City 1232, Philippines^1
Keywords: Character classification; Character segmentation; Convolutional neural network; Digital conversion; Image preprocessing; K nearest neighbor (KNN); Manual process; Optical character recognition (OCR)
Others: https://iopscience.iop.org/article/10.1088/1757-899X/482/1/012049/pdf
DOI: 10.1088/1757-899X/482/1/012049
Subject classification: Computer science (general)
Source: IOP
【 Abstract 】
Manual data entry, used across several industries, has a high error rate because it relies heavily on a person's ability to interpret handwritten forms. To reduce this error rate, the researchers explored the processes that make up optical character recognition (OCR) and applied them in a novel digital conversion model for hand-filled forms. The OCR process consists of four major phases, and the techniques used for each phase are as follows: Sauvola binarization for image pre-processing; blob analysis for character segmentation; the pre-trained convolutional neural networks (CNNs) GoogLeNet, AlexNet, and VGG16 for feature extraction; and Support Vector Machines (SVM), K-Nearest Neighbor (KNN), and Naïve Bayes for classification. The novel combination of CNN feature extraction with SVM character classification showed promising results, reaching up to 98.62% accuracy and a 65.31% F-score.
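The first two phases named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names `sauvola_binarize` and `find_blobs`, the window size, and the parameters `k = 0.2` and `R = 128` are assumptions chosen for illustration (the last two are commonly quoted defaults for Sauvola's threshold `T = m * (1 + k * (s / R - 1))`, where `m` and `s` are the local mean and standard deviation). Blob analysis is approximated here by 8-connected component labeling that returns one bounding box per candidate character. The CNN feature extraction and SVM classification phases are omitted because they depend on external libraries and trained models.

```python
from collections import deque
from math import sqrt


def sauvola_binarize(gray, window=3, k=0.2, R=128.0):
    """Return a 2D list of 0/1 flags, where 1 marks ink.

    A pixel counts as ink when it is darker than the local Sauvola
    threshold T = m * (1 + k * (s / R - 1)). The window, k, and R
    values are illustrative assumptions, not taken from the paper.
    """
    h, w = len(gray), len(gray[0])
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Gather the neighborhood, clipped at the image border.
            vals = [gray[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            mean = sum(vals) / len(vals)
            std = sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
            threshold = mean * (1 + k * (std / R - 1))
            out[y][x] = 1 if gray[y][x] < threshold else 0
    return out


def find_blobs(binary):
    """Find 8-connected ink regions.

    Returns bounding boxes (top, left, bottom, right), sorted left to
    right so they follow reading order along a single text line.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited ink pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return sorted(boxes, key=lambda b: b[1])


# Toy grayscale "form field": 255 is paper, 0 is ink, two separate strokes.
page = [
    [255, 255, 255, 255, 255, 255, 255, 255],
    [255,   0,   0, 255, 255, 255,   0, 255],
    [255,   0,   0, 255, 255, 255,   0, 255],
    [255,   0,   0, 255, 255, 255,   0, 255],
    [255, 255, 255, 255, 255, 255, 255, 255],
]
binary = sauvola_binarize(page)
boxes = find_blobs(binary)
print(boxes)
```

In a fuller pipeline, each bounding box would be cropped, resized to the input size the chosen CNN expects, and passed on to feature extraction and classification.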