References
1. Sajedi, H. "Handwriting recognition of digits, signs,
and numerical strings in Persian", Computers & Electrical
Engineering, 49, pp. 52–65 (2016).
2. Azadnia, M. "Presenting an expert system for automatic
correcting Persian texts", International Journal
of Computer Science and Network Security, 8(3), pp.
27–31 (2008).
3. Eikvil, L., Optical Character Recognition, Norsk Regnesentral,
P.B. 114 Blindern, N-0314 Oslo (Dec. 1993).
4. Singh, A., Bacchuwar, K., and Bhasin, A. "A survey of
OCR applications", International Journal of Machine
Learning and Computing, 2(3), pp. 314–318 (2012).
5. Menhaj, M.B. and Adab, M. "Simultaneous segmentation
and recognition of Farsi/Latin printed texts with
MLP", Neural Networks, 2002. IJCNN'02. Proceedings
of the 2002 International Joint Conference, 2 (2002).
6. Raymond, S. "Hybrid page layout analysis via tab-stop
detection", Document Analysis and Recognition,
ICDAR'09. 10th International Conference (2009).
7. Simon, A., Pret, J.-C., and Johnson, A.P. "A fast
algorithm for bottom-up document layout analysis",
IEEE Transactions on Pattern Analysis and Machine
Intelligence, 19(3), pp. 273–277 (1997).
8. O'Gorman, L. "The document spectrum for page
layout analysis", IEEE Transactions on Pattern Analysis
and Machine Intelligence, 15(11), pp. 1162–1173
(1993).
Z. Khosrobeigi et al./Scientia Iranica, Transactions D: Computer Science & ... 27 (2020) 3019–3033
9. Pritpal, S. and Budhiraja, S. "Feature extraction and
classification techniques in OCR systems for handwritten
Gurmukhi script-a survey", International Journal
of Engineering Research and Applications (IJERA),
1(4), pp. 1736–1739 (2011).
10. Lehal, G.S. and Singh, C. "A Gurmukhi script recognition
system", Pattern Recognition, Proceedings. 15th
International Conference, 2 (2000).
11. Zand, M., Naghsh Nilchi, A., and Monadjemi, S.A.
"Recognition-based segmentation in Persian character
recognition", Proceedings of World Academy of Science,
Engineering and Technology International Journal
of Computer, Electrical, Automation, Control and
Information Engineering, 28 (2008).
12. Khosravi, H. and Kabir, E. "A blackboard approach
towards integrated Farsi OCR system", International
Journal of Document Analysis and Recognition (IJDAR),
12(1), pp. 21–32 (2009).
13. Malik, S.A., Maqsood, M., Aadil, F., et al. "An
efficient segmentation technique for Urdu optical character
recognizer (OCR)", Advances in Information and
Communication, 70, pp. 131–141 (2019).
14. Mirzaee, M. "Text detection in images for Persian
optical character recognition", MSc Thesis, University
of Tehran, Iran (2012).
15. Ghanbari, N. "A review of research studies on the
recognition of Farsi alphabetic and numeric characters
in the last decade", Fundamental Research in Electrical
Engineering, Springer, Singapore, pp. 173–184 (2019).
16. Kameswara Rao, T., Yashwanth Chowdary, K.,
Koushik Chowdary, I., et al. "Optical character recognition
from printed text images", International Journal
of Scientific Research in Computer Science, Engineering
and Information Technology, 5, pp. 597–604
(2019).
17. "Bina Persian OCR system", ASR-Gooyesh Co.,
http://www.binaocr.com.
18. Niwa, H., Kayashima, K., and Shimeki, Y. "Postprocessing
for character recognition using keyword
information", IAPR Workshop on Machine Vision
Applications, Tokyo (1992).
19. Hong, T. "Degraded text recognition using visual and
linguistic context", Doctoral Dissertation, University
of New York, Buffalo (1996).
20. Kukich, K. "Techniques for automatically correcting
words in text", ACM Computing Surveys (CSUR),
24(4), pp. 377–439 (1992).
21. Jurafsky, D. and Martin, H.J., Speech and Language
Processing: An Introduction to Natural Language Processing,
Computational Linguistics, and Speech Recognition
(2008).
22. Mays, E., Damerau, F.J., and Mercer, L.R. "Context
based spelling correction context-sensitive spell
checking based on field association terms dictionaries",
Information Processing & Management, 27(5), pp.
517–522 (1991).
23. Beaufort, R. and Thillou, C. "A weighted finite-state
framework for correcting errors in natural scene OCR",
Document Analysis and Recognition, ICDAR 2007.
Ninth International Conference, 2, pp. 889–893 (2007).
24. Bassil, Y. and Alwani, M. "OCR post-processing
error correction algorithm using Google online spelling
suggestion", Computer Science ArXiv, 3(1), pp. 1–9
(2012).
25. Ranka, V., Patil, S., Patni, S., et al. "Automatic table
detection and retention from scanned document images
via analysis of structural information", 2017 Fourth
International Conference on Image Information Processing
(ICIIP), India (2017).
26. Jahan MAC, A. and Ragel, R. "Locating tables in
scanned documents for reconstructing and republishing",
7th International Conference on Information and
Automation for Sustainability, Sri Lanka (2014).
27. Nagata, M. "Japanese OCR error correction using
character shape similarity and statistical language
model", Proceedings of the 36th Annual Meeting of the
Association for Computational Linguistics and 17th
International Conference on Computational Linguistics,
2, pp. 922–928 (1998).
28. Afli, H., Barrault, L., and Schwenk, H. "OCR error
correction using statistical machine translation", International
Journal of Computational Linguistics and
Applications, 7(1), pp. 175–191 (2016).
29. Kesorn, K. and Phawapoothayanchai, P. "Optical
character recognition (OCR) enhancement using an
approximate string matching technique", Engineering
and Applied Science Research, 45(4), pp. 282–289
(2018).
30. Doush, A.I., Alkhateeb, F., and Gharaibeh, H.A. "A
novel Arabic OCR post-processing using rule-based
and word context techniques", International Journal
on Document Analysis and Recognition (IJDAR),
21(1-2), pp. 77–89 (2018).
31. Magdy, W. and Darwish, K. "Arabic OCR error
correction using character segment correction, language
modeling, and shallow morphology", EMNLP
'06: Proceedings of the 2006 Conference on Empirical
Methods in Natural Language Processing, pp. 408–414
(2006).
32. Ramanan, M., Ramanan, A., and Charles, E.Y.A.
"A performance comparison and post-processing error
correction technique to OCRs for printed Tamil texts",
Industrial and Information Systems (ICIIS), 2014 9th
International Conference, India (2014).
33. Kolak, O. and Resnik, P. "OCR post-processing for low
density languages", Proceedings of the Conference on
Human Language Technology and Empirical Methods
in Natural Language Processing, pp. 867–874 (2005).
34. Zaiz, F., Babahenini, C.M., and Djeffal, A. "Puzzle
based system for improving Arabic handwriting
recognition", Engineering Applications of Artificial
Intelligence, 56, pp. 222–229 (2016).
35. Al-Yousefi, H. and Upda, S.S. "Recognition of Arabic
characters", IEEE Transactions on Pattern Analysis
& Machine Intelligence, 8, pp. 853–857 (1992).
36. Khorsheed, S.M. and Clocksin, F.C., Structural Features
of Cursive Arabic Script, BMVC (1999).
37. Mahootian, S., Persian, Routledge (2002).
38. Awde, N. and Samano, P., The Arabic Alphabet: How
to Read and Write It, Lyle Stuart (1986).
39. Parhami, B. and Taraghi, M. "Automatic recognition
of printed Farsi texts", Pattern Recognition, 14(1-6),
pp. 395–403 (1981).
40. Azmi, R. and Kabir, E. "A new segmentation technique
for omnifont Farsi text", Pattern Recognition
Letters, 22(2), pp. 97–104 (2001).
41. Ebrahimi, A. and Kabir, E. "A pictorial dictionary for
printed Farsi subwords", Pattern Recognition Letters,
29(5), pp. 656–663 (2008).
42. Azadnia, M. "Presenting an expert system for automatic
correcting Persian texts", International Journal
of Computer Science and Network Security, 8(3), pp.
27–31 (2008).
43. Tesseract Open Source OCR engine (main repository),
https://github.com/tesseract-OCR/tesseract.
44. Smith, R. "An overview of the Tesseract OCR engine",
9th IEEE Intl. Conf. on Document Analysis and Recognition
(ICDAR) (2007).
45. Persian processing, Tmu-printed-farsi-text-1-100-pp,
http://farsiocr.ir/.