Practical segmentation methods for logical and geometric layout
analysis to Improve scanned PDF accessibility to vision impaired
Nazemi, Azadeh (ORCID: 0000-0002-1138-309X), Murray, Iain and McMeekin, David A. (ORCID: 0000-0001-6445-1183)
(2014)
Practical segmentation methods for logical and geometric layout
analysis to Improve scanned PDF accessibility to vision impaired.
International Journal of Signal Processing, Image Processing and Pattern Recognition, 7 (4). pp. 23-36.
ISSN 2005-4254
The use of electronic documents has rapidly increased in recent decades, and the PDF is one of the
most commonly used electronic document formats. A scanned PDF is an image and does not actually
contain any text. For the vision-impaired user who is dependent upon a screen reader to access this
information, this format is not useful. Thus, addressing PDF accessibility through assistive technology
has now become an important concern. PDF layout analysis provides valuable formatting information
that supports PDF component classification. This classification facilitates tag generation. Accurate
tagging produces a searchable and navigable scanned PDF document. This paper describes several
practical segmentation methods which are easy to implement and efficient for PDF layout analysis so
that the scanned PDF document can be navigated or searched using assistive technologies.
Metadata
Item Type: Article (Published)
Refereed: Yes
Uncontrolled Keywords: PDF layout analysis; Optical character recognition (OCR); Vision-impaired