Introduction – What is OCR and how does it work
Optical character recognition (OCR) is the electronic or mechanical conversion of images of typed, handwritten, or printed text into machine-encoded text. OCR is a subfield of pattern recognition, artificial intelligence, and computer vision research.
Some systems can generate formatted output that closely resembles the original page, including images, columns, and other non-textual components.
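To make the pattern recognition idea above concrete, here is a toy sketch of template matching, where each character image is compared pixel by pixel against known shapes. The 3x3 templates, the function names, and the sample input are all made up for illustration; real OCR engines are far more sophisticated than this.

```python
# Toy template matching: each glyph is a 3x3 grid of "0"/"1" pixels.
# The templates below are hypothetical and only cover three letters.
TEMPLATES = {
    "I": ("010", "010", "010"),
    "L": ("100", "100", "111"),
    "T": ("111", "010", "010"),
}


def pixel_distance(a, b):
    """Count the pixels that differ between two equally sized glyphs."""
    return sum(
        pa != pb
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def recognize_glyph(glyph):
    """Return the template character whose pixels differ least from the glyph."""
    return min(TEMPLATES, key=lambda ch: pixel_distance(glyph, TEMPLATES[ch]))


def recognize_line(glyphs):
    """Turn a sequence of already segmented glyph images into a string."""
    return "".join(recognize_glyph(g) for g in glyphs)


# A noisy "T" (one pixel flipped) followed by a clean "I" and "L".
noisy_t = ("111", "010", "011")
print(recognize_line([noisy_t, TEMPLATES["I"], TEMPLATES["L"]]))  # prints "TIL"
```

The noisy first glyph is still read as "T" because it differs from the "T" template by only one pixel, which is closer than any other template.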
Video demonstration of an OCR system in action:
Video credit: Vassia Atanassova – Spiritia, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons
Some background history of the technology
The history of optical character recognition can be traced back to telegraphy and the development of reading aids for the blind. Emanuel Goldberg created a device in 1914 that could read characters and translate them into telegraph code.
Around the same time, Edmund Fournier d’Albe created the Optophone, a handheld scanner that produced tones representing letters or characters as it was moved across a printed page. Ray Kurzweil is often credited with omni-font OCR, although companies such as CompuScan were already using it in the late 1960s and early 1970s. Kurzweil unveiled his optical character recognition system in 1976, and a commercial version went on sale in 1978.
In the 2000s, OCR was made available online as a service (WebOCR), and it is currently employed in mobile devices for augmented reality and text-to-speech applications. There are numerous commercial and open source OCR systems.
These systems are capable of interpreting a number of common languages including but not limited to:
- English
- Spanish
- Arabic
- Russian
- German
- Chinese
- Japanese and others
Use cases for the technology
Making scanned documents searchable by converting them to searchable PDFs
The most common use case for OCR is converting scanned paper documents into electronic PDFs so that the text they contain becomes searchable.
This is especially useful for very large documents, such as the contracts and briefs common in the legal industry, where manually sifting through pages to find specific text is not feasible.
Once a document has been through OCR, the “find” feature works on it, which saves a lot of time.
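As a rough sketch of what the “find” feature does once OCR has run, the snippet below searches the recognized text of each page for a phrase and reports the matching page numbers. It assumes an OCR engine has already produced one string per page; the function name and sample pages are hypothetical.

```python
def pages_containing(pages, query):
    """Return the 1-based numbers of the pages whose text contains the query.

    `pages` is assumed to be a list of strings, one per page, as produced by
    an OCR engine. Matching ignores case and collapses whitespace, because
    OCR output often breaks a phrase across lines.
    """
    needle = " ".join(query.split()).casefold()
    matches = []
    for number, text in enumerate(pages, start=1):
        normalized = " ".join(text.split()).casefold()
        if needle in normalized:
            matches.append(number)
    return matches


# Hypothetical OCR output for a three-page contract.
sample_pages = [
    "This Agreement is made between the parties.",
    "The indemnification\nclause applies to both parties.",
    "Signatures follow.",
]
print(pages_containing(sample_pages, "Indemnification clause"))  # prints [2]
```

A searchable PDF stores the recognized text alongside the page image, so a PDF viewer can run this kind of search directly.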
Scanning books electronically (e.g. Google Books)
Another common use case is scanning books and other large documents into electronic format. A well-known example at scale is Google Books, for which Google had many employees scanning very large numbers of books, a project that caused some controversy over copyright at the time.
OCR was critical to this process because it converts large volumes of scanned text into an electronic format, and it is still included in the software of many scanners today.
Data entry processes
Image credit: placementindia.com
General data entry tasks may use OCR to scan and extract information from documents such as:
- Passports
- Driver’s licenses
- Invoices
- Bank statements and receipts
- Other physical records or IDs used for record keeping in government departments
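For data entry, the OCR output usually still has to be turned into structured fields. The sketch below pulls a few values out of recognized invoice text with regular expressions. The sample text, field names, and patterns are assumptions for illustration; real documents vary widely and OCR output often contains recognition errors that need extra handling.

```python
import re

# Hypothetical text as an OCR engine might return it for a scanned invoice.
SAMPLE_OCR_TEXT = """ACME Supplies
Invoice No: INV-2041
Date: 2023-01-07
Total: $1,249.50
"""

# One pattern per field; each captures the value in group 1.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"Invoice\s+No[.:]?\s*([A-Z0-9-]+)", re.IGNORECASE),
    "date": re.compile(r"Date[.:]?\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE),
    "total": re.compile(r"Total[.:]?\s*\$?([\d,]+\.\d{2})", re.IGNORECASE),
}


def extract_fields(ocr_text):
    """Return a dict of field name to extracted value, or None if not found."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[name] = match.group(1) if match else None
    return fields


print(extract_fields(SAMPLE_OCR_TEXT))
```

Returning None for a missing field, rather than raising an error, lets a person review and complete only the entries the patterns could not fill.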
Traffic sign recognition
Image recognition systems that read traffic signs and vehicle number plates make use of OCR. The technology appears in a range of software, from consumer apps that read text to tools that police and other law enforcement agencies may use to identify vehicles by their registration plates.
Assistive technology for blind and visually impaired users
With the use of assistive technology, you can perform tasks that your impairment prevents you from performing on your own. People can use assistive technology to do a task more quickly or safely.
Examples of assistive technology that employs OCR include:
- Software that formats documents or web pages so they can be read by screen readers
- Software that extracts text from documents that would otherwise be only images, then uses text-to-speech to read the words aloud for people who are blind or have low vision
Translating languages
Language translation services such as Google Translate, Bing Translator, Yandex Translate, and similar applications let users take a picture of an object containing text in another language, such as a sign, and have that text translated into their preferred language in real time.
Frequently Asked Questions (FAQ) about the subject
Can Google Translate make use of OCR?
Yes. The image scanning feature in the Google Translate mobile app uses OCR to extract text from whatever your phone camera is pointed at.
Do all printers have OCR?
No. OCR works on scanned images, so a printer needs scanning functionality to offer it. Many modern multifunction printers support OCR in some form, often through their companion software, but older printers and models without a scanner do not.