Optical Character Recognition (OCR) is the technology used to convert handwritten or printed text, found on paper, in scanned documents, advertisements, photos and similar sources, into machine-encoded (digital) text. OCR involves three basic steps: preprocessing the input, recognizing the text, and post-processing the result for the required purpose.
Strictly speaking, OCR recognizes a single character at a time. When words, lines or larger blocks of text are recognized, the task is called optical text recognition, or simply text recognition. When handwritten or cursive characters are recognized using machine learning, it is termed Intelligent Character Recognition (ICR). When the same approach is applied to whole words, it is called Intelligent Word Recognition (IWR).
How does OCR work?
When working with images for character recognition, the first step is image preprocessing. Preprocessing converts the input image into a suitable format and cleans it up, removing defects or unnecessary elements that would hinder recognition.
The image is also resized to the dimensions that the recognition model expects as input.
The alignment of the image or document is corrected, i.e. if the image is tilted or reversed, it is fixed. The color representation is changed, for example by converting to grayscale, which helps with feature extraction and therefore with recognition. Layout analysis is performed to find paragraphs, columns and similar structures. Several other operations may also be involved, and after all of this preprocessing the image is fit for recognition.
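As a rough illustration of two of these preprocessing operations, the sketch below converts a tiny RGB image to grayscale and then binarizes it. It is a minimal example under stated assumptions, not how any particular OCR engine does it: the image is a nested list of (R, G, B) tuples, the function names `to_grayscale` and `binarize` are hypothetical, the grayscale weights are the common ITU-R BT.601 luma coefficients, and the fixed threshold of 128 is an arbitrary choice.

```python
# Illustrative preprocessing sketch: grayscale conversion followed by binarization.
# The image format (nested lists of RGB tuples) and function names are assumptions.

def to_grayscale(rgb_image):
    """Convert an RGB image to grayscale using the ITU-R BT.601 luma weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def binarize(gray_image, threshold=128):
    """Mark each pixel as 1 (ink) if it is darker than the threshold, else 0."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray_image]


# A 2x3 image with a dark vertical stroke in the middle column.
rgb = [
    [(255, 255, 255), (0, 0, 0), (255, 255, 255)],
    [(255, 255, 255), (10, 10, 10), (250, 250, 250)],
]
gray = to_grayscale(rgb)
binary = binarize(gray)
print(binary)  # [[0, 1, 0], [0, 1, 0]]
```

In practice an imaging library would handle these conversions, and the threshold would usually be chosen adaptively rather than fixed.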
The second step is recognition. Its most important part is extracting features efficiently and reliably, so that the required results can be achieved.
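To make the idea of feature extraction more concrete, here is a small sketch of one classic hand-crafted feature, often called zoning: the glyph is divided into a grid of zones, and the fraction of ink pixels in each zone becomes one feature. This is only one possible technique and is not necessarily what a given OCR system uses. The function name `zoning_features` is hypothetical, and the sketch assumes a square binary glyph whose side is divisible by `zones_per_side`.

```python
# Illustrative zoning feature extractor for a binary glyph (1 = ink, 0 = background).
# Assumes a square image whose side length is divisible by zones_per_side.

def zoning_features(binary_image, zones_per_side=2):
    """Return the ink density of each zone, scanning zones row by row."""
    size = len(binary_image)
    zone = size // zones_per_side
    features = []
    for zone_row in range(zones_per_side):
        for zone_col in range(zones_per_side):
            ink = sum(
                binary_image[r][c]
                for r in range(zone_row * zone, (zone_row + 1) * zone)
                for c in range(zone_col * zone, (zone_col + 1) * zone)
            )
            features.append(ink / (zone * zone))
    return features


# A 4x4 glyph resembling the letter "L".
glyph_l = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
features = zoning_features(glyph_l)
print(features)  # [0.5, 0.0, 0.75, 0.5]
```

The resulting vector is much smaller than the raw pixel grid while still capturing where the ink is concentrated, which is the kind of compact description a classifier can work with.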
The final step is post-processing. After a character is identified, it can be converted into the required encoding, such as ASCII, which computers can use for further purposes such as storing data, verifying information or decoding messages.
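The conversion to ASCII mentioned above can be sketched in a few lines. The example assumes the recognizer has already produced a list of single-character strings. The helper name `to_ascii_bytes` is hypothetical, and replacing non-ASCII characters with "?" is just one possible policy.

```python
# Illustrative post-processing sketch: turn recognized characters into ASCII bytes.

def to_ascii_bytes(recognized_chars):
    """Join recognized characters and encode them as ASCII.

    Characters outside the ASCII range are replaced with "?" (an assumed policy).
    """
    text = "".join(recognized_chars)
    return text.encode("ascii", errors="replace")


encoded = to_ascii_bytes(["O", "C", "R"])
print(encoded)        # b'OCR'
print(list(encoded))  # [79, 67, 82]
```

Once the text is in a standard encoding like this, it can be stored, searched or compared like any other digital text.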
Architecture of OCR
The following diagram represents the architecture of OCR:
From the above diagram, the steps can be summarized as follows:
- First, the user provides an image or a document in another format as the input.
- Next, the image is preprocessed and converted into a format that the model can use for recognition.
- Next, the model performs feature extraction. This step is essential, because recognition depends on how well the features are extracted and preserved.
- Next, the extracted features are stored in a suitable form.
- After this, the classifier/model performs the recognition. The mapping function is applied after the classifier.
- Finally, the recognized output is displayed or processed for further needs.
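The steps listed above can be sketched end to end as a toy pipeline. Everything in it is an assumption made for illustration: the 3x3 glyph templates, the three-letter alphabet, the function names, and the use of Hamming distance to the nearest stored template as the classifier. A real OCR system would use far richer features and a trained model.

```python
# Toy OCR pipeline: preprocess -> extract features -> classify -> map to a character.
# Templates, labels and function names are hypothetical and for illustration only.

# Stored features: one flattened 3x3 binary template per class label.
TEMPLATES = {
    0: [1, 1, 1, 1, 0, 1, 1, 1, 1],  # resembles "O"
    1: [1, 0, 0, 1, 0, 0, 1, 1, 1],  # resembles "L"
    2: [1, 1, 1, 0, 1, 0, 0, 1, 0],  # resembles "T"
}

# Mapping function applied after the classifier: class label -> character.
LABEL_TO_CHAR = {0: "O", 1: "L", 2: "T"}


def preprocess(gray_image, threshold=128):
    """Binarize a grayscale image: dark pixels become 1 (ink), light pixels 0."""
    return [[1 if p < threshold else 0 for p in row] for row in gray_image]


def extract_features(binary_image):
    """Flatten the binary image into a feature vector."""
    return [p for row in binary_image for p in row]


def classify(features):
    """Return the label of the template with the smallest Hamming distance."""
    def distance(label):
        return sum(a != b for a, b in zip(features, TEMPLATES[label]))
    return min(TEMPLATES, key=distance)


def recognize(gray_image):
    """Run the whole pipeline on one grayscale glyph."""
    label = classify(extract_features(preprocess(gray_image)))
    return LABEL_TO_CHAR[label]


# A noisy "L": the bottom-right pixel is too light to count as ink.
noisy_l = [
    [20, 240, 250],
    [30, 235, 245],
    [25, 40, 200],
]
result = recognize(noisy_l)
print(result)  # L
```

Even with one pixel missing, the noisy glyph is still closest to the "L" template, which shows why the classifier compares against stored features rather than requiring an exact match.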
Applications of OCR
- OCR can be used for data entry from documents such as business documents, bank documents and journals.
- It can be used to scan and extract text from a source that needs to be edited, so that the edited text can be stored.
- OCR can convert documents into text that can be played or read aloud for visually impaired users.
- One of the most popular uses of OCR is translation: the words in an image are recognized and then translated into another language for the sake of understanding.
- OCR can be used in security systems that read objects such as number plates and ID cards.
- Historic papers and documents can be identified, stored and even summarized using OCR.
- OCR can be employed to verify checks and process them electronically.
- Sorting letters for mail delivery is another good application.
- In the health care field, OCR can be used to record patient data electronically, which is helpful given the large number of people visiting hospitals.
- In airports, OCR is used for passport analysis.
- Details can be extracted from business cards into a contact database.
Advantages of OCR
The most important advantage of OCR is that it saves time, since the conversion to digital form is done directly. Because human effort is not needed for the conversion, people can complete other activities in the meantime.
OCR can help in summarizing large chunks of text, because important words and characters can be highlighted once the text is digital. Since the data is stored digitally, it can also be edited at any time.
OCR helps reduce the cost of data storage. It also protects against data being destroyed, since digital data can be easily backed up, which makes recovery much easier.