Windows 10 IoT Series 10: How to use the OCR engine for text recognition

1. Introduction

    OCR (Optical Character Recognition) is the process in which an electronic device, such as a scanner or digital camera, examines characters printed on paper, determines their shapes by detecting patterns of dark and light, and then uses character recognition methods to translate those shapes into computer text. In other words, the text in a paper document is optically converted into a black-and-white dot-matrix image file, and recognition software then converts the text in that image into a text format that word processing software can edit further.

    The Windows 10 Universal Windows Platform (UWP) samples include an OCR application. For details, see https://github.com/Microsoft/Windows-universal-samples/tree/master/Samples/OCR . With this application, users can do the following:

    1. Check which OCR languages the current device supports

    2. Get the list of OCR languages available on the current device

    3. Create an OCR recognition instance for a certain language

    4. Load the picture and recognize the text in the picture

    5. Recognize text from pictures captured by the camera

    6. Overlay the recognized text on the picture

2. Problem

    The universal application runs on a PC without any problems. After it is deployed to a Windows 10 IoT Core device, however, the following errors appear: "No available OCR languages." and "English is not supported", as shown in the figure below.

3. Solution

    The problem occurs because the Windows 10 IoT Core device has no OCR-related resources, so the program cannot run normally. The solution is as follows:

    First, copy the C:\Windows\OCR directory from the Windows 10 device to the c$\Windows directory of the Windows 10 IoT Core device, as shown in the figure below.

    Next, copy the Microsoft-Windows-LanguageFeatures-OCR-en-us-Package... .cat file from C:\Windows\System32\CatRoot\{*****} to the c$\Windows\System32\CatRoot\{*****} directory of the Windows 10 IoT Core device, as shown in the figure below.
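
    The two copy steps above could also be scripted from the PC. The following is only a minimal sketch, assuming Python 3.8 or later is installed on the Windows 10 PC and the device's c$ share is reachable with administrator credentials. The function name copy_ocr_resources is hypothetical, and the wildcard pattern is an assumption that stands in for the full catalog file name, which is not given here; each matching file keeps the CatRoot subdirectory it was found in.

```python
import glob
import os
import shutil


def copy_ocr_resources(src_windows, dst_windows,
                       cat_pattern="Microsoft-Windows-LanguageFeatures-OCR-en-us-Package*.cat"):
    """Copy the OCR directory and matching catalog files from src to dst.

    Both arguments are paths to a Windows directory: the local one on the
    PC and the one exposed through the device's administrative share.
    Returns the list of catalog file paths written on the destination.
    """
    # Step 1: copy the whole OCR directory (merging if it already exists).
    shutil.copytree(os.path.join(src_windows, "OCR"),
                    os.path.join(dst_windows, "OCR"),
                    dirs_exist_ok=True)

    # Step 2: copy every catalog file matching the (assumed) wildcard,
    # preserving the CatRoot subdirectory it was found in.
    src_catroot = os.path.join(src_windows, "System32", "CatRoot")
    dst_catroot = os.path.join(dst_windows, "System32", "CatRoot")
    copied = []
    pattern = os.path.join(glob.escape(src_catroot), "*", cat_pattern)
    for cat_file in sorted(glob.glob(pattern)):
        relative = os.path.relpath(cat_file, src_catroot)
        target = os.path.join(dst_catroot, relative)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copy2(cat_file, target)
        copied.append(target)
    return copied
```

    A call might look like copy_ocr_resources(r"C:\Windows", r"\\<device-ip>\c$\Windows"), where <device-ip> is a placeholder for the address of the IoT Core device. If the returned list is empty, no catalog file matched the pattern and the second step still has to be done by hand.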

4. Debugging

    This debugging was carried out on an MBM board. The OS version of the Windows 10 IoT Core device was 10.0.16299.192, and the camera was a Microsoft LifeCam HD-3000.

    First, test Chinese OCR recognition on an image file. The results are as follows:

    The recognition accuracy for Chinese is quite high: nearly all of the text is recognized.

    Next, test Chinese and English OCR recognition on images captured by the camera. The results are shown in the following figure.

    It can be seen from the figure that the result of camera recognition depends on factors such as light and camera resolution. The better the ambient light and the higher the camera resolution, the higher the recognition accuracy.

Reference: Windows 10 IoT Series 10: How to use the OCR engine for text recognition, Tencent Cloud Community, https://cloud.tencent.com/developer/article/1075618