thaiscaldeira/steel-coil-ocr

Industrial OCR System for Hot Laminated Steel Coils

This repository contains the code for the article "Industrial OCR System for Hot Laminated Steel Coils", submitted to the Journal of Control, Automation and Electrical Systems.

The files whose names start with ocrMultiFiles contain the main processing algorithms. To use them, indicate the path to a folder containing all the image files to be processed; each file name must contain the identification expected to be extracted from that image. The program also receives the path to the trained CNN, which is responsible for the classification tasks. After processing, two folders, "Success Files" and "Fail Files", are created inside the indicated folder, and the final processed images are saved in them. The Fail Files folder also receives the original images, for further analysis. Three ocrMultiFiles variants are available, each implementing a different processing alternative indicated in the file name: Filtering, NoFiltering, and Pytesseract. The first two use the CNN as the classifier, while the third uses the Pytesseract engine.
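
The success/fail sorting described above can be sketched as follows. This is a minimal illustration, not the repository's code: the function names `expected_id_matches` and `file_result` are hypothetical, and the rule that a result counts as a success when the recognized text appears in the file name is an assumption based on the naming requirement.

```python
import shutil
from pathlib import Path


def expected_id_matches(image_path, recognized_text):
    """Return True if the recognized text appears in the file name (extension excluded).

    Assumption: the repository's actual comparison rule may differ.
    """
    return bool(recognized_text) and recognized_text in Path(image_path).stem


def file_result(image_path, recognized_text):
    """Copy the image into 'Success Files' or 'Fail Files' inside its own folder.

    Returns the path of the copied file.
    """
    image_path = Path(image_path)
    ok = expected_id_matches(image_path, recognized_text)
    target_dir = image_path.parent / ("Success Files" if ok else "Fail Files")
    target_dir.mkdir(exist_ok=True)  # create the output folder on first use
    return Path(shutil.copy2(image_path, target_dir / image_path.name))
```

For example, calling `file_result("coils/coil_12345.png", "12345")` would copy the image into `coils/Success Files`, while any other recognized text would send it to `coils/Fail Files`.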

The convCompareNets notebook is used to train the CNNs that perform the classification task. To use it, indicate the path to the folder containing the training images, organized in subfolders whose names must be numbers indicating the different classes. It is also necessary to indicate whether the training images are binarized or grayscale, and which architecture will be used. In the last step, indicate where the trained model should be saved.
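
The expected training folder layout can be illustrated with a short sketch that collects image paths and their class labels from numerically named subfolders. The helper name `list_training_samples` and the set of accepted image extensions are assumptions made for this example; the notebook may load the data differently.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the actual training data.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def list_training_samples(root):
    """Return (image_path, class_label) pairs from numerically named subfolders."""
    samples = []
    for class_dir in sorted(Path(root).iterdir(), key=lambda p: p.name):
        # Skip anything that is not a folder named with a number.
        if not class_dir.is_dir() or not class_dir.name.isdecimal():
            continue
        label = int(class_dir.name)
        for image_path in sorted(class_dir.iterdir()):
            if image_path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((image_path, label))
    return samples
```

With a layout such as `train/0/a.png` and `train/1/b.png`, the function returns one pair per image, labelled 0 and 1 respectively; folders with non-numeric names are ignored.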
