For structured (bounding box based) text extraction, it becomes imperative that the received image and target image are aligned properly and to scale. OpenCV is a great image processing library that has a ton of features.
To align source and template images, following steps are required.
I have been working on ML projects that require image preprocessing and text extraction. To improve the quality of text extraction, there are many preprocessing steps that we need to do, they are elicited below. We use OpenCV for doing the preprocessing and tesseract-ocr for text extraction.