malexandersalazar / tools-python-image-to-text Public

Image to text tool

A Python tool based on OpenCV, Tesseract OCR, and spaCy for reading and recognizing the text in an image on Windows.

The script generates 30 variants of the image with OpenCV's adaptiveThreshold and runs Tesseract OCR on each variant. It then uses spaCy to measure the relevance and number of the words obtained in each reading and chooses the best one.
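The idea can be sketched roughly as follows. This is an illustration, not the code in itt.py: the parameter grid (6 block sizes by 5 constants), the function names, and the scoring rule (counting accepted tokens) are all assumptions made for the example.

```python
from itertools import product

# Hypothetical parameter grid: 6 block sizes x 5 constants = 30 variants.
# The values actually used by itt.py are not given in this README.
BLOCK_SIZES = (11, 15, 21, 31, 41, 51)  # adaptiveThreshold needs odd sizes > 1
CONSTANTS = (2, 5, 8, 11, 14)


def threshold_variants(gray):
    """Yield one binarized variant of an 8-bit grayscale image per parameter pair."""
    import cv2  # imported lazily so the scoring helpers work without OpenCV

    for block_size, c in product(BLOCK_SIZES, CONSTANTS):
        yield cv2.adaptiveThreshold(
            gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
            cv2.THRESH_BINARY, block_size, c,
        )


def score_reading(text, is_known_word):
    """Count the whitespace-separated tokens that the vocabulary check accepts."""
    return sum(1 for token in text.split() if is_known_word(token))


def best_reading(readings, is_known_word):
    """Return the candidate reading with the highest score."""
    return max(readings, key=lambda text: score_reading(text, is_known_word))
```

In a real run, each variant could be passed to pytesseract.image_to_string to obtain a reading, and is_known_word could be backed by a loaded spaCy model, for example by checking nlp.vocab.has_vector(token). Both choices are assumptions about how the pieces might fit together.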

Installation

Tesseract OCR

The latest installers for Windows can be downloaded here.

For more information about the languages supported by different versions of Tesseract, visit the following link.

spaCy

To use spaCy, we must install the package and then download the pre-trained models, as indicated on its official site.

pip install -U spacy

Installing English:

python -m spacy download en_core_web_md

Installing Spanish:

python -m spacy download es_core_news_md
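Because spaCy models are installed as regular Python packages, a quick way to confirm that both downloads succeeded is to check whether the packages can be found. The helper below is a hypothetical sketch that uses only the standard library; the function name is not part of the tool.

```python
from importlib.util import find_spec

# Model names taken from the install commands above.
MODELS = {"en": "en_core_web_md", "es": "es_core_news_md"}


def missing_models(models=MODELS):
    """Return the language codes whose model package cannot be found."""
    return [lang for lang, package in models.items() if find_spec(package) is None]
```

Calling missing_models() returns an empty list when both models are installed; any language code it returns still needs its `python -m spacy download` command.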

Image to text tool

Copy the itt.py script from the dist folder and, if necessary, update the Tesseract path in it.

import pytesseract as pyt

pyt.pytesseract.tesseract_cmd = "C:/Program Files/Tesseract-OCR/tesseract.exe"
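If Tesseract may be installed in a different location, the path could be resolved at runtime instead of being hard-coded. The sketch below is an optional variation, not what itt.py does: find_tesseract is a hypothetical helper that first looks on the PATH and then falls back to the default install location shown above.

```python
import shutil
from pathlib import Path

# Default install location used in the snippet above.
DEFAULT_WINDOWS_PATH = "C:/Program Files/Tesseract-OCR/tesseract.exe"


def find_tesseract(default=DEFAULT_WINDOWS_PATH):
    """Return the path of a tesseract executable, or None if none is found."""
    on_path = shutil.which("tesseract")  # honors the PATH environment variable
    if on_path:
        return on_path
    return default if Path(default).is_file() else None
```

The result, when it is not None, can be assigned to pyt.pytesseract.tesseract_cmd in place of the literal string.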

Getting Started

To use the script, we only have to pass the path of the image that we want to read.

python itt.py W:\malexandersalazar\tools-python-image-to-text\raw

You can also set the language as a parameter. For now, only English ("en") and Spanish ("es") are supported.

python itt.py W:\malexandersalazar\tools-python-image-to-text\raw -l=en
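A command line like the ones above could be parsed with argparse, roughly as sketched below. The README does not show how itt.py reads its arguments, so the long option name --language and the default of "en" are assumptions for illustration.

```python
import argparse

SUPPORTED_LANGUAGES = ("en", "es")  # the two languages this README lists


def build_parser():
    """Build a parser for: itt.py <path> [-l=<language>]."""
    parser = argparse.ArgumentParser(
        prog="itt.py", description="Read the text in an image."
    )
    parser.add_argument("path", help="path of the image to read")
    parser.add_argument(
        "-l", "--language",
        choices=SUPPORTED_LANGUAGES,
        default="en",  # assumption: the README does not state the default
        help="language of the text in the image",
    )
    return parser
```

With this parser, an unsupported value such as -l=fr is rejected with a usage message before any image processing starts.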

If we want to support more languages we must install the necessary spaCy models and make sure that Tesseract OCR can support them as well.
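One way to keep the two requirements together is a small table that maps each language code to its Tesseract language code and its spaCy model. This is a hypothetical structure, not taken from itt.py; "eng" and "spa" are Tesseract's codes for English and Spanish.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Language:
    tesseract_code: str  # value passed to Tesseract as the language
    spacy_model: str     # name of the spaCy model package to load


LANGUAGES = {
    "en": Language("eng", "en_core_web_md"),
    "es": Language("spa", "es_core_news_md"),
}


def register_language(code, tesseract_code, spacy_model, registry=LANGUAGES):
    """Add a language to the registry, refusing to overwrite an existing code."""
    if code in registry:
        raise ValueError(f"language {code!r} is already registered")
    registry[code] = Language(tesseract_code, spacy_model)
    return registry[code]
```

For example, French could be added with register_language("fr", "fra", "fr_core_news_md"), provided that the spaCy model and the Tesseract language data for French are both installed.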

Dependencies

python (== 3.11.3)
pytesseract (== 0.3.10)
cv2 (== 4.7.0)
spacy (== 3.6.0)
pandas (== 2.0.2)

License

This project is licensed under the MIT License.
