OCR
A filter that performs optical character recognition on video frames.
Tesseract 3.04.00 language data files are required. See the datapath parameter.
- ocr.Recognize(clip clip[, string datapath, string language="", string[] options])
This function runs Tesseract on each video frame and adds the following properties:
- OCRString
The OCR result as a UTF-8 string.
- OCRConfidence
Confidence value for each recognized word as an array of integers in the range 0-100. The number of confidence values should correspond to the number of space-delimited words in OCRString.
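The relationship between the two properties can be sketched in plain Python. This is a minimal illustration, not part of the plugin: the helper names pair_words_with_confidence and confident_words are hypothetical, and the code assumes that the property values may arrive as bytes (for the string) or as a single integer (when only one word was recognized).

```python
from typing import List, Tuple, Union


def pair_words_with_confidence(
    ocr_string: Union[str, bytes],
    confidences: Union[int, List[int]],
) -> List[Tuple[str, int]]:
    """Pair each recognized word with its confidence value (hypothetical helper)."""
    # Assumption: the string property may be handed over as UTF-8 bytes.
    if isinstance(ocr_string, bytes):
        ocr_string = ocr_string.decode("utf-8")
    # Assumption: a one-element array may be handed over as a bare integer.
    if isinstance(confidences, int):
        confidences = [confidences]
    # Split on any whitespace, treating line breaks as word separators too.
    words = ocr_string.split()
    # The counts should match; zip() stops at the shorter sequence if they do not.
    return list(zip(words, confidences))


def confident_words(ocr_string, confidences, threshold: int = 60) -> List[str]:
    """Keep only the words whose confidence is at least `threshold`."""
    pairs = pair_words_with_confidence(ocr_string, confidences)
    return [word for word, conf in pairs if conf >= threshold]
```

In a script, the inputs would typically be read from a frame of the filtered clip, for example frame.props["OCRString"] and frame.props["OCRConfidence"] after calling get_frame(n) on the result of ocr.Recognize.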
Parameters:
- clip
Clip to be processed. Must be grayscale with 8 bits per sample.
- datapath
Path to a folder containing a “tessdata” folder, in which Tesseract’s data files must be found. Must have a trailing slash.
On Windows, this parameter’s default value is the folder where the OCR plugin DLL resides. On other operating systems, the default value is empty, and Tesseract’s default data path is used.
- language
An ISO 639-3 language string. Uses Tesseract’s default language if unset (usually eng). The language may be a string of the form [~]<lang>[+[~]<lang>]*, indicating that multiple languages are to be loaded. E.g. hin+eng will load Hindi and English. Languages may specify internally that they want to be loaded with one or more other languages, so the ~ sign is available to override that. E.g. if hin were set to load eng by default, then hin+~eng would force loading only hin. The number of loaded languages is limited only by memory, with the caveat that loading additional languages will impact both speed and accuracy, as there is more work to do to decide on the applicable language, and there is more chance of hallucinating incorrect words.
- options
Options to be passed to Tesseract, as a list of (key, value) pairs. Available options are documented in tesseractclass.h of Tesseract’s source code.
Warning
Tesseract is not completely thread-safe. Changing any of the options starting with classify or textord will change them for all instances of this filter.
Example:
ret = core.ocr.Recognize(src, language="eng", options=["tessedit_char_whitelist", "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.:;,-!?\"'"])
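The language and options arguments can also be assembled programmatically. The following is a small sketch in plain Python; build_language and flatten_options are hypothetical helpers, not part of the plugin, and the flat key, value, key, value layout of the options list is an assumption based on the example above.

```python
from typing import Dict, Iterable, List


def build_language(languages: Iterable[str], suppress: Iterable[str] = ()) -> str:
    """Build a string of the form [~]<lang>[+[~]<lang>]* (hypothetical helper).

    `languages` are loaded normally; each entry in `suppress` is prefixed
    with "~" so that it is not pulled in by another language.
    """
    parts = list(languages) + ["~" + lang for lang in suppress]
    return "+".join(parts)


def flatten_options(options: Dict[str, str]) -> List[str]:
    """Turn {key: value} into a flat [key, value, key, value, ...] list.

    Assumption: the filter expects the pairs flattened, as in the example above.
    """
    flat: List[str] = []
    for key, value in options.items():
        flat.extend([key, str(value)])
    return flat
```

With these helpers, a call might look like core.ocr.Recognize(src, language=build_language(["hin"], suppress=["eng"]), options=flatten_options({"tessedit_char_whitelist": "0123456789"})).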
Note
This only really works on frames that contain nothing but text, so make sure to filter the input appropriately if this is not the case.
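One possible way to prepare a clip is sketched below. It is only an illustration under stated assumptions: prepare_for_ocr is a hypothetical helper, the input is assumed to be a YUV or grayscale clip whose text sits inside a known rectangle, and the crop and threshold values have to be chosen per source. It uses the standard ShufflePlanes, Crop and Binarize filters and the built-in resizer.

```python
def prepare_for_ocr(clip, left=0, right=0, top=0, bottom=0, threshold=128):
    """Reduce a clip to a cropped, binarized 8-bit grayscale text region.

    Hypothetical helper; the crop values and threshold are assumptions
    that must be tuned for the actual source.
    """
    # Imported lazily so the helper can be defined without VapourSynth loaded.
    import vapoursynth as vs

    # Keep only the first plane (luma for YUV input) as a grayscale clip.
    gray = clip.std.ShufflePlanes(planes=0, colorfamily=vs.GRAY)
    # Recognize requires 8 bits per sample, so convert other bit depths.
    gray = gray.resize.Point(format=vs.GRAY8)
    # Crop away everything around the text region, if any cropping was requested.
    if left or right or top or bottom:
        gray = gray.std.Crop(left=left, right=right, top=top, bottom=bottom)
    # Map every sample to pure black or pure white around the threshold.
    return gray.std.Binarize(threshold=threshold)
```

The result could then be passed on, e.g. core.ocr.Recognize(prepare_for_ocr(src, top=400, bottom=20)), where the crop values are placeholders.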