DocumentRecognitionSettings

DocumentRecognitionSettings class

Settings for PDF recognition.
Contains elements that allow customizing the recognition process.

The DocumentRecognitionSettings type exposes the following members:

Constructors

| Name | Description |
| :- | :- |
| DocumentRecognitionSettings(start_page, pages_number) | Initializes a new instance of the DocumentRecognitionSettings class |
| DocumentRecognitionSettings(start_page, pages_number, language, detect_areas, auto_skew, threshold) | Initializes a new instance of the DocumentRecognitionSettings class |
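
The two constructor forms can be sketched as follows. This is a minimal, hypothetical stand-in written in plain Python so that it runs without the library installed. The class name `SettingsSketch`, the `None` defaults, the placeholder argument values, and the `page_range` helper are all assumptions made for illustration, not the library's implementation. The sketch also assumes that recognition covers `pages_number` consecutive pages beginning at `start_page`; whether page numbering starts at 0 or 1 is not stated on this page.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SettingsSketch:
    """Hypothetical stand-in mirroring the documented constructor parameters."""

    start_page: int
    pages_number: int
    # These four parameters appear only in the longer constructor.
    # Their types and defaults are not documented here, so None is a placeholder.
    language: Optional[Any] = None
    detect_areas: Optional[Any] = None
    auto_skew: Optional[Any] = None
    threshold: Optional[Any] = None

    def page_range(self) -> range:
        """Pages covered, assuming a consecutive run starting at start_page."""
        return range(self.start_page, self.start_page + self.pages_number)


# Short form: only the page window is given.
short_form = SettingsSketch(0, 3)

# Long form: all six documented parameters, using placeholder values.
long_form = SettingsSketch(0, 3, "eng", True, True, 128)

print(list(short_form.page_range()))  # [0, 1, 2]
```

With the real class, the same two call shapes would apply, with argument values taken from the library's own enumerations and types.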

Properties

| Name | Description |
| :- | :- |
| ignored_symbols | Sets the blacklist of symbols excluded from recognition. |
| ignored_characters | Sets the blacklist of characters excluded from recognition. |
| allowed_symbols | Sets the allowed characters using the alphabet property. |
| lines_filtration | Enables recognition of text in tables (regions surrounded by lines). |
| preprocessing_filters | Prepares the image for OCR by adjusting the pre-processing methods. |
| auto_contrast | Enables an additional contrast correction algorithm that is applied to the image before recognition. |
| allowed_characters | Allowed character set. Determines the type of characters allowed in the recognition result. |
| detect_areas_mode | Selects the optimal area detection mode for the document type: document, photo, plain text, column, image. |
| auto_denoising | Enables an additional neural network that improves the image by reducing noise. Useful for images with scan artifacts, distortion, spots, flares, gradients, or foreign elements. |
| upscale_small_font | Enables additional algorithms designed for small font recognition. Useful for images with small characters. |
| start_page | Sets the first page for recognition. |
| pages_number | Sets the number of pages to recognize in a multipage PDF file. |
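
The difference between a blacklist (`ignored_symbols`, `ignored_characters`) and an allowed set (`allowed_symbols`, `allowed_characters`) can be illustrated with a small plain-Python sketch. The `filter_text` function below is hypothetical and is not part of the library. It only models the expected effect on the output text, on the assumption that blacklisted characters never appear in the result and that a non-empty allowed set restricts the result to those characters. The library applies these settings during recognition and may behave differently.

```python
def filter_text(text: str, ignored: str = "", allowed: str = "") -> str:
    """Drop characters in `ignored`; if `allowed` is non-empty, keep only those."""
    kept = []
    for ch in text:
        if ch in ignored:
            continue  # blacklisted character is always removed
        if allowed and ch not in allowed:
            continue  # outside the allowed set, when one is given
        kept.append(ch)
    return "".join(kept)


print(filter_text("A-1 B-2", ignored="-"))            # A1 B2
print(filter_text("A-1 B-2", allowed="0123456789"))   # 12
```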

See Also